Hello, guys.

Currently, I’m working on the integration between Spark and Ignite [1].

So far, I have implemented the following:
    * An Ignite DataSource implementation (IgniteRelationProvider)
    * DataFrame support for Ignite SQL tables
    * An IgniteCatalog implementation for transparent resolution of Ignite
SQL tables

The implementation can be found in PR [2].
It would be great if someone could provide feedback on the prototype.

I made some examples in the PR so you can see how the API is supposed to be
used [3], [4].
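To give a rough idea of the intended usage, here is a minimal sketch of how an
Ignite SQL table might be read as a DataFrame through such a DataSource. This
is illustrative only: the format name "ignite", the option keys "config" and
"table", and the table/column names are assumptions, not the actual API from
the PR — see [3] and [4] for the real examples.

```scala
import org.apache.spark.sql.SparkSession

object IgniteDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ignite-dataframe-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical: load an Ignite SQL table as a DataFrame via the
    // IgniteRelationProvider. Format name and option keys are assumed.
    val persons = spark.read
      .format("ignite")                       // assumed short name of the data source
      .option("config", "ignite-config.xml")  // assumed: path to Ignite configuration
      .option("table", "person")              // assumed: Ignite SQL table name
      .load()

    // Regular DataFrame operations should then work as usual.
    persons.filter(persons("age") > 21).show()

    spark.stop()
  }
}
```

With a Catalog implementation in place, the extra read options would ideally
become unnecessary, since table names could be resolved transparently.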

I also need some advice on two questions:

1. How should this PR be tested?

Of course, I need to provide some unit tests. But what about scalability
tests, etc.?
Maybe we need a Yardstick benchmark or something similar?
What are your thoughts?
Which scenarios should I consider first?

2. Should we provide a Spark Catalog implementation inside the Ignite codebase?

The current Catalog implementation is based on the *internal Spark API*.
The Spark community does not seem interested in making the Catalog API public
or in including an Ignite Catalog in the Spark code base [5], [6].

*Should we include an implementation based on Spark's internal API in the
Ignite code base?*

Or should we consider keeping the Catalog implementation in an external
module that is created and released outside Ignite? (We could still support
and develop it within the Ignite community.)

[1] https://issues.apache.org/jira/browse/IGNITE-3084
[2] https://github.com/apache/ignite/pull/2742
[3] https://github.com/apache/ignite/pull/2742/files#diff-f4ff509cef3018e221394474775e0905
[4] https://github.com/apache/ignite/pull/2742/files#diff-f2b670497d81e780dfd5098c5dd8a89c
[5] http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-Core-Custom-Catalog-Integration-between-Apache-Ignite-and-Apache-Spark-td22452.html
[6] https://issues.apache.org/jira/browse/SPARK-17767

--
Nikolay Izhikov
nizhikov....@gmail.com
