[
https://issues.apache.org/jira/browse/IGNITE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304288#comment-16304288
]
Nikolay Izhikov commented on IGNITE-3084:
-----------------------------------------
{quote}onApplicationEnd method - makes sense. But it sounds like it should be
on IgniteContext level, what do you think?{quote}
OK. Listener moved to IgniteContext.
I think of IgniteContext as part of the public API, so I don't want to change its
behavior. That's the reason I didn't implement the listener inside IgniteContext in
previous revisions.
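For clarity, here is a rough sketch of the wiring I mean (the method name and the {{Ignition.stop}} call here are illustrative, not the exact IgniteContext code):
{code:scala}
import org.apache.ignite.Ignition
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

// Register a listener so the local Ignite node is stopped when the Spark
// application ends (simplified; the actual IgniteContext logic may differ).
def registerCloseListener(sc: SparkContext): Unit =
  sc.addSparkListener(new SparkListener {
    override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit =
      Ignition.stop(false)
  })
{code}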
{quote}IgniteCacheRelation - let's remove it for now and discuss on dev@ as a
separate task.{quote}
Done.
{quote}Also let's rename IgniteDataFrameOptions to IgniteDataFrameSettings, and
inside it:{quote}
Done.
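For reference, a rough usage sketch with the renamed {{IgniteDataFrameSettings}}; {{FORMAT_IGNITE}} and {{OPTION_TABLE}} are the constants used below in the catalog code, while {{OPTION_CONFIG_FILE}} is only an assumed name for pointing the data source at an Ignite configuration:
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.ignite.spark.IgniteDataFrameSettings._

val spark = SparkSession.builder()
  .appName("ignite-data-frame-example")
  .master("local")
  .getOrCreate()

// Read an Ignite SQL table as a Data Frame. OPTION_CONFIG_FILE is assumed here,
// not taken from the patch; the exact option names may still change in review.
val persons = spark.read
  .format(FORMAT_IGNITE)
  .option(OPTION_TABLE, "person")
  .option(OPTION_CONFIG_FILE, "examples/config/example-ignite.xml")
  .load()

persons.printSchema()
{code}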
{quote}Remove GRID option for now. It's a bit confusing in the current
implementation and I'm not sure how to make it more usable. We can always come
back to this in future if needed.{quote}
Can't do it, because {{OPTION_GRID}} is used internally by the catalog for now.
When Spark resolves existing tables for Ignite we need to specify how the table is
stored and the properties needed to access it. The properties are stored in a
{{Map\[String, String\]}}.
Does it make sense to you? Can we replace {{gridName}} with something more
appropriate?
Please, look at the code for more details: {{IgniteExternalCatalog#getTableOption}}, line 111.
{code:scala}
storage = CatalogStorageFormat(
  locationUri = None,
  inputFormat = Some(FORMAT_IGNITE),
  outputFormat = Some(FORMAT_IGNITE),
  serde = None,
  compressed = false,
  properties = Map(
    OPTION_GRID → gridName,
    OPTION_TABLE → tableName)
),
{code}
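And the reverse direction, roughly, when Spark resolves the table through the catalog (a simplified sketch of the lookup, not the exact {{IgniteExternalCatalog}} code):
{code:scala}
import org.apache.spark.sql.catalyst.catalog.CatalogTable

// Pull the grid and table names back out of the storage properties
// that were written by the snippet above.
def gridAndTable(table: CatalogTable): (Option[String], Option[String]) =
  (table.storage.properties.get(OPTION_GRID),
    table.storage.properties.get(OPTION_TABLE))
{code}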
{quote}Can we move {{IgniteSparkSession}} to org.apache.ignite.spark{quote}
No, we can't, because many of the methods used inside {{IgniteSparkSession}} are
package private to {{org.apache.spark.sql}} (a simplified illustration follows the links below). For example:
* SQLContext constructor \[1\]: IgniteSparkSession#63
* SharedState class \[2\]: IgniteSparkSession#66
* Dataset object \[3\]: IgniteSparkSession#103
* Etc.
\[1\]
https://github.com/apache/spark/blob/v2.2.0/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L58
\[2\]
https://github.com/apache/spark/blob/v2.2.0/sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala#L42
\[3\]
https://github.com/apache/spark/blob/v2.2.0/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L59
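A minimal illustration of the visibility problem (the class body is illustrative only, not taken from {{IgniteSparkSession}}):
{code:scala}
// This file must be declared inside org.apache.spark.sql: in Spark 2.2.0 the
// SQLContext constructor referenced in [1] is private[sql], so it is visible
// only from within this package.
package org.apache.spark.sql

class PackagePrivateAccessExample(spark: SparkSession) {
  // Compiles here; outside org.apache.spark.sql this constructor call is rejected.
  private val ctx: SQLContext = new SQLContext(spark)
}
{code}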
> Spark Data Frames Support in Apache Ignite
> ------------------------------------------
>
> Key: IGNITE-3084
> URL: https://issues.apache.org/jira/browse/IGNITE-3084
> Project: Ignite
> Issue Type: Task
> Components: spark
> Affects Versions: 1.5.0.final
> Reporter: Vladimir Ozerov
> Assignee: Nikolay Izhikov
> Priority: Critical
> Labels: bigdata, important
> Fix For: 2.4
>
>
> Apache Spark already benefits from integration with Apache Ignite. The latter
> provides shared RDDs, an implementation of Spark RDD, that help Spark share
> state between Spark workers and execute SQL queries much faster. The next
> logical step is to enable support for the modern Spark Data Frames API in a
> similar way.
> As a contributor, you will be fully in charge of the integration of Spark
> Data Frame API and Apache Ignite.