[
https://issues.apache.org/jira/browse/IGNITE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289162#comment-16289162
]
Nikolay Izhikov commented on IGNITE-3084:
-----------------------------------------
> add {{CONFIG}} to allow providing {{IgniteConfiguration}} object.
We can't do that: the params are a {{Map[String, String]}}, so no objects are
allowed.
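The constraint above follows from the types alone. A minimal sketch (the option names here are illustrative, not necessarily the final ones):

```scala
// Spark DataSource options are plain string key/value pairs, so an
// IgniteConfiguration instance cannot be passed through them directly.
// Only string values, such as a file path, fit the Map[String, String] type.
val options: Map[String, String] = Map(
  "CONFIG_FILE" -> "/etc/ignite/ignite-config.xml", // a path works
  "TABLE"       -> "person"
)

// options + ("CONFIG" -> igniteConfigurationObject) // would not compile:
//   type mismatch, the value must be a String
```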
> I think we should leave only {{IGNITE}}, {{CONFIG_FILE}}, {{TABLE}} options
> .... {{GRID}}, {{TCP_IP_ADDRESSES}} and {{PEER_CLASS_LOADING}} should be
> removed.
Done.
Please note some details about the usage of the {{CONFIG_FILE}} option.
For now, the configuration file is required to exist on *each Spark worker
node at the same path* (please see {{IgniteContext}} and the {{cfgF}} closure).
We can't just ship the configuration from the master node to the worker nodes,
because {{IgniteConfiguration}} is not serializable.
I think that requiring files to be copied to the local filesystem of every
cluster node is a huge disadvantage.
Does it make sense for you?
Have I missed something?
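The closure-based workaround used by {{IgniteContext}} can be sketched as follows. The {{NotSerializableConfig}} stub stands in for {{IgniteConfiguration}}; all names in this sketch are illustrative:

```scala
// IgniteConfiguration is not serializable, so IgniteContext ships a
// factory closure (cfgF) instead of the configuration object itself.
// Each worker invokes the closure and builds its own configuration
// from a file that must already exist locally at the same path.
class NotSerializableConfig(val cfgPath: String) // stand-in for IgniteConfiguration

// The closure is small and serializable; the heavyweight object it
// produces is created on the receiving side.
val cfgF: () => NotSerializableConfig =
  () => new NotSerializableConfig("/etc/ignite/ignite-config.xml")

// On a worker: build the configuration locally.
val workerCfg = cfgF()
```

This is why the file must sit at the same path on every node: the closure hard-codes the path, and each worker resolves it against its own filesystem.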
I tried to provide a simpler way to connect to an existing Ignite cluster
from a Spark worker node and came up with the {{TCP_IP_ADDRESSES}} parameter.
With it, we can just add the Ignite jars to the task classpath via
{{sparkContext.addJar}} and use any existing Spark cluster to execute jobs
with Ignite DataFrames.
Please, see my example:
https://github.com/nizhikov/ignite-spark-df-example/blob/master/src/main/scala/org/apache/ignite/scalar/examples/spark/StandaloneClustersExample.scala
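With the address-based approach, everything a worker needs is a plain string, so nothing has to be copied to worker filesystems. A hedged sketch (the addresses and option names are illustrative):

```scala
// Discovery addresses of an existing Ignite cluster, passed as one
// comma-separated string option -- no config file on the workers.
val dfOptions = Map(
  "TCP_IP_ADDRESSES" -> "10.0.0.1:47500,10.0.0.2:47500",
  "TABLE"            -> "person"
)

// Each worker parses the addresses back out of the string option.
val addresses = dfOptions("TCP_IP_ADDRESSES").split(",").toSeq

// spark.read.format("ignite").options(dfOptions).load() // on a live Spark session
```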
> Spark Data Frames Support in Apache Ignite
> ------------------------------------------
>
> Key: IGNITE-3084
> URL: https://issues.apache.org/jira/browse/IGNITE-3084
> Project: Ignite
> Issue Type: Task
> Components: spark
> Affects Versions: 1.5.0.final
> Reporter: Vladimir Ozerov
> Assignee: Nikolay Izhikov
> Priority: Critical
> Labels: bigdata, important
> Fix For: 2.4
>
>
> Apache Spark already benefits from integration with Apache Ignite. The latter
> provides shared RDDs, an implementation of Spark RDD, which let Spark workers
> share state and execute SQL queries much faster. The next logical step is to
> enable support for the modern Spark Data Frames API in a similar way.
> As a contributor, you will be fully in charge of the integration of Spark
> Data Frame API and Apache Ignite.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)