pragmaticbigdata wrote
> Yes but the data would not be in sync when both(updates and analytics) are
> done concurrently, right? I will have to discard the spark
> rdd/dataset/dataframe every time the data is updated in ignite through the
> Ignite API. As I understand the data remains in sync only when we use the
> IgniteRDD api. Correct me if my understanding is wrong.

The data will stay in sync because it is always stored in the Ignite cache.
IgniteRDD uses the Ignite API under the hood to update it, and you can do the
same directly in your own code — both paths write to the same cache.
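To illustrate, here is a rough sketch (not tested against your setup — it assumes a running Ignite cluster, Ignite 1.x-style APIs, and a cache named "partitioned"; adjust types and configuration to your deployment):

```scala
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.IgniteContext
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("ignite-sync-sketch"))
val ic = new IgniteContext[Int, String](sc, () => new IgniteConfiguration())

val igniteRdd = ic.fromCache("partitioned")

// Write through the Spark-facing IgniteRDD API...
igniteRdd.savePairs(sc.parallelize(1 to 3).map(i => (i, s"value-$i")))

// ...or directly through the Ignite API; both hit the same cache.
val cache = ic.ignite().cache[Int, String]("partitioned")
cache.put(4, "value-4")

// A subsequent action on igniteRdd will see both writes, because the data
// lives only in the Ignite cache — there is no stale Spark-side copy.
```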

pragmaticbigdata wrote
> I have an additional question on the same topic - Even when ignite runs in
> an embedded mode with spark, the memory footprint behavior is the same as
> it is when ignite runs in standalone mode, right? i.e When spark  fetches
> the ignite cache through the IgniteRDD api (val igniteRDD =
> igniteContext.fromCache("
> <cache-name>
> ") a copy of data is created in the spark worker's memory.

No copy of the data is maintained in Spark; it is always stored in Ignite
caches. Spark runs Ignite client(s) that fetch the data for computation, but
they don't store it.
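A minimal sketch of that setup, assuming a standalone Ignite cluster (the constructor signature and the client-mode flag may differ slightly between Ignite versions):

```scala
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.IgniteContext

// The Spark-side Ignite nodes join as clients, so Spark executors hold no data.
val ic = new IgniteContext[Int, String](sc, () => {
  val cfg = new IgniteConfiguration()
  cfg.setClientMode(true) // client nodes compute over the cache but never store it
  cfg
})

// Each action on this RDD fetches live data from the Ignite server nodes.
val igniteRdd = ic.fromCache("partitioned")
```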

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Apache-Spark-Ignite-Integration-tp8556p9114.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
