I have a couple of questions about the Ignite Spark integration.
1. What consistency guarantees does Ignite provide when saving an RDD to the
data grid? For example, assume the Spark RDD holds 1 million records and I
call the sharedRdd.savePairs() API.
a. What happens if a Spark worker crashes after a few thousand records have
already been saved? Would the data uploaded to the data grid before the
crash be rolled back?
b. What happens if one of the Ignite server nodes that is loading some of
the data crashes? Would the data already stored on the other nodes in the
data grid for this RDD be rolled back?
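
For concreteness, here is a minimal sketch of the save path the question is about, assuming the ignite-spark module is on the classpath; the config file name ("example-shared-rdd.xml"), cache name, and record count are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.ignite.spark.IgniteContext

val sc = new SparkContext(new SparkConf().setAppName("save-pairs"))
// IgniteContext bridges the Spark cluster and the Ignite data grid.
val igniteContext = new IgniteContext(sc, "example-shared-rdd.xml")

// IgniteRDD view over the "partitioned" cache.
val sharedRdd = igniteContext.fromCache[Int, String]("partitioned")

// Save ~1 million pairs. Each Spark partition streams its own keys to the
// grid, which is why a mid-job crash raises the partial-upload question.
val pairs = sc.parallelize(1 to 1000000, 16).map(i => (i, "value-" + i))
sharedRdd.savePairs(pairs)
```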
2. When updating data in the data grid through the RDD API, can Ignite
smartly determine which data has changed (by comparing with the previous
data in the data grid) and update only those partitions? Assume the RDD was
previously loaded from the data grid through the
igniteContext.fromCache("partitioned") API.
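
The read-modify-write round trip being asked about would look roughly like this sketch (the filter predicate and replacement value are hypothetical, and igniteContext is assumed to be an existing IgniteContext as in the first question):

```scala
// Load the current contents of the cache as an IgniteRDD.
val cacheRdd = igniteContext.fromCache[Int, String]("partitioned")

// Derive an updated RDD; only some pairs actually differ from the grid.
val updated = cacheRdd
  .filter { case (_, v) => v.startsWith("stale") } // hypothetical predicate
  .map { case (k, _) => (k, "fresh") }

// Write back: does Ignite diff this against the cache, or overwrite
// every key present in the RDD?
cacheRdd.savePairs(updated)
```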
Thanks.
--
View this message in context:
http://apache-ignite-users.70518.x6.nabble.com/Consistency-Guarantees-Smart-Updates-with-Spark-Integration-tp10091.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.