OK. Thanks a lot, TD.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Does-RDD-checkpointing-store-the-entire-state-in-HDFS-tp7368p13231.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
---
Do I have
to use any code like ssc.checkpoint(checkpointDir)? Also, how is the
performance if I use both DStream checkpointing for maintaining the state
and the Kafka direct approach for exactly-once semantics?
Thanks,
Swetha
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Regarding-sessionization-with-updateStateByKey-tp13226.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
---
Hi,
Suppose I want the data to be grouped by an Id named "12345". A certain
amount of data for "12345" comes out of one batch, and more data related to
"12345" arrives 5 hours later. How do I group by "12345" and have all of it
in a single RDD as a list?
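One way to picture the grouping asked about above is a per-key state that is carried from batch to batch, which is the idea behind updateStateByKey. The following is only a minimal plain-Python sketch of that idea, not Spark code: the names update_events and apply_batch and the sample values are made up for illustration. In a real streaming job the update function would be handed to updateStateByKey, which requires a checkpoint directory to be set with ssc.checkpoint(checkpointDir).

```python
from typing import Dict, List, Optional


def update_events(new_values: List[str], state: Optional[List[str]]) -> List[str]:
    """Append this batch's values for one key to whatever was stored before.

    Same shape as an updateStateByKey update function:
    (new values in this batch, previous state or None) -> new state.
    """
    return (state or []) + list(new_values)


def apply_batch(state: Dict[str, List[str]],
                batch: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Hypothetical stand-in for one stateful step; no Spark is involved.

    Every key that already has state is updated, even when the current
    batch carries no new values for it.
    """
    keys = set(state) | set(batch)
    return {k: update_events(batch.get(k, []), state.get(k)) for k in keys}


# Three batches; "12345" appears in the first and again much later.
state: Dict[str, List[str]] = {}
state = apply_batch(state, {"12345": ["a", "b"], "999": ["x"]})
state = apply_batch(state, {"999": ["y"]})
state = apply_batch(state, {"12345": ["c"]})

print(state["12345"])  # ['a', 'b', 'c']
```

The late data for "12345" ends up in the same list as the earlier data because the state outlives the batch that created it. A state that only ever grows like this would need some expiry rule in practice, which this sketch leaves out.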
Hi,
What happens if a master node fails in the case of Spark Streaming? Would
the data be lost in that case?
Thanks,
Swetha
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Regarding-master-node-failure-tp13055.html
Sent from the Apache Spark