Hi. I know that every RDD received in a DStream is replicated to 2 nodes by default. However, if I choose a large batchDuration (say 5 min), is the data received within that interval also stored reliably? If so, how? As far as I know, it is the RDDs that are stored reliably (once an RDD has its complete data for the batchDuration).
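To make the question concrete, here is a toy model in plain Python (not Spark code) of the mechanism as I understand it from the Spark Streaming documentation: a receiver cuts incoming data into small blocks every block interval, each block is stored with the configured replication as soon as it is formed, and the RDD for a batch is then assembled from the blocks that arrived during that batch. The 200 ms block interval and the replication factor of 2 are assumptions based on the usual defaults (`spark.streaming.blockInterval` and a `_2` storage level), and `Block` and `simulate` are hypothetical names used only for illustration.

```python
# Toy model, NOT Spark code: receiver blocks vs. batch RDDs.
# Block and simulate are hypothetical names made up for this sketch.
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

BLOCK_INTERVAL_MS = 200             # assumed default of spark.streaming.blockInterval
BATCH_DURATION_MS = 5 * 60 * 1000   # the 5 min batchDuration from the question
REPLICAS = 2                        # assumed: a "_2" storage level


@dataclass
class Block:
    block_id: int
    records: List[str]
    replicas: int


def simulate(records: Iterable[Tuple[int, str]]) -> Dict[int, List[Block]]:
    """Group (arrival_ms, payload) records into blocks, then blocks into batches."""
    # Step 1: the receiver chunks data by block interval.
    by_block: Dict[int, List[str]] = {}
    for arrival_ms, payload in records:
        by_block.setdefault(arrival_ms // BLOCK_INTERVAL_MS, []).append(payload)

    # Step 2: each finished block is stored with REPLICAS copies right away,
    # and is assigned to the batch whose time window contains it.
    batches: Dict[int, List[Block]] = {}
    for block_id in sorted(by_block):
        block = Block(block_id, by_block[block_id], REPLICAS)
        batch_index = (block_id * BLOCK_INTERVAL_MS) // BATCH_DURATION_MS
        batches.setdefault(batch_index, []).append(block)
    return batches


events = [(0, "a"), (150, "b"), (250, "c"), (299_999, "d"), (300_000, "e")]
batches = simulate(events)
for batch_index, blocks in batches.items():
    print(batch_index, [(b.records, b.replicas) for b in blocks])
```

If this model is right, data would not wait unreplicated for the full 5 minutes: only the records of the block currently being filled (at most one block interval's worth) exist in a single place. Whether that matches the actual behaviour for a given receiver and Spark version is exactly what I would like confirmed.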
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/data-within-batchduration-in-RDD-of-a-Dstream-reliable-tp835.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
