Hello Spark Streaming Experts I have a use-case, where I have a bunch of log-entries coming in, say every 10 seconds (Batch-interval). I create a JavaPairDStream[K,V] from these log-entries. Now, there are two things I want to do with this JavaPairDStream:
1. Use key-dependent state (updated by updateStateByKey) to apply a transformation function on the JavaPairDStream[K, V]. I know that we get a JavaPairDStream[K, S] as return value of updateStateByKey. However, I can't possibly pass a JavaPairDStream to a transformation function, nor can I convert JavaPairDStream[K,S] to let's say a HashMap<K,S> (Or is there a way to do this?). Even if I could convert it to a HashMap<K,S>, could I really pass it to a transformation function, since this HashMap<K,S> changes after every batch computation? 2. Update key-dependent state using Iterable<V>: This should be easily doable using updateStateByKey. In a nutshell, how do I access the very state updated by updateStateByKey for applying let's say a map function on the JavaPairDStream[K,V]. Note that I am not using any sliding windows at all. Just plain batches. Thanks Gaurav -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Accessing-the-per-key-state-maintained-by-updateStateByKey-for-transformation-of-JavaPairDStream-tp7680.html Sent from the Apache Spark User List mailing list archive at Nabble.com.