I should note up front that I'm a newbie to Spark, so please bear with me. I'm trying to convert an existing application over to Spark and am running into some "high level" questions that I can't seem to resolve -- possibly because what I'm trying to do is not supported.
In a nutshell: as I process the individual elements of an RDD, I want to save away some calculations whose results, for all intents and purposes, fit nicely into a HashMap. I'd then like to get access to that HashMap so I can use and update it as I process element 2, and naturally I want it available for the remaining elements of the RDD, and even across RDDs. I mention a HashMap in this example, but it could be any arbitrary object.

So, broadly speaking, I'm looking at maintaining some state across the elements of a JavaDStream. (The state information can get large, but I will be partitioning the DStream by hashing on a key; I don't think that is relevant to the question being asked, though.) I'd like to do this while I'm transforming one RDD into another as part of a JavaDStream transform or map type operation.
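To make the idea concrete, here is a pure-Java sketch (no Spark; the class and method names are hypothetical, just for illustration) of the kind of per-key state update I have in mind: for each new element, look up the key's existing state, fold the element in, and keep the updated state around for later elements and later batches.

```java
import java.util.HashMap;
import java.util.Map;

public class StateSketch {
    // Per-key running state. In Spark this is the part I don't know how
    // to do: the map would have to survive across elements of an RDD
    // and across the RDDs of successive batches.
    private final Map<String, Long> stateByKey = new HashMap<>();

    // Fold one element into the state for its key and return the new state.
    public long update(String key, long value) {
        long next = stateByKey.getOrDefault(key, 0L) + value;
        stateByKey.put(key, next);
        return next;
    }

    public static void main(String[] args) {
        StateSketch state = new StateSketch();
        state.update("a", 1); // state for "a" starts at 1
        state.update("a", 2); // later element for "a" sees and updates it
        state.update("b", 5); // independent state for key "b"
        System.out.println(state.stateByKey());
    }

    // Read-only view, just so the demo above can print the state.
    public Map<String, Long> stateByKey() {
        return new HashMap<>(stateByKey);
    }
}
```

A running count is just a stand-in here; the real state could be any arbitrary object, as mentioned above.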