Re: persistent state for spark streaming

2014-10-02 Thread Yana Kadiyska
Yes -- persist is more akin to caching -- it's telling Spark to materialize that RDD for fast reuse, but it's not meant for the end user to query/use across processes, etc. (at least that's my understanding).

On Thu, Oct 2, 2014 at 4:04 AM, Chia-Chun Shih wrote:
> Hi Yana,
>
> So, user quotas need

Re: persistent state for spark streaming

2014-10-02 Thread Chia-Chun Shih
Hi Yana,

So, user quotas need another data store, one that can guarantee persistence and handle frequent data updates and access. Is that correct?

Thanks,
Chia-Chun

2014-10-01 21:48 GMT+08:00 Yana Kadiyska:
> I don't think persist is meant for end-user usage. You might want to call
> saveAsTextFiles, f

Re: persistent state for spark streaming

2014-10-01 Thread Yana Kadiyska
I don't think persist is meant for end-user usage. You might want to call saveAsTextFiles, for example, if you're saving to the file system as strings. You can also dump the DStream to a DB -- there are samples on this list (you'd likely have to do a combo of foreachRDD and mapPartitions).

On Wed,
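
For what it's worth, here is a rough sketch of the "dump the DStream to a DB" idea, written against the PySpark streaming API. It is not taken from the samples mentioned above. Assumptions: the stream carries (user_id, used) pairs; sqlite3 is only a stand-in for whatever external store you actually use (a local file would not be shared across workers); and the table name quotas and the helpers save_partition and attach_sink are made up for illustration. It uses foreachPartition rather than mapPartitions, since the write is a side effect and nothing needs to be returned.

```python
import sqlite3


def save_partition(records, db_path):
    """Write one partition of (user_id, used) pairs over a single connection."""
    # sqlite3 is a stand-in here; swap in the client for your real data store.
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS quotas "
            "(user_id TEXT PRIMARY KEY, used INTEGER)"
        )
        # One connection and one batched statement per partition,
        # rather than opening a connection for every record.
        conn.executemany(
            "INSERT OR REPLACE INTO quotas (user_id, used) VALUES (?, ?)",
            records,
        )
        conn.commit()
    finally:
        conn.close()


def attach_sink(dstream, db_path):
    """Register the per-partition writer on every micro-batch of the stream."""
    dstream.foreachRDD(
        lambda rdd: rdd.foreachPartition(
            lambda part: save_partition(part, db_path)
        )
    )
```

You would call attach_sink(stream, path) once before starting the streaming context. The connection is opened inside the partition function so that nothing non-serializable has to be shipped from the driver to the workers.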