Re: spark streaming with checkpoint

2015-01-25 Thread Balakrishnan Narendran
Yeah, use streaming to gather the incoming logs and write them to a log file, then run a Spark job every 5 minutes to process the counts. Got it. Thanks a lot. On 07:07, Mon, 26 Jan 2015 Tobias Pfeiffer wrote: > Hi, > > On Tue, Jan 20, 2015 at 8:16 PM, balu.naren wrote: > >> I am a beginner to spark
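The pattern agreed on above (stream the raw logs to files, then batch-compute the counts periodically) can be sketched without Spark at all. A minimal illustration of the batch step follows; the tab-separated log format and field names are assumptions for the example, not taken from the thread:

```python
from collections import defaultdict

def daily_unique_users(lines):
    """Count distinct user ids per calendar day from an iterable of log lines.

    Assumed (hypothetical) line format: "<ISO timestamp>\t<user_id>".
    """
    seen = defaultdict(set)
    for line in lines:
        ts, user = line.rstrip("\n").split("\t")
        day = ts[:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        seen[day].add(user)
    return {day: len(users) for day, users in seen.items()}

logs = [
    "2015-01-20T10:00:00\talice",
    "2015-01-20T11:30:00\tbob",
    "2015-01-20T12:00:00\talice",
    "2015-01-21T09:00:00\tcarol",
]
print(daily_unique_users(logs))  # → {'2015-01-20': 2, '2015-01-21': 1}
```

Because each 5-minute run reads the accumulated files, nothing has to be held in streaming state across the 24-hour day, which is exactly what sidesteps the memory growth discussed below.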

Re: spark streaming with checkpoint

2015-01-25 Thread Tobias Pfeiffer
Hi, On Tue, Jan 20, 2015 at 8:16 PM, balu.naren wrote: > I am a beginner to spark streaming, so I have a basic doubt regarding > checkpoints. My use case is to calculate the number of unique users by day. I > am using reduce by key and window for this, where my window duration is 24 > hours and slide

RE: spark streaming with checkpoint

2015-01-22 Thread Shao, Saisai
Thank you Jerry. Does the window operation create new RDDs for each slide duration? I am asking this because I see a constant increase in memory even when no logs are received. If not checkpointing, is there any alternative that you would suggest? On Tue, Jan 20

Re: spark streaming with checkpoint

2015-01-22 Thread Jörn Franke
Maybe you are using the wrong approach - try something like HyperLogLog or bitmap structures as you can find them, for instance, in Redis. They are much smaller. On 22 Jan 2015 at 17:19, "Balakrishnan Narendran" wrote: > Thank you Jerry, > Does the window operation create new RDDs for each slide
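Jörn's suggestion trades exactness for bounded memory: a HyperLogLog counts distinct items in a few KB regardless of cardinality. A minimal, self-contained sketch of the idea follows (this is not Redis's implementation; the register count and hash choice are illustrative):

```python
import hashlib
import math

class HyperLogLog:
    """Approximate distinct counter using m = 2**p small registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash of the item.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                      # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)         # bias correction for large m
        z = sum(2.0 ** -r for r in self.registers)
        estimate = alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:        # small-range linear counting
            estimate = self.m * math.log(self.m / zeros)
        return int(estimate)

hll = HyperLogLog()
for i in range(10000):
    hll.add(f"user-{i}")
print(hll.count())  # within a few percent of 10000
```

Redis exposes the same idea through PFADD and PFCOUNT, keeping each counter at roughly 12 KB with a standard error around 0.81%, which is why it is "much smaller" than caching a day's worth of raw events.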

Re: spark streaming with checkpoint

2015-01-22 Thread Balakrishnan Narendran
Thank you Jerry. Does the window operation create new RDDs for each slide duration? I am asking this because I see a constant increase in memory even when no logs are received. If not checkpointing, is there any alternative that you would suggest? On Tue, Jan 20, 2015 at 7:08 PM, Shao,

RE: spark streaming with checkpoint

2015-01-20 Thread Shao, Saisai
Hi, Seems you have such a large window (24 hours), so the phenomenon of increasing memory usage is expected, because the window operation will cache the RDDs within this window in memory. So for your requirement, memory should be large enough to hold 24 hours of data. I don't think checkpoint in Spark
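A quick back-of-the-envelope calculation makes Jerry's point concrete. The event rate and record size below are invented for illustration, not figures from the thread:

```python
# Hypothetical sizing: 500 events/sec at ~200 bytes per cached record,
# held for a full 24-hour window.
events_per_sec = 500
bytes_per_record = 200
window_sec = 24 * 3600

total_gb = events_per_sec * bytes_per_record * window_sec / 1e9
print(f"{total_gb:.1f} GB")  # ≈ 8.6 GB held in memory for the window
```

Spark Streaming's reduceByKeyAndWindow also has an overload that takes an inverse reduce function, letting it subtract the data sliding out of the window instead of recomputing from scratch, but even then the window's worth of state must be retained (and checkpointing becomes mandatory for that variant), so a large window still implies large memory.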