Any pointers, guys? To make the questions below concrete, I've appended a minimal sketch of the pipeline I mean after the quoted mail.

On Tue, Nov 25, 2014 at 5:32 PM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
> Hey Experts,
>
> I wanted to understand in detail the lifecycle of RDD(s) in a streaming app.
>
> From my current understanding:
> - An RDD gets created out of the realtime input stream.
> - Transformations are applied in a lazy fashion on the RDD to turn it into other RDD(s).
> - Actions are taken on the final transformed RDDs to get the data out of the system.
>
> Also, RDD(s) are stored in the cluster's RAM (or on disk, if configured so) and are cleaned up in LRU fashion.
>
> So I have the following questions on the same:
> - How does Spark (Streaming) guarantee that all the actions are taken on each input RDD/batch?
> - How does Spark determine that the lifecycle of an RDD is complete? Is there any chance that an RDD will be cleaned out of RAM before all actions are taken on it?
>
> Thanks in advance for all your help. Also, I'm relatively new to Scala & Spark, so pardon me in case these are naive questions/assumptions.
>
> --
> Thanks & Regards,
>
> *Mukesh Jha <me.mukesh....@gmail.com>*