Thanks for confirming!

On Wed, Apr 1, 2015 at 12:33 PM, Tathagata Das <[email protected]> wrote:
> In the current state yes there will be performance issues. It can be done
> much more efficiently and we are working on doing that.
>
> TD
>
> On Wed, Apr 1, 2015 at 7:49 AM, Vinoth Chandar <[email protected]> wrote:
>
>> Hi all,
>>
>> As I understand from docs and talks, the streaming state is in memory as
>> RDD (optionally checkpointable to disk). SPARK-2629 hints that this in
>> memory structure is not indexed efficiently?
>>
>> I am wondering how my performance would be if the streaming state does
>> not fit in memory (say 100GB state over 10GB total RAM), and I did random
>> updates to different keys via updateStateByKey? (Would throwing in SSDs
>> help out).
>>
>> I am picturing some kind of performance degeneration would happen akin to
>> Linux/innoDB Buffer cache thrashing. But if someone can demystify this,
>> that would be awesome.
>>
>> Thanks
>> Vinoth
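
For anyone following the thread, here is a rough sketch of why the lack of indexing matters. It is a toy model in plain Python, not Spark code: the helper names (update_state_by_key, running_sum) and the data are hypothetical, and it assumes only the documented behaviour that the update function of updateStateByKey is applied to every key already in the state on each batch, whether or not that key received new data.

```python
from collections import defaultdict

def update_state_by_key(state, batch, update_fn):
    """Toy model (assumption, not Spark's implementation): update_fn runs
    for every key in the previous state or the batch, not only for keys
    that received new data."""
    grouped = defaultdict(list)
    for key, value in batch:
        grouped[key].append(value)

    new_state = {}
    calls = 0
    for key in state.keys() | grouped.keys():
        calls += 1
        result = update_fn(grouped.get(key, []), state.get(key))
        if result is not None:  # returning None drops the key
            new_state[key] = result
    return new_state, calls

def running_sum(new_values, old):
    # Hypothetical update function: keep a running total per key.
    return (old or 0) + sum(new_values)

# 1000 keys of existing state, but the batch touches only two of them.
state = {f"k{i}": 1 for i in range(1000)}
batch = [("k3", 5), ("k7", 2), ("k3", 1)]
touched_keys = {key for key, _ in batch}

state, calls = update_state_by_key(state, batch, running_sum)
print(len(touched_keys), calls)
```

In this model the per-batch work grows with the total number of keys in the state rather than with the number of keys updated, so a state much larger than RAM would presumably be scanned in full on every batch instead of being hit by a few random lookups.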
