In its current state, yes, there will be performance issues. It can be done much more efficiently, and we are working on that.
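
For illustration only, here is a minimal pure-Python sketch (not Spark code) of why the size of the state matters. It assumes the behavior described in the Spark Streaming programming guide: in every batch, updateStateByKey applies the update function to all existing keys, whether or not they received new data. The helper update_state_full_scan, the running_sum function, and the sample keys are hypothetical names made up for this sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

UpdateFn = Callable[[List[int], Optional[int]], Optional[int]]


def update_state_full_scan(
    state: Dict[str, int],
    batch: List[Tuple[str, int]],
    update_fn: UpdateFn,
) -> Tuple[Dict[str, int], int]:
    """Hypothetical model of one updateStateByKey-style batch step.

    Returns the new state and the number of keys the update function
    was called on.
    """
    # Group the new values of this batch by key.
    grouped: Dict[str, List[int]] = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)

    new_state: Dict[str, int] = {}
    touched = 0
    # Every key in the old state is visited, not only the keys in the batch.
    for key in state.keys() | grouped.keys():
        touched += 1
        result = update_fn(grouped.get(key, []), state.get(key))
        if result is not None:  # returning None drops the key
            new_state[key] = result
    return new_state, touched


def running_sum(new_values: List[int], old: Optional[int]) -> Optional[int]:
    # Keep a running total per key.
    return (old or 0) + sum(new_values)


# 1000 keys of existing state, but only two distinct keys in the batch.
state = {f"user{i}": 1 for i in range(1000)}
batch = [("user3", 5), ("user7", 2), ("user3", 1)]
state, touched = update_state_full_scan(state, batch, running_sum)
print(touched, state["user3"], state["user7"])
```

In this model the cost of a batch grows with the total number of keys in the state rather than with the number of keys updated, so state that does not fit in memory would have to be read back in full on every batch.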
TD

On Wed, Apr 1, 2015 at 7:49 AM, Vinoth Chandar <vin...@uber.com> wrote:
> Hi all,
>
> As I understand from docs and talks, the streaming state is in memory as
> RDD (optionally checkpointable to disk). SPARK-2629 hints that this in
> memory structure is not indexed efficiently?
>
> I am wondering how my performance would be if the streaming state does not
> fit in memory (say 100GB state over 10GB total RAM), and I did random
> updates to different keys via updateStateByKey? (Would throwing in SSDs
> help out).
>
> I am picturing some kind of performance degeneration would happen akin to
> Linux/innoDB Buffer cache thrashing. But if someone can demystify this,
> that would be awesome.
>
> Thanks
> Vinoth