itsvikramagr commented on issue #24922: [SPARK-28120][SS] Rocksdb state storage implementation
URL: https://github.com/apache/spark/pull/24922#issuecomment-505062488

Thanks @gaborgsomogyi
- I will fix the style problems ASAP and update the PR.
- In my test setup, I was able to scale to more than 250 million keys using just 2 i3.xlarge executor nodes by running a group-by aggregation query on a campaign data source generated using the rate source. I stopped my experiment after 5 hours. GC time was about 1.5% of the total task time (see attached). In the same setup, the default implementation crashed after creating 35 million new state keys.
- I ran my experiments with varying load and under different stress conditions. Please recommend more scenarios that you think I should test.

<img width="1435" alt="executor-memory-usage" src="https://user-images.githubusercontent.com/5220941/60031825-0baa2580-96c3-11e9-83aa-8e01311f5530.png">
<img width="708" alt="state-store-rows" src="https://user-images.githubusercontent.com/5220941/60032007-59269280-96c3-11e9-97ed-65dcc3323870.png">

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
