Microbatch is 20 seconds. We’re not using window operations. The graphs are for a test cluster, and the entire load is artificially generated by load tests (100k / 200k generated sessions).
We’ve performed a few more performance tests. On the same 5-node cluster, with the same application:
· Spark 1.5.1 handled 170k+ generated sessions for 24 hours with no scheduling delay – the limit seems to be around 180k, above which scheduling delay starts to increase;
· Spark 1.6.1 had a constantly upward-trending scheduling delay from the beginning for 100k+ generated sessions (this is also mentioned in the initial post) – the load test was stopped after 25 minutes, as scheduling delay had reached 3.5 minutes.

P.S. Florin and I will be in SF next week, attending the Spark Summit on Tuesday and Wednesday. We can meet and go into more details there - is anyone working on Spark Streaming available?

Cosmin

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Saturday 4 June 2016 at 12:33
To: Florin Broască <florin.broa...@gmail.com>
Cc: David Newberger <david.newber...@wandcorp.com>, Adrian Tanase <atan...@adobe.com>, "user@spark.apache.org" <user@spark.apache.org>, ciobanu <ciob...@adobe.com>
Subject: Re: [REPOST] Severe Spark Streaming performance degradation after upgrading to 1.6.1

Batch interval, I meant. Thx

Dr Mich Talebzadeh
LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 4 June 2016 at 10:32, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

I may have missed these, but:
What are the window interval, windowsLength and SlidingWindow?
Has the volume of ingest data (Kafka streaming) changed recently in a way that you may not be aware of?

HTH

Dr Mich Talebzadeh

On 4 June 2016 at 09:50, Florin Broască <florin.broa...@gmail.com> wrote:

Hi David,

Thanks for looking into this.
This is what the processing time looks like: [inline image 1]

Appreciate any input,
Florin

On Fri, Jun 3, 2016 at 3:22 PM, David Newberger <david.newber...@wandcorp.com> wrote:

What does your processing time look like? Is it consistently within that 20 sec micro batch window?

David Newberger

From: Adrian Tanase [mailto:atan...@adobe.com]
Sent: Friday, June 3, 2016 8:14 AM
To: user@spark.apache.org
Cc: Cosmin Ciobanu
Subject: [REPOST] Severe Spark Streaming performance degradation after upgrading to 1.6.1

Hi all,

Trying to repost this question from a colleague on my team, somehow his subscription is not active:
http://apache-spark-user-list.1001560.n3.nabble.com/Severe-Spark-Streaming-performance-degradation-after-upgrading-to-1-6-1-td27056.html

Appreciate any thoughts,
-adrian
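
For readers following the thread: the question of whether processing time stays within the 20-second micro batch matters because any excess carries over to the next batch as scheduling delay. Below is a minimal sketch of that relationship in plain Python (not Spark code). It assumes batches are submitted at a fixed interval and processed strictly one at a time, in order. The function name and the sample numbers are hypothetical and are not measurements from this thread.

```python
def scheduling_delays(batch_interval_s, processing_times_s):
    """Return the scheduling delay, in seconds, of each batch.

    Assumption: batch i is submitted at i * batch_interval_s and cannot
    start until the previous batch has finished (one batch at a time).
    """
    delays = []
    previous_finish = 0
    for i, processing_time in enumerate(processing_times_s):
        submitted_at = i * batch_interval_s
        # A batch starts when it is submitted or when the previous one
        # finishes, whichever comes later.
        started_at = max(submitted_at, previous_finish)
        delays.append(started_at - submitted_at)
        previous_finish = started_at + processing_time
    return delays


# Hypothetical processing times for a 20-second batch interval.
keeping_up = scheduling_delays(20, [18, 19, 18, 19])      # always under 20 s
falling_behind = scheduling_delays(20, [25, 25, 25, 25])  # always over 20 s

print(keeping_up)
print(falling_behind)
```

In this model the delay stays at zero while every batch finishes inside the interval, and it grows by the overrun (here 5 seconds per batch) as soon as processing time consistently exceeds the interval. That matches the shape of a steadily upward-trending scheduling delay, though it says nothing about why processing time would differ between Spark versions.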