Hi John, I did not notice anything unusual in your env variables.
However, what are the batch interval, the window length, and the sliding window interval? Also, how many messages does Kafka send in a typical batch interval?

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 10 June 2016 at 18:21, john.simon <john.si...@tapjoy.com> wrote:

> Hi all,
>
> I'm running Spark Streaming with Kafka Direct Stream, but after
> running for a couple of days, the batch processing time almost doubles.
> I didn't find any slowdown in the JVM GC logs, but I did find that the
> Spark broadcast variable reading time is increasing.
> Initially it takes less than 10ms, but after 3 days it takes more than
> 60ms. It's really puzzling since I don't use broadcast variables at
> all.
>
> My application needs to run 24/7, so I hope there's something I'm
> missing to correct this behavior.
>
> FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client
> mode.
> Attached is the Spark application environment settings file.
>
> --
> John Simon
>
> *environment.txt* (7K)
> <http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27138/0/environment.txt>
>
> ------------------------------
> View this message in context: Long Running Spark Streaming getting slower
> <http://apache-spark-user-list.1001560.n3.nabble.com/Long-Running-Spark-Streaming-getting-slower-tp27138.html>