Hi John,

I did not notice anything unusual in your env variables.

However, what are the batch interval, the window length and the sliding
window interval?

Also how many messages are sent by Kafka in a typical batch interval?
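
To show how these numbers relate, here is a minimal sketch in plain Python
(no Spark needed). The class, the function and all values are hypothetical
and only for illustration; replace them with your own settings. It relies on
the Spark Streaming rule that the window length and the sliding interval must
be multiples of the batch interval, and it assumes a steady message rate.

```python
from dataclasses import dataclass


@dataclass
class StreamingTimings:
    """Hypothetical holder for the three intervals asked about above."""
    batch_interval_s: int   # batch interval of the streaming context, seconds
    window_length_s: int    # window length, seconds
    slide_interval_s: int   # sliding interval, seconds

    def validate(self) -> None:
        # Both window parameters must be multiples of the batch interval.
        for name, value in (("window length", self.window_length_s),
                            ("sliding interval", self.slide_interval_s)):
            if value % self.batch_interval_s != 0:
                raise ValueError(
                    f"{name} ({value}s) is not a multiple of the "
                    f"batch interval ({self.batch_interval_s}s)")

    def batches_per_window(self) -> int:
        # Number of batches that each window spans.
        return self.window_length_s // self.batch_interval_s


def messages_per_batch(messages_per_second, batch_interval_s):
    # Rough estimate: assumes the Kafka ingest rate is constant.
    return messages_per_second * batch_interval_s


# Assumed example values, not taken from the original application.
timings = StreamingTimings(batch_interval_s=10,
                           window_length_s=60,
                           slide_interval_s=30)
timings.validate()
per_batch = messages_per_batch(5000, timings.batch_interval_s)
per_window = per_batch * timings.batches_per_window()
print(per_batch, per_window)
```

If the estimated messages per batch keep growing over the days, a longer
processing time would be expected regardless of the environment settings.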

HTH

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 10 June 2016 at 18:21, john.simon <john.si...@tapjoy.com> wrote:

> Hi all,
>
> I'm running Spark Streaming with Kafka Direct Stream, but after
> running a couple of days, the batch processing time almost doubles.
> I didn't find any slowdown in the JVM GC logs, but I did find that the
> Spark broadcast variable reading time is increasing.
> Initially it takes less than 10ms, but after 3 days it takes more than
> 60ms. It's really puzzling since I don't use broadcast variables at
> all.
>
> My application needs to run 24/7, so I hope there's something I'm
> missing to correct this behavior.
>
> FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client
> mode.
> Attached is the Spark application environment settings file.
>
> --
> John Simon
>
> environment.txt (7K)
> <http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27138/0/environment.txt>
>
