> Does it work as expected with smaller batch or smaller load? Could it be that
> it's accumulating too many events over 3 minutes?

Thanks for your input.  The 3 minute window was chosen because we write the
output of each batch to S3, and with shorter batch intervals many small files
were being written to S3, which is something to avoid.  That was the
explanation given by the developer who made this decision (and who is no
longer on the team).  We're in the process of re-evaluating.
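
To put rough numbers on the small-files concern, here is a minimal sketch in
plain Java. It assumes each batch writes a fixed number of files (for example
one per output partition); the figure of 10 files per batch and the 10 second
comparison interval are hypothetical and are not taken from our job.

```java
// Rough estimate of how many files land in S3 per day for a given batch interval.
// Assumption: every batch writes the same number of files (e.g. one per partition).
final class SmallFileEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    /** Number of complete batches in one day for the given batch interval. */
    static long batchesPerDay(long batchIntervalSeconds) {
        if (batchIntervalSeconds <= 0) {
            throw new IllegalArgumentException("batch interval must be positive");
        }
        return SECONDS_PER_DAY / batchIntervalSeconds;
    }

    /** Files written per day if each batch writes filesPerBatch files. */
    static long filesPerDay(long batchIntervalSeconds, int filesPerBatch) {
        return batchesPerDay(batchIntervalSeconds) * filesPerBatch;
    }

    public static void main(String[] args) {
        int filesPerBatch = 10; // hypothetical: one file per partition, 10 partitions
        System.out.println("3 minute batches:  " + filesPerDay(180, filesPerBatch) + " files/day");
        System.out.println("10 second batches: " + filesPerDay(10, filesPerBatch) + " files/day");
    }
}
```

Under those assumptions a 3 minute interval gives 480 batches per day, while a
10 second interval gives 8,640, so the file count grows by the same factor of
18.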
--
     Nick

-----Original Message-----
From: Adrian Tanase [mailto:atan...@adobe.com]
Sent: Wednesday, October 28, 2015 4:53 PM
To: Afshartous, Nick <nafshart...@turbine.com>
Cc: user@spark.apache.org
Subject: Re: Spark/Kafka Streaming Job Gets Stuck

Does it work as expected with smaller batch or smaller load? Could it be that 
it's accumulating too many events over 3 minutes?

You could also try increasing the parallelism via repartition to ensure smaller 
tasks that can safely fit in working memory.
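
As a back-of-the-envelope check on both questions, the sketch below (plain
Java, no Spark dependency) works out how many events one batch holds at the
reported rate of 30,000 events per second over a 3 minute interval, and how
many events each task would see if the batch were spread evenly over a given
number of partitions. The partition counts are hypothetical, and an even
spread is an assumption.

```java
// Back-of-the-envelope sizing for one streaming batch.
// The rate and interval come from the thread; the partition counts are hypothetical.
final class BatchSizing {
    /** Events accumulated in one batch at a steady input rate. */
    static long eventsPerBatch(long eventsPerSecond, long batchIntervalSeconds) {
        return eventsPerSecond * batchIntervalSeconds;
    }

    /** Events per task if the batch is spread evenly over the partitions (rounded up). */
    static long eventsPerPartition(long eventsPerBatch, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        return (eventsPerBatch + partitions - 1) / partitions;
    }

    public static void main(String[] args) {
        long perBatch = eventsPerBatch(30_000, 180); // 30,000 events/s over 3 minutes
        System.out.println("events per batch: " + perBatch);
        for (int partitions : new int[] {10, 40, 160}) { // hypothetical partition counts
            System.out.println(partitions + " partitions -> "
                    + eventsPerPartition(perBatch, partitions) + " events per task");
        }
    }
}
```

That is 5.4 million events per batch. In the job itself the chosen count would
be passed to repartition(n) on the DStream ahead of the mapPartitionsToPair
call; the right value depends on event size and executor memory, which are not
known here.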


> On 28 Oct 2015, at 17:45, Afshartous, Nick <nafshart...@turbine.com> wrote:
>
>
> Hi, we are load testing our Spark 1.3 streaming job (reading from Kafka) and
> seeing a problem.  The job runs on AWS/YARN on a ten node cluster, and the
> streaming batch interval is set to 3 minutes.
>
> Testing at 30,000 events per second we are seeing the streaming job get stuck 
> (stack trace below) for over an hour.
>
> Thanks for any insights or suggestions.
> --
>      Nick
>
> org.apache.spark.streaming.api.java.AbstractJavaDStreamLike.mapPartitionsToPair(JavaDStreamLike.scala:43)
> com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.runStream(StreamingKafkaConsumerDriver.java:125)
> com.wb.analytics.spark.services.streaming.drivers.StreamingKafkaConsumerDriver.main(StreamingKafkaConsumerDriver.java:71)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:606)
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>
> Notice: This communication is for the intended recipient(s) only and may 
> contain confidential, proprietary, legally protected or privileged 
> information of Turbine, Inc. If you are not the intended recipient(s), please 
> notify the sender at once and delete this communication. Unauthorized use of 
> the information in this communication is strictly prohibited and may be 
> unlawful. For those recipients under contract with Turbine, Inc., the 
> information in this communication is subject to the terms and conditions of 
> any applicable contracts or agreements.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>


