Re: Timeout errors from Akka in Spark 1.2.1

2015-04-16 Thread N B
Hi Guillaume, Interesting that you brought up shuffle. In fact, we are experiencing exactly this issue: shuffle files are left behind and never cleaned up. Since this is a Spark Streaming application that is expected to stay up indefinitely, leftover shuffle files are a big problem right now.

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread Tathagata Das
There are a couple of options. Increase the timeout (see the Spark configuration docs); also see past mails on the mailing list. Another option you may try (I have a gut feeling that it may work, but I am not sure) is calling GC on the driver periodically. The cleanup is tied to GCing of RDD objects
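
A minimal sketch of the "call GC on the driver periodically" idea, assuming the driver is a JVM process whose startup code you control. The class name `PeriodicGc` and the interval are hypothetical and not from the thread, and `System.gc()` is only a hint that the JVM may ignore, so this is something to experiment with rather than a confirmed fix:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: asks the JVM for a garbage collection at a fixed interval. */
final class PeriodicGc {
    // Single daemon thread, so this helper never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc");
            t.setDaemon(true);
            return t;
        });

    // Counts how many GC requests have been issued (useful for logging or tests).
    private final AtomicLong requests = new AtomicLong();

    /** Starts requesting a GC every {@code period} units, after an initial delay of one period. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            System.gc(); // a request only; the JVM decides whether to collect
            requests.incrementAndGet();
        }, period, period, unit);
    }

    long requestCount() {
        return requests.get();
    }

    /** Stops issuing further GC requests. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

In a driver this might be started once next to the streaming context, for example `new PeriodicGc().start(10, TimeUnit.MINUTES)`. The ten-minute interval is an arbitrary assumption to tune against the window sizes in use.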

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread N B
Since we are running in local mode, won't all the executors be in the same JVM as the driver? Thanks, NB On Wed, Apr 8, 2015 at 1:29 PM, Tathagata Das t...@databricks.com wrote: It does take effect on the executors, not on the driver. Which is okay because executors have all the data and

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread N B
Thanks, TD. I believe that might have been the issue. We will try it for a few days after passing the GC option on the java command line when we start the process. Thanks for your timely help. NB On Wed, Apr 8, 2015 at 6:08 PM, Tathagata Das t...@databricks.com wrote: Yes, in local mode the

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread Tathagata Das
Yes, in local mode the driver and executor will be the same process. And in that case the Java options in the SparkConf configuration will not work. On Wed, Apr 8, 2015 at 1:44 PM, N B nb.nos...@gmail.com wrote: Since we are running in local mode, won't all the executors be in the same JVM
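
Because Java options set through SparkConf are not applied when driver and executor share one local-mode process, one way to confirm that a flag passed on the `java` command line actually reached the JVM is to inspect the runtime's input arguments. The sketch below uses only the standard `java.lang.management` API. The class name `JvmFlagCheck` is hypothetical, and the flag `-XX:+UseConcMarkSweepGC` is an illustrative assumption, since the thread does not say which GC option was used:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: checks which options the running JVM was started with. */
final class JvmFlagCheck {
    /** Returns the JVM options given on the command line (excludes arguments to main). */
    static List<String> currentJvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** True if the exact flag string appears among the given JVM arguments. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        String flag = "-XX:+UseConcMarkSweepGC"; // assumed example flag, not taken from the thread
        System.out.println(flag + " present: " + hasFlag(currentJvmArgs(), flag));
    }
}
```

Logging this once at startup of the local-mode process shows whether the option took effect, independent of anything set in SparkConf.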

Timeout errors from Akka in Spark 1.2.1

2015-04-07 Thread Nikunj Bansal
I have a standalone, local-mode Spark Streaming process where we are reading inputs using FlumeUtils. Our longest window size is 6 hours. After about a day and a half of running without any issues, we start seeing timeout errors while cleaning up input blocks. This seems to cause reading from Flume