Thanks, TD. I believe that might have been the issue. I will try it for a few days after passing the GC option on the java command line when we start the process; a rough sketch of what I have in mind is below.
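For reference, this is roughly the launch command I have in mind (the jar and class names are placeholders for our actual app); adding -XX:+PrintFlagsFinal, per the link you sent earlier, should confirm that the collector flag actually took effect:

    java -XX:+UseConcMarkSweepGC -XX:+PrintFlagsFinal \
        -cp spark-assembly.jar:our-app.jar com.example.OurStreamingApp

And in case the collector change alone is not enough, here is a rough sketch of the periodic driver-side GC you suggested (the 10-minute interval is just a first guess on my part):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class PeriodicGC {
        // Hint to the driver JVM to run a full GC at a fixed interval, so
        // that RDD/shuffle cleanup happens steadily instead of in bursts.
        public static void start() {
            ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable() {
                public void run() {
                    System.gc();
                }
            }, 10, 10, TimeUnit.MINUTES);
        }
    }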
Thanks for your timely help.
NB

On Wed, Apr 8, 2015 at 6:08 PM, Tathagata Das <t...@databricks.com> wrote:

> Yes, in local mode the driver and executor will be in the same process,
> and in that case the Java options in the SparkConf configuration will
> not work.
>
> On Wed, Apr 8, 2015 at 1:44 PM, N B <nb.nos...@gmail.com> wrote:
>
>> Since we are running in local mode, won't all the executors be in the
>> same JVM as the driver?
>>
>> Thanks
>> NB
>>
>> On Wed, Apr 8, 2015 at 1:29 PM, Tathagata Das <t...@databricks.com>
>> wrote:
>>
>>> It does take effect on the executors, not on the driver. That is okay
>>> because the executors have all the data and are therefore the ones
>>> that usually have GC issues, not the driver. If you want to be doubly
>>> sure, print the JVM flags (e.g.
>>> http://stackoverflow.com/questions/10486375/print-all-jvm-flags).
>>>
>>> However, the GC I was referring to, the one that initiates the RDD and
>>> shuffle cleanup, was the GC on the driver. Thought I would clarify.
>>>
>>> TD
>>>
>>> On Wed, Apr 8, 2015 at 1:23 PM, N B <nb.nos...@gmail.com> wrote:
>>>
>>>> Hi TD,
>>>>
>>>> Thanks for the response. Since you mentioned GC, this got me thinking.
>>>>
>>>> Given that we are running in local mode (all in a single JVM) for now,
>>>> does the option "spark.executor.extraJavaOptions", set to
>>>> "-XX:+UseConcMarkSweepGC" inside the SparkConf object, take effect at
>>>> all before we use it to create the StreamingContext? I ask because
>>>> that is what we are doing right now. If not, perhaps we have not been
>>>> running with Concurrent Mark Sweep at all. Is that recommended instead
>>>> of forcing GC periodically?
>>>>
>>>> Thanks
>>>> NB
>>>>
>>>> On Wed, Apr 8, 2015 at 10:20 AM, Tathagata Das <t...@databricks.com>
>>>> wrote:
>>>>
>>>>> There are a couple of options. Increase the timeout (see the Spark
>>>>> configuration). Also see past threads on this mailing list.
>>>>>
>>>>> Another option you may try (I have a gut feeling that it may work,
>>>>> but I am not sure) is calling GC on the driver periodically. The
>>>>> cleanup is tied to the GCing of RDD objects, and regular GC may help
>>>>> keep things clean more rigorously, rather than in unpredictable
>>>>> bursts of GC activity.
>>>>>
>>>>> Let us know how it works out.
>>>>>
>>>>> TD
>>>>>
>>>>> On Tue, Apr 7, 2015 at 6:00 PM, Nikunj Bansal <nb.nos...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I have a standalone, local Spark Streaming process where we are
>>>>>> reading input using FlumeUtils. Our longest window size is 6 hours.
>>>>>> After about a day and a half of running without any issues, we start
>>>>>> seeing timeout errors while cleaning up input blocks. This seems to
>>>>>> cause reading from Flume to cease.
>>>>>>
>>>>>> ERROR sparkDriver-akka.actor.default-dispatcher-78
>>>>>> BlockManagerSlaveActor.logError - Error in removing block
>>>>>> input-0-1428182594000
>>>>>> org.apache.spark.SparkException: Error sending message [message =
>>>>>> UpdateBlockInfo(BlockManagerId(<driver>, localhost, 55067),
>>>>>> input-0-1428182594000, StorageLevel(false, false, false, false, 1),
>>>>>> 0,0,0)]
>>>>>> at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:201)
>>>>>> at org.apache.spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:221)
>>>>>> at org.apache.spark.storage.BlockManagerMaster.updateBlockInfo(BlockManagerMaster.scala:62)
>>>>>> at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$tryToReportBlockStatus(BlockManager.scala:385)
>>>>>> at org.apache.spark.storage.BlockManager.reportBlockStatus(BlockManager.scala:361)
>>>>>> at org.apache.spark.storage.BlockManager.removeBlock(BlockManager.scala:1105)
>>>>>> at org.apache.spark.storage.BlockManagerSlaveActor$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$1.apply$mcZ$sp(BlockManagerSlaveActor.scala:44)
>>>>>> at org.apache.spark.storage.BlockManagerSlaveActor$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$1.apply(BlockManagerSlaveActor.scala:43)
>>>>>> at org.apache.spark.storage.BlockManagerSlaveActor$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$1.apply(BlockManagerSlaveActor.scala:43)
>>>>>> at org.apache.spark.storage.BlockManagerSlaveActor$$anonfun$1.apply(BlockManagerSlaveActor.scala:76)
>>>>>> at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>>>>>> at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>>>>>> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>>>>>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
>>>>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>> Caused by: java.util.concurrent.TimeoutException: Futures timed out
>>>>>> after [30 seconds]
>>>>>> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>>>>> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>>>>> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>>>>> at akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread$$anon$3.block(ThreadPoolBuilder.scala:169)
>>>>>> at scala.concurrent.forkjoin.ForkJoinPool.managedBlock(ForkJoinPool.java:3640)
>>>>>> at akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.blockOn(ThreadPoolBuilder.scala:167)
>>>>>> at scala.concurrent.Await$.result(package.scala:107)
>>>>>> at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:187)
>>>>>> ... 17 more
>>>>>>
>>>>>> There was a similar query posted here:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Block-removal-causes-Akka-timeouts-td15632.html
>>>>>> but did not find any resolution to that issue.
>>>>>>
>>>>>> Thanks in advance,
>>>>>> NB
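P.S. For the archives: the "Futures timed out after [30 seconds]" above matches the default of spark.akka.askTimeout in Spark 1.x, so as a stopgap we may also raise that while testing the GC change. A rough sketch of how we would set it when building the context (the app name and 120-second value are just our own choices):

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    SparkConf conf = new SparkConf()
        .setMaster("local[*]")
        .setAppName("FlumeStreamingApp")
        // Raise the Akka ask timeout from the 30s default (Spark 1.x)
        // so block-cleanup messages get more time to complete.
        .set("spark.akka.askTimeout", "120");
    JavaStreamingContext jssc =
        new JavaStreamingContext(conf, Durations.seconds(30));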