RE: Handling worker batch processing during driver shutdown

2015-03-13 Thread Tathagata Das
> The first shutdown log message is always displayed, but the second message never appears. I've tried multiple permutations of the stop function calls and even used try/catch around it. I'm running in yarn-cluster mode using Spark 1.2 on CDH 5.3. I stop the application with `yarn application -kill`.
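A minimal sketch of the shutdown-hook pattern being described. All names, the logger calls, and the log messages are hypothetical stand-ins, not Jose's actual code; the point is only to illustrate the symptom, where the first message appears in the logs but the second never does:

```scala
import org.apache.spark.streaming.StreamingContext

// Hypothetical reconstruction of the shutdown hook described above:
// the first log line always shows up, the second never does, which
// suggests the JVM is torn down before ssc.stop() returns.
def installShutdownHook(ssc: StreamingContext): Unit = {
  sys.addShutdownHook {
    println("Shutdown hook: stopping streaming context")  // always printed
    ssc.stop(stopSparkContext = true, stopGracefully = true)
    println("Shutdown hook: streaming context stopped")   // never printed
  }
}
```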

RE: Handling worker batch processing during driver shutdown

2015-03-13 Thread Jose Fernandez
…Spark 1.2 on CDH 5.3. I stop the application with `yarn application -kill`.

Re: Handling worker batch processing during driver shutdown

2015-03-12 Thread Tathagata Das
> I'm running in yarn-cluster mode using Spark 1.2 on CDH 5.3. I stop the application with `yarn application -kill`.

RE: Handling worker batch processing during driver shutdown

2015-03-12 Thread Jose Fernandez
Can you access the batcher directly? Like, is there a handle to get access to the batchers on the executors by running a task on that executor? If so, after the StreamingContext has been stopped (not the SparkContext), then you can use `sc.makeRDD()` to run a dummy task…

Re: Handling worker batch processing during driver shutdown

2015-03-12 Thread Tathagata Das
Can you access the batcher directly? Like, is there a handle to get access to the batchers on the executors by running a task on that executor? If so, after the StreamingContext has been stopped (not the SparkContext), then you can use `sc.makeRDD()` to run a dummy task like this: `sc.makeRDD(…)`
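The preview cuts off the actual snippet, so here is a hedged sketch of the dummy-task trick being suggested. `Batcher` and its `flush()` are hypothetical stand-ins for whatever per-executor singleton buffers the unflushed entries; the idea is to schedule at least one task on every executor after the streaming context (but not the SparkContext) has stopped:

```scala
import org.apache.spark.SparkContext

// Hypothetical per-executor singleton that buffers entries; flush()
// is assumed to push any remaining buffered entries to the index.
object Batcher {
  def flush(): Unit = { /* flush buffered entries */ }
}

def flushAllExecutors(sc: SparkContext, numExecutors: Int): Unit = {
  // Create many tiny partitions so that, with high probability, every
  // executor runs at least one task; each task flushes the local batcher.
  sc.makeRDD(1 to numExecutors * 10, numExecutors * 10)
    .foreachPartition { _ => Batcher.flush() }
}
```

One caveat implicit in this approach: Spark gives no hard guarantee that every executor receives a task, so oversubscribing the partition count is only a best-effort way to reach them all.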

Handling worker batch processing during driver shutdown

2015-03-12 Thread Jose Fernandez
Hi folks, I have a shutdown hook in my driver which stops the streaming context cleanly. This is great as workers can finish their current processing unit before shutting down. Unfortunately each worker contains a batch processor which only flushes every X entries. We're indexing to different i…
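For context, a minimal sketch of the kind of per-worker batch processor being described, with hypothetical names (the real code indexes to an external system). It buffers entries and only flushes once X entries accumulate, which is why anything still buffered at shutdown is lost without an explicit final flush:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the per-worker batch processor described above.
class BatchProcessor(flushEvery: Int) {
  private val buffer = ArrayBuffer.empty[String]

  def add(entry: String): Unit = synchronized {
    buffer += entry
    if (buffer.size >= flushEvery) flush()  // only flushes every X entries
  }

  def flush(): Unit = synchronized {
    if (buffer.nonEmpty) {
      // send buffered entries to the index here, then clear the buffer
      buffer.clear()
    }
  }
}
```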