Can anyone provide some insight into the flow of SparkListeners,
specifically onApplicationEnd? I'm having issues with the SparkContext
being stopped before my final processing can complete.

Thanks!
Sumona

On Mon, Feb 15, 2016 at 8:59 AM Sumona Routh <sumos...@gmail.com> wrote:

> Hi there,
> I am trying to implement a listener that performs as a post-processor
> which stores data about what was processed or erred. With this, I use an
> RDD that may or may not change during the course of the application.
>
> My thought was to use onApplicationEnd and then saveToCassandra call to
> persist this.
>
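[Editor's note: a minimal sketch of the listener approach described above, for context. The listener class name, the `auditRecords` callback, and the println stand-in for the saveToCassandra call are all hypothetical; the `SparkListener`/`addSparkListener` API shown is the one available in Spark 1.2.]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

// Hypothetical post-processing listener. `auditRecords` stands in for
// whatever state the job accumulates about processed/erred records.
class PostProcessListener(auditRecords: () => Seq[String]) extends SparkListener {
  override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
    // CAUTION: by the time this fires, the SparkContext is stopping (or
    // already stopped), so launching a new Spark job from here -- e.g.
    // rdd.saveToCassandra(...) -- can fail with
    // "Cannot call methods on a stopped SparkContext".
    auditRecords().foreach(println) // stand-in for the real persistence step
  }
}

object ListenerDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("listener-demo"))
    sc.addSparkListener(new PostProcessListener(() => Seq("processed: 42")))
    // ... main processing here ...
    sc.stop() // this is what triggers onApplicationEnd
  }
}
```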
> From what I've gathered in my experiments, onApplicationEnd doesn't get
> called until sparkContext.stop() is called. If I don't call stop in my
> code, the listener won't be called. This works fine in my local tests:
> stop gets called, the listener runs and persists to the db, and
> everything succeeds. However, when I run this on our server, the code in
> onApplicationEnd throws the following exception:
>
> Task serialization failed: java.lang.IllegalStateException: Cannot call
> methods on a stopped SparkContext
>
> What's the best way to resolve this? I could create a new
> SparkContext in the listener (I think I'd have to allow multiple
> contexts, in case I create one before the other is stopped). It
> seems odd but might be doable. Alternatively, what if I simply ran
> the steps procedurally in my job: doJob, then doPostProcessing;
> would that guarantee post-processing occurs after the job?
>
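[Editor's note: the procedural alternative mentioned above can be sketched as follows. `doJob` and `doPostProcessing` are hypothetical stand-ins for the poster's processing and persistence steps; the key point is that RDD actions block the driver thread, so the calls run strictly in order, and both run before sc.stop().]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the procedural approach: because RDD actions block the
// driver until they complete, doPostProcessing is guaranteed to run
// after doJob finishes, while the SparkContext is still alive.
object JobWithPostProcessing {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("job-demo"))
    try {
      doJob(sc)            // main processing; actions block until done
      doPostProcessing(sc) // persist audit data while sc is still usable
    } finally {
      sc.stop()            // stop only after post-processing completes
    }
  }

  // Hypothetical stand-ins for the poster's job and post-processor.
  def doJob(sc: SparkContext): Unit = { /* transformations + actions */ }
  def doPostProcessing(sc: SparkContext): Unit = { /* e.g. rdd.saveToCassandra(...) */ }
}
```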
> We are currently running Spark 1.2 in standalone mode.
>
> Please let me know if you require more details. Thanks for the assistance!
> Sumona
>
>
