Re: Failure handling

2017-01-25 Thread Erwan ALLAIN
I agree

We are try-catching streamingContext.awaitTermination(), and when an
exception occurs we stop the streaming context and call System.exit(50)
(equal to SparkUnhandledCode).

Sounds ok.
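
Roughly, the pattern looks like this (a minimal sketch, assuming ssc is the
configured StreamingContext and 50 is the exit code mentioned above):

ssc.start()
try {
  // Errors reported by failed streaming jobs are rethrown here in the driver.
  ssc.awaitTermination()
} catch {
  case e: Throwable =>
    // Stop streaming and the underlying SparkContext, then fail the driver.
    ssc.stop(stopSparkContext = true, stopGracefully = false)
    System.exit(50) // the "SparkUnhandledCode" value mentioned above
}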

On Tuesday, January 24, 2017, Cody Koeninger  wrote:

> Can you identify the error case and call System.exit ?  It'll get
> retried on another executor, but as long as that one fails the same
> way...
>
> If you can identify the error case at the time you're doing database
> interaction and just prevent data being written then, that's what I
> typically do.
>
> On Tue, Jan 24, 2017 at 7:50 AM, Erwan ALLAIN  wrote:
> > Hello guys,
> >
> > I have a question regarding how Spark handles failures.
> >
> > I’m using kafka direct stream
> > Spark 2.0.2
> > Kafka 0.10.0.1
> >
> > Here is a snippet of code
> >
> > val stream = createDirectStream(….)
> >
> > stream
> >   .map(…)
> >   .foreachRDD(doSomething)
> >
> > stream
> >   .map(…)
> >   .foreachRDD(doSomethingElse)
> >
> > The execution is FIFO, so the first action ends after the second starts;
> > so far so good.
> > However, I would like the streaming context to be stopped immediately when
> > an error (fatal or not) occurs during the execution of the first action.
> > It's as if the driver is not notified of the exception and launches the
> > second action anyway.
> >
> > In our case, the second action performs checkpointing in an external
> > database, and we do not want to checkpoint if an error occurred before.
> > We do not want to rely on Spark checkpointing as it causes issues when
> > upgrading the application.
> >
> > Let me know if it’s not clear !
> >
> > Thanks !
> >
> > Erwan
>


Re: Failure handling

2017-01-24 Thread Cody Koeninger
Can you identify the error case and call System.exit ?  It'll get
retried on another executor, but as long as that one fails the same
way...

If you can identify the error case at the time you're doing database
interaction and just prevent data being written then, that's what I
typically do.
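
A rough sketch of that second suggestion, assuming the write happens inside a
foreachRDD on the stream from the original post (isKnownErrorCase and
writeToDatabase are hypothetical placeholders for the application's own check
and sink):

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    val batch = records.toSeq
    if (batch.exists(isKnownErrorCase)) {
      // Error case identified at the point of database interaction: prevent
      // the write. Throwing also fails the task and, once retries are
      // exhausted, the whole batch.
      throw new IllegalStateException("known error case, aborting write")
    } else {
      writeToDatabase(batch) // hypothetical sink helper
    }
  }
}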

On Tue, Jan 24, 2017 at 7:50 AM, Erwan ALLAIN  wrote:
> Hello guys,
>
> I have a question regarding how Spark handles failures.
>
> I’m using kafka direct stream
> Spark 2.0.2
> Kafka 0.10.0.1
>
> Here is a snippet of code
>
> val stream = createDirectStream(….)
>
> stream
>   .map(…)
>   .foreachRDD(doSomething)
>
> stream
>   .map(…)
>   .foreachRDD(doSomethingElse)
>
> The execution is FIFO, so the first action ends after the second starts;
> so far so good.
> However, I would like the streaming context to be stopped immediately when
> an error (fatal or not) occurs during the execution of the first action.
> It's as if the driver is not notified of the exception and launches the
> second action anyway.
>
> In our case, the second action performs checkpointing in an external
> database, and we do not want to checkpoint if an error occurred before.
> We do not want to rely on Spark checkpointing as it causes issues when
> upgrading the application.
>
> Let me know if it’s not clear !
>
> Thanks !
>
> Erwan




Failure handling

2017-01-24 Thread Erwan ALLAIN
Hello guys,

I have a question regarding how Spark handles failures.

I’m using kafka direct stream
Spark 2.0.2
Kafka 0.10.0.1

Here is a snippet of code

val stream = createDirectStream(….)

stream
  .map(…)
  .foreachRDD(doSomething)

stream
  .map(…)
  .foreachRDD(doSomethingElse)

The execution is FIFO, so the first action ends after the second starts;
so far so good.
However, I would like the streaming context to be stopped immediately when
an error (fatal or not) occurs during the execution of the first action.
It's as if the driver is not notified of the exception and launches the
second action anyway.

In our case, the second action performs checkpointing in an external
database, and we do not want to checkpoint if an error occurred before.
We do not want to rely on Spark checkpointing as it causes issues when
upgrading the application.
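
A rough sketch of what that kind of external checkpointing can look like with
the 0.10 direct stream, assuming what gets persisted is each batch's offset
ranges (saveOffsetsToDatabase is a hypothetical helper):

import org.apache.spark.streaming.kafka010.{HasOffsetRanges, OffsetRange}

stream.foreachRDD { rdd =>
  // The cast only works on the RDD coming straight out of createDirectStream,
  // i.e. before any map/transform.
  val offsetRanges: Array[OffsetRange] =
    rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  saveOffsetsToDatabase(offsetRanges) // hypothetical: persist to the external DB
}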

Let me know if it’s not clear!

Thanks!

Erwan


Spark streaming micro batch failure handling

2016-06-08 Thread aviemzur
Hi,

A question about Spark Streaming's handling of a failed micro batch.

After a certain number of task failures, there are no more retries and the
entire batch fails.
What seems to happen next is that this batch is ignored and the next micro
batch begins, which means not all the data has been processed.

Is there a way to configure the Spark Streaming application to not continue
to the next batch, but rather stop the streaming context upon a micro batch
failure (after all task retries have been exhausted)?
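
For reference, the retry count referred to here is spark.task.maxFailures (4
by default), and, as discussed in the newer thread above, the resulting batch
failure can be caught in the driver around awaitTermination and used to stop
the context. A minimal sketch, with the app name and batch interval as
placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// spark.task.maxFailures controls how many times a task is retried before
// its job, and hence the whole micro batch, is failed.
val conf = new SparkConf()
  .setAppName("stop-on-batch-failure")
  .set("spark.task.maxFailures", "4") // default is 4

val ssc = new StreamingContext(conf, Seconds(10))
// ... build the DStream and its output operations here ...
ssc.start()
try {
  ssc.awaitTermination() // a failed batch surfaces here as an exception
} catch {
  case e: Throwable =>
    ssc.stop(stopSparkContext = true, stopGracefully = false)
    throw e // or System.exit with a non-zero code, as in the thread above
}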



