Re: Dealing with failures

2016-06-08 Thread Mohit Anchlia
On Wed, Jun 8, 2016 at 3:42 AM, Jacek Laskowski  wrote:

> On Wed, Jun 8, 2016 at 2:38 AM, Mohit Anchlia 
> wrote:
> > I am looking to write an ETL job using Spark that reads data from the
> > source, performs transformations, and inserts it into the destination.
>
> Is this going to be a one-time job, or do you want it to run at regular intervals?
>
> > 1. Source becomes slow or unresponsive. How can such a situation be
> > controlled so that it doesn't cause a DDoS on the source?
>
> Why do you think Spark would DDoS the source? I'm reading it as if
> Spark tried to open a new connection after the currently open one
> became slow. I don't think that's how Spark handles connections. What is
> the source in your use case?
>

>> I was primarily concerned about retry storms causing a DDoS on the
source. How does Spark deal with a scenario where it gets a timeout from the
source? Does it retry or does it fail? And if a task fails, does it fail the
job? And is it possible to restart the job and only process the failed
tasks and the remaining pending tasks? My use case is reading from
Cassandra, performing some transformations, and saving the data to a
different Cassandra cluster. I want to make sure that the data is reliably
copied without missing data. At the same time, I also want to make sure that
the process doesn't cause a performance impact on other live production
traffic to these clusters when there are failures, e.g. DDoS or retry storms.
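One way to bound that risk on the Cassandra side is to throttle the copy and
cap retries up front. A minimal sketch, assuming the DataStax
spark-cassandra-connector; the property names follow its reference config and
may differ between connector versions, and the hosts, keyspace/table names,
and rate values are all placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import com.datastax.spark.connector._
    import com.datastax.spark.connector.cql.CassandraConnector

    val conf = new SparkConf()
      .setAppName("cassandra-copy")
      .set("spark.cassandra.connection.host", "source.example.com") // placeholder
      // Fail fast instead of retrying each task 4 times (the default).
      .set("spark.task.maxFailures", "1")
      // Cap per-core write throughput so the destination isn't overwhelmed.
      .set("spark.cassandra.output.throughput_mb_per_sec", "5")
      // Cap connector-level query retries to avoid a retry storm.
      .set("spark.cassandra.query.retry.count", "2")
      // Smaller fetch pages put less pressure on a slow source.
      .set("spark.cassandra.input.fetch.size_in_rows", "500")

    val sc = new SparkContext(conf)
    val rows = sc.cassandraTable("source_ks", "events") // placeholder names

    // Writing to a *different* cluster needs its own connector instance.
    {
      implicit val destination = CassandraConnector(
        sc.getConf.set("spark.cassandra.connection.host", "dest.example.com"))
      rows.saveToCassandra("dest_ks", "events")
    }

With spark.task.maxFailures lowered, the job fails fast rather than piling
retries onto a struggling cluster, and a restart can then re-run the copy.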


> > Also, at the same time, how can it be made resilient so that it picks up
> > from where it left off?
>
> It sounds like checkpointing. It's available in Core and Streaming.
> So, what's your source and how often do you want to query for data?
> You may also benefit from the recent additions to Spark in 2.0 called
> Structured Streaming (aka Streaming Datasets) - see
> https://issues.apache.org/jira/browse/SPARK-8360.
>
>
>> Does checkpointing help with the failure scenario that I described
above? I understand checkpointing as a way to restart processing of data if
tasks fail because of Spark cluster issues. Does it also work in the scenario
that I described?
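For context: within a single run, Spark already re-executes failed tasks
automatically (up to spark.task.maxFailures attempts), so a timeout does not
by itself lose data. Checkpointing in Spark Streaming addresses a different
case, a driver restart, by recovering pending work from a checkpoint
directory. A minimal sketch of that recovery pattern, where the checkpoint
path and batch interval are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Everything the job does must be defined inside this function so it
    // can be rebuilt from the checkpoint on restart.
    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("etl-copy")
      val ssc = new StreamingContext(conf, Seconds(10)) // placeholder interval
      ssc.checkpoint("hdfs:///checkpoints/etl-copy")    // placeholder path
      // ... define the source stream and the transformation here ...
      ssc
    }

    // The first run creates the context; a restart recovers state and
    // pending batches from the checkpoint directory instead.
    val ssc = StreamingContext.getOrCreate(
      "hdfs:///checkpoints/etl-copy", createContext _)
    ssc.start()
    ssc.awaitTermination()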


> > 2. In the same context, when the destination becomes slow or unresponsive.
>
> What is the destination? It appears as if you are doing streaming and
> want to use checkpointing and back-pressure. But you haven't said enough
> about your use case for me to be precise.
>
> Jacek
>


Re: Dealing with failures

2016-06-08 Thread Jacek Laskowski
On Wed, Jun 8, 2016 at 2:38 AM, Mohit Anchlia  wrote:
> I am looking to write an ETL job using Spark that reads data from the
> source, performs transformations, and inserts it into the destination.

Is this going to be a one-time job, or do you want it to run at regular intervals?

> 1. Source becomes slow or unresponsive. How can such a situation be
> controlled so that it doesn't cause a DDoS on the source?

Why do you think Spark would DDoS the source? I'm reading it as if
Spark tried to open a new connection after the currently open one
became slow. I don't think that's how Spark handles connections. What is
the source in your use case?

> Also, at the same time, how can it be made resilient so that it picks up
> from where it left off?

It sounds like checkpointing. It's available in Core and Streaming.
So, what's your source and how often do you want to query for data?
You may also benefit from the recent additions to Spark in 2.0 called
Structured Streaming (aka Streaming Datasets) - see
https://issues.apache.org/jira/browse/SPARK-8360.
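
For reference, a minimal Structured Streaming sketch in the Spark 2.0 API,
where the checkpointLocation option is what lets a restarted query resume
where it left off. The JSON source, Parquet sink, schema, and paths are
placeholders; Cassandra in particular is not a built-in source or sink and
would need connector support:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types._

    val spark = SparkSession.builder.appName("structured-etl").getOrCreate()

    // File sources need an explicit schema; the fields are placeholders.
    val schema = new StructType().add("id", LongType).add("name", StringType)

    val input = spark.readStream
      .format("json")
      .schema(schema)
      .load("/data/incoming")            // placeholder path

    val transformed = input.selectExpr("id", "upper(name) AS name")

    val query = transformed.writeStream
      .format("parquet")
      .option("path", "/data/out")       // placeholder path
      // The checkpoint location is what lets a restarted query
      // resume exactly where it left off.
      .option("checkpointLocation", "/checkpoints/structured-etl")
      .start()

    query.awaitTermination()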

> 2. In the same context, when the destination becomes slow or unresponsive.

What is the destination? It appears as if you are doing streaming and
want to use checkpointing and back-pressure. But you haven't said enough
about your use case for me to be precise.
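
For the slow-destination case, the relevant Spark Streaming knobs are the
back-pressure settings. A minimal sketch using standard Spark properties;
the rate values are placeholders to tune per cluster:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("etl-stream")
      // Adapt the ingestion rate to how fast batches actually complete,
      // so a slow destination slows intake instead of queueing work.
      .set("spark.streaming.backpressure.enabled", "true")
      // Hard ceiling (records/sec per receiver) while back-pressure warms up.
      .set("spark.streaming.receiver.maxRate", "1000")
      // Per-partition equivalent for direct Kafka streams.
      .set("spark.streaming.kafka.maxRatePerPartition", "200")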

Jacek




Dealing with failures

2016-06-07 Thread Mohit Anchlia
I am looking to write an ETL job using Spark that reads data from the
source, performs transformations, and inserts it into the destination. I am
trying to understand how Spark deals with failures, but I can't seem to find
the documentation. I am interested in the following scenarios:
1. Source becomes slow or unresponsive. How can such a situation be
controlled so that it doesn't cause a DDoS on the source? Also, at the same
time, how can it be made resilient so that it picks up from where it left off?
2. In the same context, when the destination becomes slow or unresponsive.