Yes, but you don't necessarily need to use dynamic allocation (just enable
the external shuffle service).
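
For what it's worth, here is a rough sketch of the application side of that
setup. It only assembles a spark-submit command line with
spark.shuffle.service.enabled switched on and dynamic allocation left off.
The helper name build_spark_submit and the file my_job.py are made up for
illustration. This assumes the NodeManagers already run the shuffle service
as a YARN auxiliary service (spark_shuffle, backed by
org.apache.spark.network.yarn.YarnShuffleService in yarn-site.xml); without
that, the flag alone will not help.

```python
import shlex

# Application-side settings for the setup described above.
# spark.shuffle.service.enabled is a real Spark property; dynamic
# allocation is deliberately left disabled (which is also its default).
SHUFFLE_SERVICE_CONF = {
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.enabled": "false",
}


def build_spark_submit(app, conf, master="yarn-client"):
    """Return a spark-submit argument list with one --conf flag per setting.

    Hypothetical helper, not part of Spark.
    """
    args = ["spark-submit", "--master", master]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # my_job.py is a placeholder for the real application file.
    command = build_spark_submit("my_job.py", SHUFFLE_SERVICE_CONF)
    print(" ".join(shlex.quote(a) for a in command))
```

The same two properties could just as well go into spark-defaults.conf
instead of being passed on the command line.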

On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel <[email protected]> wrote:

> Do you mean this setup?
>
> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>
>
>
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin <[email protected]>
> wrote:
>
>> Without the exact error from the driver that caused the job to restart,
>> it's hard to tell. But a simple way to improve things is to install the
>> Spark shuffle service on the YARN nodes, so that even if an executor
>> crashes, its shuffle output is still available to other executors.
>>
>> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a Spark job running in yarn-client mode. At some point during the
>>> join stage, an executor (container) runs out of memory and YARN kills it.
>>> Because of this, the entire job restarts, and it does so on every failure.
>>>
>>> What is the best way to checkpoint? I see there is a checkpoint API;
>>> another option might be to persist before the join stage. Would that
>>> prevent a retry of the entire job? How about retrying only the task that
>>> was assigned to the faulty executor?
>>>
>>> Thanks
>>>
>>>
>>>
>>> [image: What's New with Xactly] <http://www.xactlycorp.com/email-click/>
>>>
>>> <https://www.nyse.com/quote/XNYS:XTLY>  [image: LinkedIn]
>>> <https://www.linkedin.com/company/xactly-corporation>  [image: Twitter]
>>> <https://twitter.com/Xactly>  [image: Facebook]
>>> <https://www.facebook.com/XactlyCorp>  [image: YouTube]
>>> <http://www.youtube.com/xactlycorporation>
>>
>>
>>
>>
>> --
>> Marcelo
>>
>



-- 
Marcelo
