Yes, but you don't necessarily need to use dynamic allocation (just enable the external shuffle service).
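
For reference, a minimal sketch of what enabling the shuffle service on YARN usually involves. This is a config fragment, not a tested setup. It assumes the Spark YARN shuffle jar that ships with your Spark build is already on each NodeManager's classpath and that the NodeManagers are restarted afterwards. The mapreduce_shuffle entry is only an assumption about what your aux-services list already contains.

```
# spark-defaults.conf (or pass it with --conf on spark-submit)
spark.shuffle.service.enabled   true

# yarn-site.xml on every NodeManager, written here as name = value:
#   yarn.nodemanager.aux-services = mapreduce_shuffle,spark_shuffle
#   yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
```

spark.dynamicAllocation.enabled can stay at its default (false). The shuffle service on its own is what keeps an executor's shuffle output readable after that executor is killed.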

On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel <[email protected]> wrote:
> Do you mean this setup?
>
> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin <[email protected]> wrote:
>
>> Without the exact error from the driver that caused the job to restart,
>> it's hard to tell. But a simple way to improve things is to install the
>> Spark shuffle service on the YARN nodes, so that even if an executor
>> crashes, its shuffle output is still available to other executors.
>>
>> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have a spark job running on yarn-client mode. At some point during
>>> Join stage, executor(container) runs out of memory and yarn kills it. Due
>>> to this Entire job restarts! and it keeps doing it on every failure?
>>>
>>> What is the best way to checkpoint? I see there's checkpoint api and
>>> other option might be to persist before Join stage. Would that prevent
>>> retry of entire job? How about just retrying only the task that was
>>> distributed to that faulty executor?
>>>
>>> Thanks
>>>
>>> [image: What's New with Xactly] <http://www.xactlycorp.com/email-click/>
>>>
>>> <https://www.nyse.com/quote/XNYS:XTLY> [image: LinkedIn]
>>> <https://www.linkedin.com/company/xactly-corporation> [image: Twitter]
>>> <https://twitter.com/Xactly> [image: Facebook]
>>> <https://www.facebook.com/XactlyCorp> [image: YouTube]
>>> <http://www.youtube.com/xactlycorporation>
>>
>> --
>> Marcelo

--
Marcelo
