If not, try running a coalesce. Your data may have grown, and the job may now be defaulting to a number of partitions that causes unnecessary overhead.
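
Something along these lines is what I had in mind. It is only a rough sketch: the helper names, the 128 MB target per partition, and the idea that you already have an estimate of the output size in bytes are all my assumptions, not anything taken from your job. The only Spark calls used are DataFrame.coalesce() and df.rdd.getNumPartitions().

```python
import math

# Assumption: ~128 MB per partition is a reasonable target when writing to HDFS.
DEFAULT_PARTITION_BYTES = 128 * 1024 * 1024


def target_partitions(total_bytes, partition_bytes=DEFAULT_PARTITION_BYTES):
    """Return how many partitions are needed so each holds about partition_bytes."""
    return max(1, math.ceil(total_bytes / partition_bytes))


def coalesce_if_needed(df, total_bytes):
    """Reduce df to the target partition count; never increases the count."""
    current = df.rdd.getNumPartitions()
    wanted = target_partitions(total_bytes)
    # coalesce() only merges existing partitions, so skip it when there is
    # nothing to shrink.
    return df.coalesce(wanted) if wanted < current else df
```

You would call coalesce_if_needed(df, estimated_bytes) right before the write. If you need more partitions rather than fewer, coalesce will not help and repartition() is the call to look at, at the cost of a shuffle.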

On Thu, Nov 29, 2018 at 3:02 AM Conrad Lee <con...@parsely.com> wrote:

> Thanks, I'll try using 5.17.0.
>
> For anyone trying to debug this problem in the future: In other jobs that
> hang in the same manner, the thread dump didn't have any blocked threads,
> so that might be a red herring.
>
> On Wed, Nov 28, 2018 at 4:34 PM Christopher Petrino <
> christopher.petr...@gmail.com> wrote:
>
>> I ran into problems using 5.19 so I referred to 5.17 and it resolved my
>> issues.
>>
>> On Wed, Nov 28, 2018 at 2:48 AM Conrad Lee <con...@parsely.com> wrote:
>>
>>> Hello Vadim,
>>>
>>> Interesting. I've only been running this job at scale for a couple
>>> weeks so I can't say whether this is related to recent EMR changes.
>>>
>>> Much of the EMR-specific code for spark has to do with writing files to
>>> s3. In this case I'm writing files to the cluster's HDFS though so my
>>> sense is that this is a spark issue, not an EMR (but I'm not sure).
>>>
>>> Conrad
>>>
>>> On Tue, Nov 27, 2018 at 5:21 PM Vadim Semenov <va...@datadoghq.com>
>>> wrote:
>>>
>>>> Hey Conrad,
>>>>
>>>> has it started happening recently?
>>>>
>>>> We recently started having some sporadic problems with drivers on EMR
>>>> when it gets stuck, up until two weeks ago everything was fine.
>>>> We're trying to figure out with the EMR team where the issue is coming
>>>> from.
>>>> On Tue, Nov 27, 2018 at 6:29 AM Conrad Lee <con...@parsely.com> wrote:
>>>> >
>>>> > Dear spark community,
>>>> >
>>>> > I'm running spark 2.3.2 on EMR 5.19.0. I've got a job that's hanging
>>>> in the final stage--the job usually works, but I see this hanging behavior
>>>> in about one out of 50 runs.
>>>> >
>>>> > The second-to-last stage sorts the dataframe, and the final stage
>>>> writes the dataframe to HDFS.
>>>> >
>>>> > Here you can see the executor logs, which indicate that it has
>>>> finished processing the task.
>>>> >
>>>> > Here you can see the thread dump from the executor that's hanging.
>>>> Here's the text of the blocked thread.
>>>> >
>>>> > I tried to work around this problem by enabling speculation, but
>>>> speculative execution never takes place. I don't know why.
>>>> >
>>>> > Can anyone here help me?
>>>> >
>>>> > Thanks,
>>>> > Conrad
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from my iPhone
>>>>
>>>