Hi,

Please try to access the Spark UI the way AWS EMR recommends; it should be available from the resource manager. I have never had any problem working with it, and it has always been my primary and sole source of debugging.
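As a rough illustration of the kind of check the Spark UI supports, here is a minimal sketch that ranks stages by run time and flags the ones that still have running tasks. It assumes the per-stage JSON returned by Spark's monitoring REST API (the `stages` endpoint under `/api/v1/applications/<app-id>/`), with the field names `stageId`, `status`, `executorRunTime` and `numActiveTasks` as I understand them; the helper names and the sample data are made up for the example and are not from the job discussed here.

```python
# Sketch only: works on stage records shaped like the JSON served by
# Spark's monitoring REST API at /api/v1/applications/<app-id>/stages.
# SAMPLE_STAGES is invented illustration data, not output from a real job.

SAMPLE_STAGES = [
    {"stageId": 0, "name": "load", "status": "COMPLETE",
     "executorRunTime": 12000, "numActiveTasks": 0},
    {"stageId": 1, "name": "shuffle", "status": "ACTIVE",
     "executorRunTime": 950000, "numActiveTasks": 1},
    {"stageId": 2, "name": "write", "status": "PENDING",
     "executorRunTime": 0, "numActiveTasks": 0},
]


def slowest_stages(stages, top=3):
    """Return up to `top` stages, largest executorRunTime (ms) first."""
    return sorted(
        stages, key=lambda s: s.get("executorRunTime", 0), reverse=True
    )[:top]


def stuck_stages(stages):
    """Return stages that are ACTIVE and still have at least one running task."""
    return [
        s for s in stages
        if s.get("status") == "ACTIVE" and s.get("numActiveTasks", 0) > 0
    ]


if __name__ == "__main__":
    for stage in slowest_stages(SAMPLE_STAGES, top=2):
        print(stage["stageId"], stage["name"], stage["executorRunTime"])
    for stage in stuck_stages(SAMPLE_STAGES):
        print("still running:", stage["stageId"], stage["name"])
```

In practice the list would come from the JSON of a live or history-server application rather than from hard-coded records.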
Sadly, I cannot be of much help unless we do a screen-sharing session over Google Chat or Skype.

Also, I always prefer to have the maximizeResourceAllocation setting in EMR set to true. Besides that, there is a metric in the EMR console that graphs the number of containers created by your job.

Regards,
Gourav Sengupta

On Fri, Dec 29, 2017 at 6:23 PM, Jeroen Miller <bluedasya...@gmail.com> wrote:

> Hello,
>
> Just a quick update as I did not make much progress yet.
>
> On 28 Dec 2017, at 21:09, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
> > can you try to then use the EMR version 5.10 instead or EMR version 5.11
> > instead?
>
> Same issue with EMR 5.11.0. Task 0 in one stage never finishes.
>
> > can you please try selecting a subnet which is in a different
> > availability zone?
>
> I did not try this yet. But why should that make a difference?
>
> > if possible just try to increase the number of task instances and see
> > the difference?
>
> I tried with 512 partitions -- no difference.
>
> > also in case you are using caching,
>
> No caching used.
>
> > Also can you please report the number of containers that your job is
> > creating by looking at the metrics in the EMR console?
>
> 8 containers if I trust the directories in j-xxx/containers/application_xxx/.
>
> > Also if you see the Spark UI then you can easily see which particular
> > step is taking the longest period of time - you just have to drill in a bit
> > in order to see that. Generally in case shuffling is an issue then it
> > definitely appears in the Spark UI as I drill into the steps and see which
> > particular one is taking the longest.
>
> I always have issues with the Spark UI on EC2 -- it never seems to be up
> to date.
>
> JM
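
For reference, the maximizeResourceAllocation setting mentioned in this message is normally supplied as an EMR configuration object under the `spark` classification. The sketch below only builds that object and prints it as JSON; the helper name is made up for the example, and how the result is passed to the cluster (console, CLI or SDK) is left out.

```python
import json


def spark_max_allocation_config(enabled=True):
    """Build an EMR configuration list that sets maximizeResourceAllocation.

    Hypothetical helper for illustration; EMR expects property values as
    strings, hence "true"/"false" rather than Python booleans.
    """
    return [
        {
            "Classification": "spark",
            "Properties": {
                "maximizeResourceAllocation": "true" if enabled else "false"
            },
        }
    ]


if __name__ == "__main__":
    # Print the JSON so it can be pasted wherever the cluster's
    # software configuration is entered.
    print(json.dumps(spark_max_allocation_config(), indent=2))
```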