Hello,

Just a quick update, as I have not made much progress yet.

On 28 Dec 2017, at 21:09, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:

> can you try to then use the EMR version 5.10 instead or EMR version 5.11
> instead?

Same issue with EMR 5.11.0. Task 0 in one stage never finishes.

> can you please try selecting a subnet which is in a different availability
> zone?

I did not try this yet. But why should that make a difference?

> if possible just try to increase the number of task instances and see the
> difference?

I tried with 512 partitions -- no difference.

> also in case you are using caching,

No caching used.

> Also can you please report the number of containers that your job is creating
> by looking at the metrics in the EMR console?

8 containers if I trust the directories in j-xxx/containers/application_xxx/.

> Also if you see the spark UI then you can easily see which particular step is
> taking the longest period of time - you just have to drill in a bit in order
> to see that. Generally in case shuffling is an issue then it definitely
> appears in the SPARK UI as I drill into the steps and see which particular
> one is taking the longest.

I always have issues with the Spark UI on EC2 -- it never seems to be up to date.

JM