Hi, at the reduceByKey stage, it takes a few minutes before the tasks start working.
I have -Dspark.default.parallelism=127 (n-1 cores). CPU, network, and I/O are idle across all nodes while this is happening, and there is nothing in particular in the master log file.

From the spark-shell:

14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:124 as TID 538 on executor 2: XXX (PROCESS_LOCAL)
14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:124 as 38765155 bytes in 193 ms
14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:125 as TID 539 on executor 1: XXX (PROCESS_LOCAL)
14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:125 as 38765155 bytes in 96 ms
14/03/23 18:13:50 INFO TaskSetManager: Starting task 3.0:126 as TID 540 on executor 0: XXX (PROCESS_LOCAL)
14/03/23 18:13:50 INFO TaskSetManager: Serialized task 3.0:126 as 38765155 bytes in 100 ms

But it stops there for a significant time before anything moves. In the stage detail of the UI, I can see that there are 127 tasks running, but the duration of each is at least a few minutes.

I'm working off local storage (not HDFS) and the kmeans data is about 6.5 GB (50M rows).

Is this normal behaviour?

Thanks!