I learned this from my co-worker, but it is relevant here.
Spark uses lazy evaluation by default, which means none of your transformations
actually execute until you run an action such as saveAsTextFile, so the
resulting stack trace does not tell you much about where the problem is
occurring. To debug this better, you might insert an action (for example,
count()) after each transformation to force evaluation and see which step fails.
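A minimal sketch of that idea, assuming a simple text-processing pipeline (the input path, field separator, and RDD names are hypothetical, not from the original thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DebugLazyEval {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("debug-lazy-eval").setMaster("local[*]"))

    val raw = sc.textFile("input.txt")        // hypothetical input path
    val parsed = raw.map(_.split("\t"))       // transformation: not yet executed

    // Force evaluation here; if the map step is broken,
    // the failure surfaces at this line instead of at saveAsTextFile.
    println(s"parsed: ${parsed.count()} records")

    val filtered = parsed.filter(_.length > 1)
    println(s"filtered: ${filtered.count()} records")  // force the filter step

    filtered.map(_.mkString(",")).saveAsTextFile("out")
    sc.stop()
  }
}
```

The extra count() calls add passes over the data, so they are for debugging only and should be removed (or replaced with cache() plus a single count()) once the failing stage is found.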
These stack traces come from the stuck node? It looks like it is waiting on
data in BlockFetcherIterator, i.e., waiting for data from another node. But you
say all the other nodes were done? Very curious.
Maybe you could try turning on debug logging and see what
happens in BlockFetcherIterator (
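A sketch of what that logging change could look like, assuming the standard log4j.properties configuration that older Spark releases read from conf/ (the exact logger package may differ between Spark versions, so check where BlockFetcherIterator lives in yours):

```
# conf/log4j.properties -- raise verbosity only for the shuffle fetch path,
# keeping everything else at INFO so the logs stay readable
log4j.rootCategory=INFO, console
log4j.logger.org.apache.spark.storage.BlockFetcherIterator=DEBUG
```

With this in place, the executor logs should show each remote block fetch request and response, which can reveal whether the stuck node is waiting on a fetch that was never answered.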