I'm using Spark 2.0.0 to train a model with more than 10 million parameters on
about 500 GB of data. treeAggregate is used to aggregate the gradient. When I
set depth = 2 or 3 it works, and depth = 3 is faster.
So I set depth = 4 hoping for even better performance, but now some executors
hit OOM during the shuffle phase. Why does this happen? With a deeper tree,
each executor should aggregate fewer records and therefore use less memory,
so I don't understand why the OOM occurs. Can someone help?
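To show what I mean about the levels, here is a small plain-Python sketch of the partition-scaling schedule that I believe RDD.treeAggregate in Spark 2.x follows (each shuffle level reduces the partition count by scale = max(ceil(numPartitions ** (1/depth)), 2), until the remainder fits on the driver). The partition count of 4000 is just an illustrative assumption, not my actual job:

```python
import math

def tree_aggregate_schedule(num_partitions: int, depth: int) -> list[int]:
    # Mirrors the partition-scaling loop of RDD.treeAggregate (Spark 2.x),
    # as I understand it from the source: each level shuffles the partial
    # aggregates into num_partitions // scale reducers.
    scale = max(int(math.ceil(num_partitions ** (1.0 / depth))), 2)
    levels = [num_partitions]
    while num_partitions > scale + math.ceil(num_partitions / scale):
        num_partitions //= scale
        levels.append(num_partitions)
    return levels

# Hypothetical job with 4000 map partitions:
print(tree_aggregate_schedule(4000, 2))  # [4000, 62]
print(tree_aggregate_schedule(4000, 3))  # [4000, 250, 15]
print(tree_aggregate_schedule(4000, 4))  # [4000, 500, 62, 7]
```

So a larger depth means more shuffle rounds with a smaller fan-in per round, which is why I expected each executor to hold fewer merged gradient vectors at a time, not more.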
