GitHub user sethah commented on the issue:
https://github.com/apache/spark/pull/19232
I'm not aware of situations where `treeAggregate` would be detrimental, since
it has a mechanism for skipping the intermediate stages when they don't make
sense. However, one of the big advantages of `treeAggregate` in the ML
algorithms is that we make the depth configurable. Here, the calls hard-code it
to the default value. I don't know how involved the tests were, but I'm
surprised, for example, that it makes much difference in the one-hot encoder.
I'm not against the change, but it wouldn't hurt to understand the trade-offs a
bit more.
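To make the depth trade-off a bit more concrete, here is a rough sketch in plain Python (no Spark needed) of how the number of intermediate stages could depend on the partition count and the depth. The function name `tree_levels` and the stopping rule are assumptions for illustration, loosely modeled on my understanding of how `treeAggregate` decides whether an extra level is worthwhile; this is not Spark's code.

```python
import math


def tree_levels(num_partitions: int, depth: int = 2) -> list[int]:
    """Partition counts of the intermediate stages in a simplified tree aggregation.

    Hypothetical model: each level shrinks the partition count by `scale`,
    and a level is added only while that leaves fewer partial results to
    merge at the end than stopping right away would.
    """
    if num_partitions < 1 or depth < 1:
        raise ValueError("num_partitions and depth must be at least 1")

    # Fan-in per level, chosen so that `depth` levels roughly cover all partitions.
    scale = max(math.ceil(num_partitions ** (1.0 / depth)), 2)

    levels = []
    n = num_partitions
    # Assumed stopping rule: skip the extra level when it would not reduce the work.
    while n > scale + math.ceil(n / scale):
        n //= scale
        levels.append(n)
    return levels


if __name__ == "__main__":
    for partitions, depth in [(4, 2), (100, 2), (1000, 2), (1000, 4), (1000, 1)]:
        print(partitions, depth, tree_levels(partitions, depth))
```

In this model a small input such as 4 partitions gets no intermediate stage at all, which matches the "avoids the intermediate stages" behavior, while for 1000 partitions the number of stages changes with the depth. That sensitivity is why a hard-coded default depth is worth a second look.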