GitHub user srowen commented on the issue:
https://github.com/apache/spark/pull/19232
Yeah, I wonder if this slows things down for smaller data sets because of
the extra levels and checks, but then again, when the aggregation is small,
everything is similarly fast. The default depth is shallow, and there are checks to
evaluate how well it works. I tend to support this if there is empirical evidence
that it speeds things up at scale.