Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/20806
@viirya OK, but there is already a class in ML that uses
`TypedImperativeAggregate`; see `Summarizer`.
Also, did you benchmark this PR against `df.rdd.treeAggregate`?
They seem almost the same. Is there a difference that would yield a
notable performance improvement?