GitHub user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19433
> We'll actually only have to run an O(n log n) sort on continuous feature
> values once (i.e. in the FeatureVector constructor), since once the continuous
> features are sorted we can update them as we would for categorical features
> when splitting nodes (in O(n) time) and they'll remain sorted.
Nice! So only one global sort pass is needed, and after that each split needs
only an O(n) copy.
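
To make the idea concrete, here is a minimal sketch of the "sort once, then
stable-partition per split" pattern. It is not the code from the PR: the
function names and the `goes_left` mask are hypothetical, and plain Python
lists stand in for whatever FeatureVector stores internally.

```python
from typing import List, Sequence, Tuple


def sort_indices_once(values: Sequence[float]) -> List[int]:
    """One-time O(n log n) step: row indices ordered by feature value."""
    return sorted(range(len(values)), key=lambda i: values[i])


def split_sorted_indices(
    sorted_idx: Sequence[int], goes_left: Sequence[bool]
) -> Tuple[List[int], List[int]]:
    """O(n) stable partition of already-sorted indices.

    goes_left[i] is a hypothetical per-row flag saying which child row i
    falls into. Because rows are visited in sorted order and appended in
    that order, both children stay sorted without re-sorting.
    """
    left: List[int] = []
    right: List[int] = []
    for i in sorted_idx:
        (left if goes_left[i] else right).append(i)
    return left, right


values = [3.5, 1.0, 4.2, 2.7, 0.3]
sorted_idx = sort_indices_once(values)

# Pretend some split (possibly on a different feature) routed rows like this.
goes_left = [True, False, True, False, True]
left, right = split_sorted_indices(sorted_idx, goes_left)

left_values = [values[i] for i in left]
right_values = [values[i] for i in right]
```

Each child's index list is still ordered by this feature's values, so the
next split on either child can scan it directly, again in O(n).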