Liang-Chi Hsieh created SPARK-28722:
---------------------------------------
Summary: Change sequential label sorting in StringIndexer fit to parallel
Key: SPARK-28722
URL: https://issues.apache.org/jira/browse/SPARK-28722
Project: Spark
Issue Type: Improvement
Components: ML
Affects Versions: 3.0.0
Reporter: Liang-Chi Hsieh
When there are multiple input columns, the fit method in StringIndexer sorts
the labels of each column sequentially, one column after another. As the number
of input columns increases, the time spent on label sorting increases
dramatically, which makes StringIndexer hard to use in practice with hundreds
of input columns.
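
The difference between the two approaches can be sketched as follows. This is a minimal illustration in plain Python, not Spark's actual implementation. The helper names (sort_labels, fit_sequential, fit_parallel) and the sample data are hypothetical, and the ordering (descending frequency, ties broken alphabetically) is an assumption made for the example. Python threads are used only to show the structure of sorting each column independently; they do not demonstrate a real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sort_labels(values):
    """Return the distinct labels of one column, most frequent first.

    Assumed ordering for this sketch: descending frequency, with ties
    broken alphabetically.
    """
    counts = Counter(values)
    return [label for label, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def fit_sequential(columns):
    """Sort the labels of each column one after another."""
    return {name: sort_labels(values) for name, values in columns.items()}


def fit_parallel(columns, max_workers=4):
    """Sort the labels of each column as an independent task in a pool."""
    names = list(columns)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(lambda name: sort_labels(columns[name]), names))
    return dict(zip(names, results))


# Hypothetical sample data: two input columns.
columns = {
    "color": ["red", "blue", "red", "green", "red", "blue"],
    "size": ["S", "M", "M", "L", "M"],
}

sequential = fit_sequential(columns)
parallel = fit_parallel(columns)
```

Because each column's labels are sorted independently of every other column, both functions return the same result, and the per-column work can be distributed without coordination.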