Joseph K. Bradley created SPARK-7126:
----------------------------------------
Summary: For spark.ml Classifiers, automatically index labels if they are not yet indexed
Key: SPARK-7126
URL: https://issues.apache.org/jira/browse/SPARK-7126
Project: Spark
Issue Type: Improvement
Components: ML
Affects Versions: 1.4.0
Reporter: Joseph K. Bradley
Now that we have StringIndexer, we could have
spark.ml.classification.Classifier (the abstraction) automatically handle label
indexing if the labels are not yet indexed.
This would require a bit of design:
* Should predict() output the original labels or the indices?
* How should we notify users that the labels are being automatically indexed?
* How should we expose the resulting label-to-index mapping to users?
* If multiple parts of a Pipeline automatically index labels, what do we need
to do to make sure they are consistent?