Joseph K. Bradley created SPARK-11535:
-----------------------------------------
Summary: StringIndexer should handle empty String specially
Key: SPARK-11535
URL: https://issues.apache.org/jira/browse/SPARK-11535
Project: Spark
Issue Type: Improvement
Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
StringIndexer treats an empty string like any other string and indexes it
correctly. However, the corresponding feature attribute name is then set to the
empty string, which causes a failure in OneHotEncoder. We should handle this
case specially by naming the attribute something like "(empty_string)" (and
maybe appending an integer if that string already exists).
See [https://issues.apache.org/jira/browse/SPARK-10513] for a description of
the problem.
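
A rough sketch of the proposed naming rule, in plain Python rather than against
the Spark API. The helper name attribute_names, the placeholder parameter, and
the "_<n>" suffix format are assumptions made for illustration; the issue only
suggests "something like" this scheme and does not fix the details.

```python
def attribute_names(labels, placeholder="(empty_string)"):
    """Return one attribute name per label, never using "" as a name.

    Hypothetical helper: non-empty labels are kept as-is, and an empty
    label is renamed to `placeholder`, with an integer suffix appended
    if the placeholder already collides with an existing name.
    """
    # Names that are already in use by real (non-empty) labels.
    taken = {label for label in labels if label != ""}
    names = []
    for label in labels:
        if label != "":
            names.append(label)
            continue
        # Find the first free candidate: placeholder, placeholder_1, ...
        candidate = placeholder
        suffix = 0
        while candidate in taken:
            suffix += 1
            candidate = f"{placeholder}_{suffix}"
        taken.add(candidate)
        names.append(candidate)
    return names


if __name__ == "__main__":
    print(attribute_names(["a", "", "b"]))
    print(attribute_names(["(empty_string)", ""]))
```

The suffix is only added on a collision, so in the common case the attribute is
simply called "(empty_string)" and every name handed to OneHotEncoder is
non-empty.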