Joseph K. Bradley created SPARK-3041:
----------------------------------------
Summary: DecisionTree: isSampleValid indexing incorrect
Key: SPARK-3041
URL: https://issues.apache.org/jira/browse/SPARK-3041
Project: Spark
Issue Type: Bug
Components: MLlib
Reporter: Joseph K. Bradley
In DecisionTree, isSampleValid treats unordered categorical features
incorrectly: it treats the bins as if indexed by feature values, rather than
by subsets of values/categories.
This bug appears only for unordered features (multi-class classification with
categorical features of low arity).
Proposed fix: Index bins correctly for unordered categorical features.
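
To illustrate the difference between the two indexing schemes, here is a
minimal sketch in Python. It is not the MLlib code. The bitmask encoding of
subsets and all function names (subset_for_split, goes_left,
goes_left_by_value) are assumptions made up for this example; only the idea
that bins for unordered features correspond to subsets of categories comes
from the description above.

```python
def subset_for_split(split_index: int, arity: int) -> frozenset:
    """Decode a split index into the subset of categories it selects.

    Assumed encoding (hypothetical, for illustration only): the bits of
    split_index + 1 mark which categories belong to the subset.
    """
    mask = split_index + 1
    return frozenset(c for c in range(arity) if (mask >> c) & 1)


def goes_left(feature_value: int, split_index: int, arity: int) -> bool:
    """Subset-based indexing: test membership in the split's category subset."""
    return feature_value in subset_for_split(split_index, arity)


def goes_left_by_value(feature_value: int, split_index: int) -> bool:
    """Mistaken indexing: treat the split/bin index as if it were a feature value."""
    return feature_value == split_index


if __name__ == "__main__":
    arity = 3
    # Under the assumed encoding, split index 2 selects categories {0, 1}.
    print(sorted(subset_for_split(2, arity)))
    # A sample with category 0 belongs to that subset ...
    print(goes_left(0, 2, arity))
    # ... but value-based indexing compares 0 against the index 2 and disagrees.
    print(goes_left_by_value(0, 2))
```

With arity 3, the two schemes disagree for category 0 at split index 2, which
is the kind of mismatch that would make a validity check accept or reject the
wrong samples.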