I do not recall seeing support for missing values in MLlib's DecisionTree. Categorical values are encoded as doubles: 0.0, 1.0, 2.0, ... When training the model, you indicate which features should be interpreted as categorical with the categoricalFeaturesInfo parameter, which maps each feature's index to the count of distinct categorical values for that feature.
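
To make the encoding concrete, here is a minimal sketch in plain Python (no Spark needed to run it). The helper name encode_categorical and the sample rows are made up for illustration; only the 0.0, 1.0, 2.0, ... encoding and the index-to-count map come from the description above.

```python
def encode_categorical(rows, categorical_columns):
    """Replace category values with 0.0, 1.0, 2.0, ... per column.

    Returns (encoded_rows, categorical_features_info), where
    categorical_features_info maps feature index -> number of distinct values.
    This is a hypothetical helper, not part of MLlib.
    """
    # Build a value -> code mapping for each categorical column.
    # Codes are assigned in sorted order so the result is deterministic.
    mappings = {}
    for col in categorical_columns:
        distinct = sorted({row[col] for row in rows})
        mappings[col] = {value: float(code) for code, value in enumerate(distinct)}

    # Encode categorical columns; pass numeric columns through as floats.
    encoded = [
        [mappings[i][v] if i in mappings else float(v) for i, v in enumerate(row)]
        for row in rows
    ]
    info = {col: len(mapping) for col, mapping in mappings.items()}
    return encoded, info


# Illustrative data: columns 0 and 2 are categorical, column 1 is numeric.
rows = [
    ["red", 1.5, "small"],
    ["blue", 2.0, "large"],
    ["red", 0.3, "medium"],
]
features, categorical_features_info = encode_categorical(rows, categorical_columns=[0, 2])
print(features)
print(categorical_features_info)
```

With PySpark, each encoded row would then be wrapped in a LabeledPoint together with its label, and the resulting map passed as the categoricalFeaturesInfo argument when training (for example to DecisionTree.trainClassifier).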

On Sun, Jan 11, 2015 at 6:54 AM, Carter <gyz...@hotmail.com> wrote:
> Hi, I am new to the MLlib in Spark. Can the DecisionTree model in MLlib deal
> with missing values? If so, what data structure should I use for the input?
>
> Moreover, my data has categorical features, but the LabeledPoint requires
> "double" data type, in this case what can I do?
>
> Thank you very much.