Spark 1.5 only supports getting feature importances for
RandomForestClassificationModel and RandomForestRegressionModel, and only in Scala.
PySpark supports this feature as of 2.0.0.
It's very straightforward, just a few lines of code:
from pyspark.ml.classification import RandomForestClassifier

rf = RandomForestClassifier(numTrees=3, maxDepth=2, labelCol="indexed", seed=42)
model = rf.fit(td)
model.featureImportances
Then model.featureImportances returns the feature importances as a Vector, with one entry per feature.
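If it helps, here is a minimal sketch of ranking features by importance once you have that Vector. It uses plain Python (no Spark session); the importance values and column names below are made up for illustration, standing in for list(model.featureImportances) and your actual feature columns:

```python
# Hypothetical values standing in for list(model.featureImportances);
# the feature names are made up for illustration.
importances = [0.55, 0.10, 0.35]
feature_names = ["age", "income", "score"]

# Pair each feature with its importance and sort, largest first.
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranked:
    print(name, score)
```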
Thanks
Yanbo
2016-07-12 10:30 GMT-07:00 pseudo oduesp :
> Hi,
> I use PySpark 1.5.0.
> Can I ask how to get feature importances for a random forest
> algorithm in PySpark? Please give me an example.
> Thanks in advance.
>