Hi,

If you train an org.apache.spark.ml.classification.RandomForestClassificationModel, you can't save it; attempts to do so yield the following error:
16/03/18 14:12:44 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.UnsupportedOperationException:
Pipeline write will fail on this Pipeline because it contains a stage
which does not implement Writable. Non-Writable stage: rfc_704981ba3f48
of type class org.apache.spark.ml.classification.RandomForestClassifier
    at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:218)
    at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:215)

This appears to be a known bug: https://issues.apache.org/jira/browse/SPARK-13784, related to https://issues.apache.org/jira/browse/SPARK-11888.

My question is whether there's a workaround, given that these bugs will remain unresolved at least until 2.0.0.

Regards,
James
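One possible interim workaround, sketched below under the assumption that the trained model object is java.io.Serializable (ML pipeline stages extend Serializable): persist it on the driver with plain Java serialization until proper ML persistence support arrives. The saveModel/loadModel helper names here are made up for illustration, not Spark API:

```scala
import java.io.{FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

object ModelIO {
  // Hypothetical helper: write any Serializable object (e.g. a trained
  // RandomForestClassificationModel) to a local file on the driver.
  def saveModel(model: AnyRef, path: String): Unit = {
    val oos = new ObjectOutputStream(new FileOutputStream(path))
    try oos.writeObject(model) finally oos.close()
  }

  // Hypothetical helper: read the object back and cast it to the
  // expected model type. The same Spark version must be on the
  // classpath, since Java serialization is not version-stable.
  def loadModel[T](path: String): T = {
    val ois = new ObjectInputStream(new FileInputStream(path))
    try ois.readObject().asInstanceOf[T] finally ois.close()
  }
}
```

Note the caveats: this writes to the driver's local filesystem (not HDFS), and the serialized bytes are tied to the exact Spark version, so it is a stopgap rather than a portable persistence format.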