[ https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511133#comment-14511133 ]
Oscar Olmedo commented on SPARK-3727:
-------------------------------------
Hello,
Here is a [link to my fork|https://github.com/apache/spark/compare/master...oscaroboto:master] with proposed
changes that return probabilities within mllib for DecisionTreeModel and
RandomForestModel. However, as [~josephkb] pointed out, we should look into the
spark.ml ProbabilisticClassifier class. I am not yet familiar with spark.ml, so
I will take some time to familiarize myself with it.
Thanks for all the effort put into getting probabilities.
Oscar
> Trees and ensembles: More prediction functionality
> --------------------------------------------------
>
> Key: SPARK-3727
> URL: https://issues.apache.org/jira/browse/SPARK-3727
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Joseph K. Bradley
>
> DecisionTree and RandomForest currently predict the most likely label for
> classification and the mean for regression. Other info about predictions
> would be useful.
> For classification: estimated probability of each possible label
> For regression: variance of estimate
> RandomForest could also create aggregate predictions in multiple ways:
> * Predict mean or median value for regression.
> * Compute variance of estimates (across all trees) for both classification
> and regression.
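
The aggregate predictions listed in the issue description can be illustrated with a small sketch. This is plain Python, not the Spark API: the function names `class_probabilities` and `regression_summary` are hypothetical, and the inputs are assumed to be the per-tree predictions for a single data point. Estimating label probabilities from the fraction of trees voting for each label is only one possible choice; averaging per-leaf class distributions would be another.

```python
from collections import Counter
from statistics import mean, median, pvariance


def class_probabilities(tree_votes, num_classes):
    """Estimate P(label) as the fraction of trees that voted for each label.

    tree_votes: list of integer labels, one per tree (hypothetical input).
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return [counts.get(label, 0) / total for label in range(num_classes)]


def regression_summary(tree_predictions):
    """Aggregate per-tree regression outputs as mean, median, and variance."""
    return {
        "mean": mean(tree_predictions),
        "median": median(tree_predictions),
        # Population variance across trees, a rough measure of ensemble spread.
        "variance": pvariance(tree_predictions),
    }


if __name__ == "__main__":
    # Four trees voting on a binary label for one data point.
    print(class_probabilities([0, 1, 1, 1], num_classes=2))
    # Four trees predicting a continuous value for one data point.
    print(regression_summary([1.0, 2.0, 3.0, 6.0]))
```

The median is less sensitive than the mean to a single outlying tree, which is one reason to offer both aggregates.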