[ https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492931#comment-14492931 ]
Joseph K. Bradley commented on SPARK-3727:
------------------------------------------
[~maxkaznady] Implementations should be done in Scala; the PySpark API will be
a wrapper. The API update JIRA I'm referencing should clear up some of the
other questions.
> Trees and ensembles: More prediction functionality
> --------------------------------------------------
>
> Key: SPARK-3727
> URL: https://issues.apache.org/jira/browse/SPARK-3727
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Joseph K. Bradley
>
> DecisionTree and RandomForest currently predict the most likely label for
> classification and the mean for regression. Other info about predictions
> would be useful.
> * For classification: estimated probability of each possible label
> * For regression: variance of estimate
> RandomForest could also create aggregate predictions in multiple ways:
> * Predict mean or median value for regression.
> * Compute variance of estimates (across all trees) for both classification
> and regression.
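
To make the aggregate predictions listed above concrete, here is a minimal sketch in plain Python. It is not the MLlib API and not the implementation proposed in this issue: the function names are hypothetical, the per-tree outputs are assumed to be already collected into a list for a single example, class probabilities are assumed to be the fraction of trees voting for each label, and the variance is assumed to be the population variance across trees.

```python
from collections import Counter
from statistics import mean, median, pvariance


def aggregate_regression(tree_predictions):
    """Summarize one example's per-tree regression outputs (hypothetical helper)."""
    return {
        "mean": mean(tree_predictions),
        "median": median(tree_predictions),
        # Assumption: population variance across all trees.
        "variance": pvariance(tree_predictions),
    }


def class_probabilities(tree_votes, labels):
    """Estimate each label's probability as the fraction of trees voting for it."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return {label: counts[label] / total for label in labels}


# Made-up per-tree outputs for a single example.
summary = aggregate_regression([2.0, 3.0, 4.0, 7.0])
probs = class_probabilities([1, 0, 1, 1], labels=[0, 1])
```

A real implementation might instead average per-leaf class distributions rather than counting hard votes; the sketch only illustrates the kinds of quantities the description asks for.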
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)