[ https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493455#comment-14493455 ]
Michael Kuhlen commented on SPARK-3727:
---------------------------------------
[~josephkb] The design document is great, thanks for sharing. Looks like a
great step forward. I'd be happy to work on either or both of the subtasks, but
note that I'm going to have to be a "weekend warrior" on this stuff (busy at
work during the week). I'm going to start by familiarizing myself with spark.ml
and the new API, to see if and how to port over the changes I've made so far.
> Trees and ensembles: More prediction functionality
> --------------------------------------------------
>
> Key: SPARK-3727
> URL: https://issues.apache.org/jira/browse/SPARK-3727
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Joseph K. Bradley
>
> DecisionTree and RandomForest currently predict the most likely label for
> classification and the mean for regression. Other info about predictions
> would be useful.
> For classification: estimated probability of each possible label
> For regression: variance of estimate
> RandomForest could also create aggregate predictions in multiple ways:
> * Predict mean or median value for regression.
> * Compute variance of estimates (across all trees) for both classification
> and regression.
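
For illustration only, the aggregate predictions described in the issue could be sketched as below. This is plain Python, not the MLlib or spark.ml API: the function names `aggregate_regression` and `class_probabilities` are hypothetical, and estimating class probabilities from the fraction of tree votes is an assumption (averaging per-leaf class distributions would be another option).

```python
from collections import Counter
from statistics import mean, median, pvariance


def aggregate_regression(tree_predictions):
    """Combine the per-tree regression outputs for a single input row.

    tree_predictions: one float per tree in the ensemble (hypothetical input).
    Returns the mean, the median, and the population variance across trees.
    """
    return {
        "mean": mean(tree_predictions),
        "median": median(tree_predictions),
        "variance": pvariance(tree_predictions),
    }


def class_probabilities(tree_votes, num_classes):
    """Estimate per-label probabilities as the fraction of trees voting for each label.

    tree_votes: one integer label in [0, num_classes) per tree (hypothetical input).
    Assumption: vote fractions stand in for probabilities.
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return [counts.get(label, 0) / total for label in range(num_classes)]


if __name__ == "__main__":
    # Three trees predicting a regression target for the same row.
    print(aggregate_regression([1.0, 2.0, 6.0]))
    # Four trees voting among three possible labels.
    print(class_probabilities([0, 1, 1, 1], num_classes=3))
```

The variance here measures disagreement between trees, which is only one possible reading of "variance of estimate"; a per-leaf variance within a single tree would need the training labels that reached each leaf.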