[ 
https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492887#comment-14492887
 ] 

Joseph K. Bradley commented on SPARK-3727:
------------------------------------------

Thanks for your initial work on this ticket!  The main issue with this 
extension is API stability: modifying the existing classes would also require 
us to update model save/load versioning, default constructors to ensure 
binary compatibility, etc.

I just linked a JIRA which discusses updating the tree and ensemble APIs under 
the spark.ml package, which will permit us to redesign the APIs (and make it 
easier to specify class probabilities or stats for regression).  What I'd like 
to do is get the tree API updates in (this week), and then we could work 
together to make the class probabilities available under the new API.

Does that sound good?

Also, if you're new to contributing to Spark, please make sure to check out: 
[https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark]

Thanks!

> DecisionTree, RandomForest: More prediction functionality
> ---------------------------------------------------------
>
>                 Key: SPARK-3727
>                 URL: https://issues.apache.org/jira/browse/SPARK-3727
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>
> DecisionTree and RandomForest currently predict the most likely label for 
> classification and the mean for regression.  Other info about predictions 
> would be useful.
> For classification: estimated probability of each possible label
> For regression: variance of estimate
> RandomForest could also create aggregate predictions in multiple ways:
> * Predict mean or median value for regression.
> * Compute variance of estimates (across all trees) for both classification 
> and regression.
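
To make the requested outputs concrete, here is a rough sketch in plain Python 
(not Spark code). It assumes each tree can expose the label counts at the leaf 
an instance lands in, and that a forest can expose its per-tree predictions; 
every function name below is hypothetical and is not part of the MLlib API.

```python
from statistics import mean, median, pvariance


def class_probabilities(leaf_label_counts):
    """Turn the label counts at one leaf into estimated class probabilities.

    Assumption: leaf_label_counts maps label -> number of training
    instances with that label that reached the leaf.
    """
    total = sum(leaf_label_counts.values())
    if total == 0:
        raise ValueError("leaf has no training instances")
    return {label: count / total for label, count in leaf_label_counts.items()}


def forest_class_probabilities(per_tree_probs):
    """Average the per-tree probability estimates over all trees.

    Labels missing from a tree's estimate are treated as probability 0.
    """
    labels = set().union(*per_tree_probs)
    n = len(per_tree_probs)
    return {
        label: sum(probs.get(label, 0.0) for probs in per_tree_probs) / n
        for label in labels
    }


def aggregate_regression(per_tree_predictions):
    """Summarize per-tree regression predictions.

    Returns the mean, the median, and the population variance of the
    estimates across trees.
    """
    return {
        "mean": mean(per_tree_predictions),
        "median": median(per_tree_predictions),
        "variance": pvariance(per_tree_predictions),
    }
```

For example, class_probabilities({0: 3, 1: 1}) gives {0: 0.75, 1: 0.25}, and 
aggregate_regression([1.0, 2.0, 6.0]) reports a mean of 3.0 and a median of 
2.0.  Averaging per-tree probabilities is only one possible way to combine 
trees; majority voting over per-tree labels is another.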



