[ https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511133#comment-14511133 ]
Oscar Olmedo commented on SPARK-3727:
-------------------------------------

Hello,

Here is a [link to my fork | https://github.com/apache/spark/compare/master...oscaroboto:master] of proposed changes to return probabilities within mllib for DecisionTreeModel and RandomForestModel. However, as [~josephkb] pointed out, we should look into the spark.ml ProbabilisticClassifier class. I'm also not familiar with spark.ml, so I will take some time to familiarize myself with it.

Thanks for all the effort put into getting probabilities.

Oscar

> Trees and ensembles: More prediction functionality
> --------------------------------------------------
>
>                 Key: SPARK-3727
>                 URL: https://issues.apache.org/jira/browse/SPARK-3727
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>
> DecisionTree and RandomForest currently predict the most likely label for
> classification and the mean for regression. Other info about predictions
> would be useful.
> For classification: estimated probability of each possible label
> For regression: variance of estimate
> RandomForest could also create aggregate predictions in multiple ways:
> * Predict mean or median value for regression.
> * Compute variance of estimates (across all trees) for both classification
>   and regression.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
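The aggregate predictions listed in the issue description can be illustrated with a minimal, library-free sketch. This is not the MLlib or spark.ml implementation and does not reflect the changes in the linked fork; the function names `class_probabilities` and `regression_summary`, and the idea of passing per-tree outputs as plain Python lists, are assumptions made purely for illustration. Using the population variance across trees is likewise only one possible choice.

```python
from collections import Counter
from statistics import mean, median, pvariance


def class_probabilities(tree_votes, num_classes):
    """Estimate each label's probability as the fraction of trees voting for it.

    tree_votes: list of integer labels, one per tree (hypothetical input shape).
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return [counts.get(label, 0) / total for label in range(num_classes)]


def regression_summary(tree_predictions):
    """Aggregate per-tree regression outputs: mean, median, and variance across trees."""
    return {
        "mean": mean(tree_predictions),
        "median": median(tree_predictions),
        # Population variance across the ensemble; a sample variance is an alternative.
        "variance": pvariance(tree_predictions),
    }


if __name__ == "__main__":
    # Five trees voting among three labels.
    print(class_probabilities([0, 1, 1, 2, 1], num_classes=3))
    # Three trees predicting a continuous target.
    print(regression_summary([2.0, 4.0, 6.0]))
```

A vote-fraction estimate like this treats every tree equally; an alternative would average the class distributions stored at each tree's leaf, which this sketch does not attempt.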