You are right that this isn't implemented. I presume you could propose a PR for this. The impurity calculator implementations already receive the category counts, so the information is available when the tree is built. The only drawback I see is having to store N probabilities at each leaf (one per class) instead of 1.
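
For illustration only, here is a minimal sketch in plain Python of how the per-class probabilities would fall out of the category counts, using the [4,2,3] example from the message quoted below. This is not MLlib code: the function names class_probabilities and predict_with_prob are hypothetical, and I am assuming the counts arrive as a list indexed by class label.

```python
from fractions import Fraction


def class_probabilities(counts):
    """Turn per-class sample counts at a node into per-class probabilities.

    `counts` is assumed to be indexed by class label (0, 1, 2, ...).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    # Exact fractions keep the example easy to check by hand.
    return [Fraction(c, total) for c in counts]


def predict_with_prob(counts):
    """Return (predicted label, probability of that label) for a node."""
    probs = class_probabilities(counts)
    # The predicted label is the class with the largest count.
    label = max(range(len(probs)), key=lambda i: probs[i])
    return float(label), probs[label]


counts = [4, 2, 3]  # Node A: 4 of label 0.0, 2 of label 1.0, 3 of label 2.0
probs = class_probabilities(counts)
label, prob = predict_with_prob(counts)
```

With these counts the single stored value corresponds to prob (4/9 for label 0.0), while probs holds all three values. Storing the raw counts instead of the probabilities would cost the same N values per leaf and would also give you the exact counts you asked about.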

On Wed, Jan 21, 2015 at 3:36 PM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
> Hi,
>
> I use DecisionTree for multi class classification.
> I can get the probability of the predicted label for every node in the
> decision tree from node.predict().prob(). Is it possible to retrieve or
> count the probability of every possible label class in the node?
> To be more clear:
> Say in Node A there are 4 of label 0.0, 2 of label 1.0 and 3 of label 2.0.
> If I'm correct predict.prob() is 4/9 in this case. I need the values 2/9 and
> 3/9 for the 2 other labels.
>
> It would be great to retrieve the exact count of label classes ([4,2,3] in
> the example) but I don't think thats possible now. Is something like this
> planned for a future release?
>
> Thanks!