You are right that this isn't implemented. I presume you could propose
a PR for it. The impurity calculator implementations already receive
the per-category counts, so the information is available during
training. The only drawback I see is having to store N probabilities at
each leaf (one per class) instead of 1.
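
To make the storage point concrete, here is a rough sketch in plain Python
(not MLlib code; the function name class_probabilities is made up for
illustration) of what a leaf would have to keep if it held the full
distribution rather than only the probability of the predicted label. It
uses the counts from your Node A example:

```python
from fractions import Fraction


def class_probabilities(counts):
    """Turn per-class label counts at a node into per-class probabilities.

    `counts[i]` is the number of training samples with label i in the node.
    Hypothetical helper for illustration; this is not part of Spark MLlib.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    # Exact fractions, so 3 out of 9 is stored as 1/3 rather than 0.333...
    return [Fraction(c, total) for c in counts]


# Node A from the question: 4 of label 0.0, 2 of label 1.0, 3 of label 2.0.
counts = [4, 2, 3]
probs = class_probabilities(counts)

# The predicted label is the most frequent class; its probability is the
# single value a leaf keeps today.
predicted = max(range(len(probs)), key=lambda i: probs[i])
predicted_prob = probs[predicted]
```

With N classes the leaf holds N values (here three) instead of the single
predicted_prob, which is the extra cost I mentioned.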

On Wed, Jan 21, 2015 at 3:36 PM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
> Hi,
>
> I use DecisionTree for multiclass classification.
> I can get the probability of the predicted label for every node in the
> decision tree from node.predict().prob(). Is it possible to retrieve or
> count the probability of every possible label class in the node?
> To be more clear:
> Say in Node A there are 4 of label 0.0, 2 of label 1.0 and 3 of label 2.0.
> If I'm correct, node.predict().prob() is 4/9 in this case. I need the values 2/9 and
> 3/9 for the 2 other labels.
>
> It would be great to retrieve the exact count of label classes ([4,2,3] in
> the example), but I don't think that's possible now. Is something like this
> planned for a future release?
>
> Thanks!

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
