As Jacob mentions, the tree object is written in Cython and is pretty
heavy going.

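However, with NumPy imported as np,

    clf.tree_.value / np.array([clf.class_weight[c] for c in clf.classes_])

might work for you? Indexing the dict by clf.classes_ keeps the weights in
the same order as the class columns of tree_.value, which a bare
class_weight.values() does not guarantee.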

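If you are passing sample_weight as well, you would additionally need to
undo that scaling. Those weights are per sample rather than per class, so
a plain division only recovers counts when they are uniform.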

Alternatively, if you are only interested in the leaf nodes,
DecisionTreeClassifier has an apply() method which returns the leaf index
for each sample passed to it. Call it on the original training data, and
some light pandas pivoting should get you what you need.
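
In case it helps, here is a rough sketch of that second route. It is only
an illustration, not something from scikit-learn itself: the variable names
and random_state=0 are my own choices, and pd.crosstab is just one way of
doing the pivot.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Same class weights as in the original question; random_state is an
# assumption added here only so the sketch is reproducible.
clf = DecisionTreeClassifier(class_weight={0: 0.3, 1: 0.3, 2: 0.4},
                             random_state=0)
clf.fit(X, y)

# apply() returns, for each sample, the index of the leaf it lands in.
leaf_ids = clf.apply(X)

# Cross-tabulate leaf index against class label: one row per leaf,
# one column per class, each cell an unweighted sample count.
leaf_class_counts = pd.crosstab(
    pd.Series(leaf_ids, name="leaf_id"),
    pd.Series(y, name="class"),
)
print(leaf_class_counts)
```

Because the counts come straight from the data, class_weight and
sample_weight do not enter into them. Each row should sum to
clf.tree_.n_node_samples for that leaf, which makes a handy sanity check.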



On Sun, Aug 30, 2015 at 11:54 AM, Jacob Schreiber <jmschreibe...@gmail.com>
wrote:

> You would have to modify sklearn/tree/_tree.pyx. See the Tree class near
> the bottom, and its list of properties. An issue may be that you would have
> to extensively modify the code, as you would need to modify both splitter
> and criterion objects as well. If you are doing this for your own personal
> use, it may be easier to write a small script which successively applies
> the rules of the tree to your data to see how many points from each class
> are present.
>
> On Sun, Aug 30, 2015 at 10:50 AM, Rex X <dnsr...@gmail.com> wrote:
>
>> Hi Jacob and Trevor,
>>
>> Which part of the source code can we modify to add a new attribute to
>> DecisionTreeClassifier.tree_ that counts the number of samples of each
>> class within each node?
>>
>> Could you point me in the right direction?
>>
>> Best,
>> Rex
>>
>>
>>
>>
>> On Sun, Aug 30, 2015 at 8:12 AM, Jacob Schreiber <jmschreibe...@gmail.com
>> > wrote:
>>
>>> This value is computed while building the tree, but is not kept in the
>>> tree.
>>>
>>> On Sun, Aug 30, 2015 at 7:02 AM, Rex X <dnsr...@gmail.com> wrote:
>>>
>>>> DecisionTreeClassifier.tree_.n_node_samples is the total number of
>>>> samples across all classes in a node, and
>>>> DecisionTreeClassifier.tree_.value is the computed weight for each
>>>> class in a node. Only when the sample_weight and class_weight of the
>>>> DecisionTreeClassifier are all one does this attribute equal the
>>>> number of samples of each class in a node.
>>>>
>>>> But for the general case with a given sample_weight and class_weight,
>>>> is there any attribute telling us the number of samples of each class
>>>> within one node?
>>>>
>>>>
>>>> import pandas as pd
>>>> from sklearn.datasets import load_iris
>>>> from sklearn import tree
>>>>
>>>> iris = load_iris()
>>>> # max_features="sqrt" is what "auto" meant for classifiers; newer
>>>> # scikit-learn releases no longer accept "auto"
>>>> clf = tree.DecisionTreeClassifier(
>>>>     class_weight={0: 0.3, 1: 0.3, 2: 0.4}, max_features="sqrt")
>>>> clf.fit(iris.data, iris.target)
>>>>
>>>>
>>>> # the total number of samples in all classes of each node
>>>> clf.tree_.n_node_samples
>>>>
>>>> # the computed weight for each class of each node
>>>> clf.tree_.value
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> Scikit-learn-general@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general