As Jacob mentions, the tree object is written in Cython, and is pretty heavy going.
However, dividing clf.tree_.value by the class weights (clf.class_weight.values(), converted to an array) might work for you? If you are using sample_weight as well, you would additionally need to scale along the other axis.

Alternatively, if you are only interested in the leaf nodes, DecisionTreeClassifier has an apply() method which returns the leaf ID for any data passed to it. Use the original data, and then some light Pandas pivoting should get you to what you need.

On Sun, Aug 30, 2015 at 11:54 AM, Jacob Schreiber <jmschreibe...@gmail.com> wrote:

> You would have to modify sklearn/tree/_tree.pyx. See the Tree class near
> the bottom, and its list of properties. An issue may be that you would have
> to extensively modify the code, as you would need to modify both splitter
> and criterion objects as well. If you are doing this for your own personal
> use, it may be easier to write a small script which successively applies
> the rules of the tree to your data to see how many points from each class
> are present.
>
> On Sun, Aug 30, 2015 at 10:50 AM, Rex X <dnsr...@gmail.com> wrote:
>
>> Hi Jacob and Trevor,
>>
>> Which part of the source code we can modify to add a new attribute to
>> DecisionTreeClassifier.tree_, to count the number of samples of each
>> class within each node?
>>
>> Could you point me the right direction?
>>
>> Best,
>> Rex
>>
>> On Sun, Aug 30, 2015 at 8:12 AM, Jacob Schreiber <jmschreibe...@gmail.com> wrote:
>>
>>> This value is computed while building the tree, but is not kept in the
>>> tree.
>>>
>>> On Sun, Aug 30, 2015 at 7:02 AM, Rex X <dnsr...@gmail.com> wrote:
>>>
>>>> DecisionTreeClassifier.tree_.n_node_samples is the total number of
>>>> samples in all classes of one node, and
>>>> DecisionTreeClassifier.tree_.value is the computed weight for each
>>>> class of one node. Only if the sample_weight and class_weight of this
>>>> DecisionTreeClassifier
>>>> is one, then this attribute equals the number of samples of each class of
>>>> one node.
>>>>
>>>> But for the general case with a given sample_weight and class_weight,
>>>> is there any attribute telling us the number of samples of each class
>>>> within one node?
>>>>
>>>>
>>>> import pandas as pd
>>>> from sklearn.datasets import load_iris
>>>> from sklearn import tree
>>>> import sklearn
>>>>
>>>> iris = sklearn.datasets.load_iris()
>>>> clf = tree.DecisionTreeClassifier(class_weight={0 : 0.30, 1: 0.3,
>>>> 2:0.4}, max_features="auto")
>>>> clf.fit(iris.data, iris.target)
>>>>
>>>>
>>>> # the total number of samples in all classes of each node
>>>> clf.tree_.n_node_samples
>>>>
>>>> # the computed weight for each class of each node
>>>> clf.tree_.value
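For reference, here is a rough sketch of the apply() suggestion from the top of this message, using the iris data and class weights from the original question. The choice of pd.crosstab for the "light Pandas pivoting" and the random_state value are my own assumptions, not something settled in the thread. The second half goes a step further and counts samples per class in every node (not only the leaves) via decision_path(); that assumes a scikit-learn version that provides decision_path, which nobody in the thread mentioned, so treat it as one possible way of "applying the rules of the tree to your data" rather than the recommended one.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Class weights taken from the original question; random_state is an
# assumption added here only to make the fitted tree reproducible.
clf = DecisionTreeClassifier(class_weight={0: 0.3, 1: 0.3, 2: 0.4},
                             random_state=0)
clf.fit(X, y)

# --- Leaves only: apply() + a pandas cross-tabulation -------------------
# apply() returns the ID of the leaf each sample ends up in.
leaf_ids = clf.apply(X)

# Rows: leaf node IDs, columns: class labels, cells: unweighted counts.
leaf_counts = pd.crosstab(
    pd.Series(leaf_ids, name="leaf_id"),
    pd.Series(y, name="class"),
)
print(leaf_counts)

# --- All nodes: decision_path() (assumed to be available) ---------------
# Sparse indicator matrix of shape (n_samples, n_nodes); entry (i, j) is
# nonzero when sample i passes through node j.
node_indicator = clf.decision_path(X)

# One-hot encode the labels, shape (n_samples, n_classes).
class_idx = np.searchsorted(clf.classes_, y)
one_hot = np.eye(len(clf.classes_), dtype=int)[class_idx]

# (n_nodes, n_samples) @ (n_samples, n_classes) -> per-node class counts.
node_counts = np.asarray(node_indicator.T @ one_hot).astype(int)
print(node_counts)
```

Both tables hold plain sample counts, independent of class_weight, so each row should sum to the matching entry of clf.tree_.n_node_samples. If sample_weight was used when fitting, these are still raw counts, not weighted ones.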
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general