You would have to modify sklearn/tree/_tree.pyx. See the Tree class near the bottom, and its list of properties. The catch is that this would require extensive changes, since you would also need to modify both the splitter and criterion objects. If you are doing this for your own personal use, it may be easier to write a small script that successively applies the rules of the tree to your data and counts how many points from each class reach each node.
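A minimal sketch of such a script, assuming the parallel-array layout that tree_ exposes (children_left, children_right, feature, threshold, with -1 as the child index of a leaf, and samples going left when the feature value is <= the threshold). The function name class_counts_per_node and the hand-built one-split tree are only for illustration; they are not part of scikit-learn.

```python
def class_counts_per_node(children_left, children_right, feature, threshold,
                          X, y, n_classes):
    """Count how many samples of each class pass through every tree node.

    The first four arguments are parallel arrays indexed by node id, in the
    layout assumed above. Returns a list of per-node lists of class counts.
    """
    n_nodes = len(children_left)
    counts = [[0] * n_classes for _ in range(n_nodes)]
    for row, label in zip(X, y):
        node = 0  # every sample starts at the root
        while True:
            counts[node][label] += 1  # unweighted: one sample, one count
            if children_left[node] == -1:  # -1 marks a leaf
                break
            # Apply this node's rule: go left if the value is <= threshold.
            if row[feature[node]] <= threshold[node]:
                node = children_left[node]
            else:
                node = children_right[node]
    return counts


# Hypothetical toy tree: node 0 splits on feature 0 at 2.5,
# node 1 is the left leaf, node 2 is the right leaf.
children_left = [1, -1, -1]
children_right = [2, -1, -1]
feature = [0, -2, -2]
threshold = [2.5, -2.0, -2.0]

X = [[1.0], [2.0], [3.0], [4.0], [2.5]]
y = [0, 0, 1, 1, 1]

counts = class_counts_per_node(children_left, children_right, feature,
                               threshold, X, y, n_classes=2)
print(counts)  # [[2, 3], [2, 1], [0, 2]]
```

For a fitted classifier the arrays would come from clf.tree_.children_left, clf.tree_.children_right, clf.tree_.feature and clf.tree_.threshold, with y given as integer class indices. As far as I know the tree compares float32 feature values internally, so casting X to float32 first should keep borderline samples on the same side as in training. Because the counts ignore sample_weight and class_weight, each node's row should sum to clf.tree_.n_node_samples for that node.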

On Sun, Aug 30, 2015 at 10:50 AM, Rex X <dnsr...@gmail.com> wrote:

> Hi Jacob and Trevor,
>
> Which part of the source code we can modify to add a new attribute to
> DecisionTreeClassifier.tree_, to count the number of samples of each
> class within each node?
>
> Could you point me the right direction?
>
> Best,
> Rex
>
> On Sun, Aug 30, 2015 at 8:12 AM, Jacob Schreiber <jmschreibe...@gmail.com>
> wrote:
>
>> This value is computed while building the tree, but is not kept in the
>> tree.
>>
>> On Sun, Aug 30, 2015 at 7:02 AM, Rex X <dnsr...@gmail.com> wrote:
>>
>>> DecisionTreeClassifier.tree_.n_node_samples is the total number of
>>> samples in all classes of one node, and
>>> DecisionTreeClassifier.tree_.value is the computed weight for each
>>> class of one node. Only if the sample_weight and class_weight of this
>>> DecisionTreeClassifier is one, then this attribute equals the number
>>> of samples of each class of one node.
>>>
>>> But for the general case with a given sample_weight and class_weight, is
>>> there any attribute telling us the number of samples of each class
>>> within one node?
>>>
>>> import pandas as pd
>>> from sklearn.datasets import load_iris
>>> from sklearn import tree
>>> import sklearn
>>>
>>> iris = sklearn.datasets.load_iris()
>>> clf = tree.DecisionTreeClassifier(class_weight={0 : 0.30, 1: 0.3, 2:0.4}, max_features="auto")
>>> clf.fit(iris.data, iris.target)
>>>
>>> # the total number of samples in all classes of each node
>>> clf.tree_.n_node_samples
>>>
>>> # the computed weight for each class of each node
>>> clf.tree_.value

------------------------------------------------------------------------------
_______________________________________________ Scikit-learn-general mailing list Scikit-learn-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/scikit-learn-general