[scikit-learn] Need corresponding indices array of values in each split of a DecisionTreeClassifier

Nixon Raj Tue, 07 Feb 2017 06:28:19 -0800

For example, in the decision tree dot file below, I have 223 samples, which
split into [174, 49] at the first split and [110, 1] at the second split.


I would like to get the array of indices for the values of each split, like:

*[174, 49] and their corresponding indices (idx), like [[0, 1, 5,
7, ..., 200, 221], [3, 4, 6, ..., 199, 222, 223]]*

*[110, 1] and their corresponding indices (idx), like [[0, 5, ..., 200, 221],
[7]]*

Please help me
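
One possible way to get these index arrays is sketched below. It assumes
a fitted DecisionTreeClassifier and uses its decision_path method, which
reports the nodes each sample passes through. The synthetic data, the
max_depth setting, and the names node_indices, split_indices and
per_class_indices are illustrative assumptions, not part of the original
problem. Note that in the dot file "value" is the class count at a node
(174 and 49 at the root), while the root's two children hold 111 and 112
samples, so the sketch collects both kinds of index arrays.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the original 223-sample set is not available).
X, y = make_classification(n_samples=223, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X, y)

# decision_path returns a sparse (n_samples, n_nodes) indicator matrix:
# entry [i, j] is nonzero when sample i passes through node j.
indicator = clf.decision_path(X).tocsc()

# Row indices of the samples that reach each node.
node_indices = {
    node: np.sort(indicator[:, node].nonzero()[0])
    for node in range(clf.tree_.node_count)
}

# Indices at each node grouped by class, matching the "value = [a, b]" labels.
per_class_indices = {
    node: [idx[y[idx] == c] for c in clf.classes_]
    for node, idx in node_indices.items()
}

# For every internal node, the indices sent to its left and right child.
tree = clf.tree_
split_indices = {}
for node in range(tree.node_count):
    left, right = tree.children_left[node], tree.children_right[node]
    if left != right:  # leaves have both children set to -1
        split_indices[node] = (node_indices[left], node_indices[right])

for node, (left_idx, right_idx) in split_indices.items():
    print(node, len(left_idx), len(right_idx))
```

Here per_class_indices[0] holds one index array per class for the root
node, and split_indices[0] holds the indices routed to the "True" and
"False" branches of the first test.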

digraph Tree {
node [shape=box] ;
0 [label="X[0] <= 13.9191\nentropy = 0.7597\nsamples = 223\nvalue = [174, 49]"] ;
1 [label="X[1] <= 3.1973\nentropy = 0.0741\nsamples = 111\nvalue = [110, 1]"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="entropy = 0.0\nsamples = 109\nvalue = [109, 0]"] ;
1 -> 2 ;
3 [label="entropy = 1.0\nsamples = 2\nvalue = [1, 1]"] ;
1 -> 3 ;
4 [label="X[1] <= 3.1266\nentropy = 0.9852\nsamples = 112\nvalue = [64, 48]"] ;
0 -> 4 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
5 [label="X[2] <= -0.4882\nentropy = 0.7919\nsamples = 63\nvalue = [48, 15]"] ;
4 -> 5 ;
6 [label="entropy = 0.684\nsamples = 11\nvalue = [2, 9]"] ;
5 -> 6 ;
7 [label="X[2] <= 0.5422\nentropy = 0.5159\nsamples = 52\nvalue = [46, 6]"] ;
5 -> 7 ;
8 [label="entropy = 0.0\nsamples = 18\nvalue = [18, 0]"] ;
7 -> 8 ;
9 [label="X[2] <= 0.6497\nentropy = 0.6723\nsamples = 34\nvalue = [28, 6]"] ;
7 -> 9 ;
10 [label="entropy = 0.0\nsamples = 1\nvalue = [0, 1]"] ;
9 -> 10 ;
11 [label="X[2] <= 1.887\nentropy = 0.6136\nsamples = 33\nvalue = [28, 5]"] ;
9 -> 11 ;
12 [label="entropy = 0.0\nsamples = 12\nvalue = [12, 0]"] ;
11 -> 12 ;
13 [label="X[2] <= 2.6691\nentropy = 0.7919\nsamples = 21\nvalue = [16, 5]"] ;
11 -> 13 ;
14 [label="entropy = 0.8113\nsamples = 4\nvalue = [1, 3]"] ;
13 -> 14 ;
15 [label="entropy = 0.5226\nsamples = 17\nvalue = [15, 2]"] ;
13 -> 15 ;
16 [label="X[0] <= 17.3284\nentropy = 0.9113\nsamples = 49\nvalue = [16, 33]"] ;
4 -> 16 ;
17 [label="entropy = 0.9183\nsamples = 6\nvalue = [4, 2]"] ;
16 -> 17 ;
18 [label="X[2] <= 19.7048\nentropy = 0.8542\nsamples = 43\nvalue = [12, 31]"] ;
16 -> 18 ;
19 [label="X[2] <= 5.8511\nentropy = 0.8296\nsamples = 42\nvalue = [11, 31]"] ;
18 -> 19 ;
20 [label="X[0] <= 31.8916\nentropy = 0.878\nsamples = 37\nvalue = [11, 26]"] ;
19 -> 20 ;
21 [label="X[1] <= 3.3612\nentropy = 0.6666\nsamples = 23\nvalue = [4, 19]"] ;
20 -> 21 ;
22 [label="entropy = 0.8905\nsamples = 13\nvalue = [4, 9]"] ;
21 -> 22 ;
23 [label="entropy = 0.0\nsamples = 10\nvalue = [0, 10]"] ;
21 -> 23 ;
24 [label="entropy = 1.0\nsamples = 14\nvalue = [7, 7]"] ;
20 -> 24 ;
25 [label="entropy = 0.0\nsamples = 5\nvalue = [0, 5]"] ;
19 -> 25 ;
26 [label="entropy = 0.0\nsamples = 1\nvalue = [1, 0]"] ;
18 -> 26 ;
}

