I read these two Stack Overflow questions:
http://stackoverflow.com/questions/15869919/random-forest-predict-using-less-estimators
and
http://stackoverflow.com/questions/14192284/random-forests-probability-estimates-scikit-learn-specific

Does that mean that, if I want the probability from a single tree, I can just 
iterate over RandomForestClassifier.estimators_, which holds the individual 
fitted trees, and call predict_proba() on each one?

Given a test like:
# model is a fitted RandomForestClassifier, test_setx is the test feature matrix
for idx, tree in enumerate(model.estimators_):
    proba = tree.predict_proba(test_setx)
    print("idx %d:\n%s" % (idx, proba))

This looks like what I am searching for, but I can't be sure, because the 
probability output for each tree is always either 0 or 1:

idx 0: [1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 
1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, ...

idx 1: [1.0, 1.0, 1.0, 1.0 ...  0.0, 0.0, 0.0, 0.0, 0.0]

idx 2: [1.0, 1.0, 1.0, 1.0 ... 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

idx 3: [1.0, 1.0, 1.0, 1.0, 1.0 ... 1.0, 1.0, 1.0, 0.0]

...

The raw predict_proba output from a single tree looks like this:
idx 0:
[[ 1.  0.]
 [ 1.  0.]
 [ 0.  1.]
 ..., 
 [ 1.  0.]
 [ 1.  0.]
 [ 0.  1.]]
idx 1:
[[ 1.  0.]
 [ 1.  0.]
 [ 1.  0.]
 ..., 
 [ 0.  1.]
 [ 0.  1.]
 [ 0.  1.]]

If this is the probability for each tree, how do I calculate the probability 
for the ensemble?
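
A minimal sketch of one way to combine them, assuming (as the scikit-learn
documentation describes) that the forest's predict_proba is the mean of the
per-tree class probabilities. The synthetic data and the names per_tree and
ensemble_proba are made up for illustration and are not from my real setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for the real training and test sets (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X[:150], y[:150])
test_setx = X[150:]

# Stack the per-tree probabilities: shape (n_trees, n_samples, n_classes).
per_tree = np.array([tree.predict_proba(test_setx)
                     for tree in model.estimators_])

# Average over the tree axis to get one probability row per sample.
ensemble_proba = per_tree.mean(axis=0)

# Compare against what the forest itself reports.
print(np.allclose(ensemble_proba, model.predict_proba(test_setx)))
```

If that assumption holds, the 0/1 values per tree would not be a problem:
with the default settings each tree is grown until its leaves are pure, so a
single tree votes 0 or 1, and the fractional probabilities only appear after
averaging over the trees.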

I appreciate any advice.

Thanks  

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general