I noticed one more thing in the random forest code: the forest averages the class probabilities in the leaves. This is in contrast to Breiman 2001, where, afaik, each tree votes with a hard class decision. As far as I can tell, this is not documented.
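To make the difference concrete, here is a small pure-Python sketch. It is not the sklearn code: the helper names `soft_vote` and `hard_vote` and the toy leaf probabilities are made up for illustration. It only shows that the two combination rules can disagree on the same set of trees.

```python
from collections import Counter


def soft_vote(tree_probas):
    """Average the per-tree class probabilities, then take the argmax."""
    n_trees = len(tree_probas)
    n_classes = len(tree_probas[0])
    mean = [sum(p[c] for p in tree_probas) / n_trees for c in range(n_classes)]
    # Ties go to the lowest class index.
    return max(range(n_classes), key=lambda c: mean[c])


def hard_vote(tree_probas):
    """Each tree casts one vote for its most probable class; majority wins."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in tree_probas]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical leaf probabilities for one sample from three trees (two classes).
leaf_probas = [
    [0.9, 0.1],  # tree 1: very confident in class 0
    [0.4, 0.6],  # tree 2: weakly prefers class 1
    [0.4, 0.6],  # tree 3: weakly prefers class 1
]

soft = soft_vote(leaf_probas)
hard = hard_vote(leaf_probas)
print(soft, hard)
```

In this toy case the single confident tree outweighs the two weak ones under averaging, so averaging picks class 0 while the majority vote picks class 1.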
Has anyone tried both methods? And @glouppe: why did you choose averaged probabilities over hard voting? Should hard voting also be implemented in sklearn, or would that overcomplicate things?

Cheers,
Andy
