2014-10-14 18:59 GMT-04:00 M Asad <masad....@gmail.com>:
> I am not sure if there is already a method to get this, but I have read the
> docs and there doesn't seem to be one. Please correct me if I am wrong.
>
> Actually I am trying to get the probability distribution at each leaf node,
> as done in the book "Decision Forests for Computer Vision and Medical Image
> Analysis", for which I need the samples that ended up at each leaf node
> during training. Then I will use kernel density estimation to get a
> continuous probability distribution at each leaf node. I have done this in
> my own implementation in C++/OpenCV; when using scikit-learn, all I need are
> those particular samples at the leaf node.
>
> For prediction, I have used apply() to get the index of the predicted leaf.
> forestReg.estimators_[i].tree_.value[j] returns only one prediction value,
> however if I call forestReg.estimators_[i].tree_.n_node_samples[j] I get a
> number of samples greater than min_samples_leaf (which I have set to 5 at
> the moment).

It can happen when you reach max_depth, or for regression tasks if all the
samples in the leaf have the exact same target value.

-- 
Olivier
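
For the original question (recovering which training samples ended up in each leaf), one possible approach is to call apply() on the training data itself and group the row indices by the returned leaf id. The following is only a sketch under stated assumptions: group_samples_by_leaf and demo are hypothetical helper names, the data is synthetic, and bootstrap=False is set so that every tree is fit on all of X. It also assumes a scikit-learn version in which the individual tree estimators expose apply().

```python
from collections import defaultdict


def group_samples_by_leaf(leaf_ids):
    """Map each leaf node id to the list of sample indices that landed in it.

    `leaf_ids` is any sequence of integer leaf ids, one per sample, such as
    the array returned by a fitted tree's apply(X).
    """
    groups = defaultdict(list)
    for sample_idx, leaf_id in enumerate(leaf_ids):
        groups[int(leaf_id)].append(sample_idx)
    return dict(groups)


def demo():
    # Assumption: NumPy and scikit-learn are installed; the data is synthetic.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(200, 3)
    y = X[:, 0] + 0.1 * rng.randn(200)

    # bootstrap=False so each tree is trained on every row of X. With
    # bootstrap=True each tree sees a resampled set, so grouping apply(X)
    # over all rows would also include rows that tree never trained on.
    forest = RandomForestRegressor(
        n_estimators=5, min_samples_leaf=5, bootstrap=False, random_state=0
    ).fit(X, y)

    for i, est in enumerate(forest.estimators_):
        leaves = group_samples_by_leaf(est.apply(X))
        for leaf_id, idx in leaves.items():
            # y[idx] holds the training targets that reached this leaf; these
            # are the values one could hand to a kernel density estimator.
            assert len(idx) == est.tree_.n_node_samples[leaf_id]
        print("tree", i, "has", len(leaves), "leaves")
```

Calling demo() prints the number of leaves per tree. A leaf can hold more than min_samples_leaf samples because that parameter is only a lower bound on leaf size.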