Hi again, I am currently revisiting this problem after familiarizing myself with Cython and Scikit-Learn's code and I have a very important query:
Looking at the class MSE(RegressionCriterion), the node impurity is defined as the variance of the target values Y on that node. The predictions are nowhere involved in the computations. This contradicts my notion of a "loss function", which quantifies the discrepancy between predicted and target values. Am I looking at the wrong class, or is what I want to do just not feasible with Random Forests? For example, I would like to modify the RandomForestRegressor code to maximize the Pearson's R between predicted and target values.

I thank you in advance for any clarification.

Thomas

>> On 02/15/2018 01:28 PM, Guillaume Lemaitre wrote:
>>
>> Yes you are right pxd are the header and pyx the definition. You need to
>> write a class as MSE. Criterion is an abstract class or base class (I don't
>> have it under the eye)
>>
>> @Andy: if I recall the PR, we made the classes public to enable such
>> custom criterion. However, it is not documented since we were not
>> officially supporting it. So this is an hidden feature. We could always
>> discuss to make this feature more visible and document it.

--
======================================================================
Dr Thomas Evangelidis
Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049, 62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr
       teva...@gmail.com

website: https://sites.google.com/site/thomasevangelidishomepage/
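
P.S. To make the question concrete, here is a minimal NumPy sketch of the two quantities I am comparing. It is not scikit-learn's Cython code; the function names and the toy data are my own and purely hypothetical. One thing I noticed while writing it: the variance of Y in a node is the same number as the mean squared error you get if the node predicts the mean of Y, so perhaps the predictions are involved implicitly after all. The Pearson's R, by contrast, needs the predictions of all leaves at once, which may be why it does not fit the per-node impurity interface so easily.

```python
import numpy as np


def node_variance(y):
    """Node impurity as I read MSE: variance of the targets in the node."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))


def mse_of_mean_prediction(y):
    """MSE between the targets and a constant prediction equal to their mean."""
    y = np.asarray(y, dtype=float)
    pred = np.full_like(y, y.mean())
    return float(np.mean((y - pred) ** 2))


def pearson_r(pred, y):
    """Pearson's R between predicted and target values (what I want to optimize)."""
    return float(np.corrcoef(pred, y)[0, 1])


# Hypothetical toy targets for a single node.
y = np.array([1.0, 2.0, 4.0, 7.0])

# Hypothetical predictions of a two-leaf tree: each leaf predicts its own mean.
pred = np.array([1.5, 1.5, 5.5, 5.5])

impurity = node_variance(y)
mse_const = mse_of_mean_prediction(y)
r = pearson_r(pred, y)

print("variance of Y in the node:", impurity)
print("MSE of the mean prediction:", mse_const)
print("Pearson's R of the two-leaf predictions:", r)
```

If this reading is right, a custom criterion would still have to express the objective as a per-node quantity, and I do not yet see how to do that for Pearson's R.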
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn