Hi again,

I am revisiting this problem after familiarizing myself with Cython and
scikit-learn's code, and I have an important question:

Looking at the class MSE(RegressionCriterion), the node impurity is defined
as the variance of the target values Y in that node. The predictions are
not involved anywhere in the computations. This contradicts my notion of a
"loss function", which quantifies the discrepancy between predicted and
target values. Am I looking at the wrong class, or is what I want to do
simply not feasible with Random Forests? For example, I would like to modify
the RandomForestRegressor code to minimize Pearson's R between predicted
and target values.
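
To make the question concrete: as far as I understand, a regression tree
leaf predicts the mean of the targets that fall into it, so the variance of
the targets in a node is the same number as the mean squared error of that
constant prediction. Below is a minimal sketch of that identity using only
the Python standard library. It is not scikit-learn's Cython code; the names
node_mse and y_node and the sample values are hypothetical, chosen only for
illustration.

```python
from statistics import fmean, pvariance


def node_mse(y, prediction):
    """Mean squared error of predicting one constant for every sample in a node."""
    return fmean((yi - prediction) ** 2 for yi in y)


# Hypothetical target values of the samples that fall into one node.
y_node = [1.0, 2.0, 4.0, 7.0]

# Assumption: the leaf predicts the mean of its targets.
leaf_prediction = fmean(y_node)

# Population variance of the targets in the node (the "impurity").
impurity = pvariance(y_node)

# MSE of the constant leaf prediction against the same targets (the "loss").
loss = node_mse(y_node, leaf_prediction)

print(impurity, loss)
```

If this reading is right, the prediction is implicit in the impurity rather
than absent from it, and any other constant prediction gives a larger MSE on
the same node.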

I thank you in advance for any clarification.
Thomas



>
>> On 02/15/2018 01:28 PM, Guillaume Lemaitre wrote:
>>
>> Yes, you are right: the pxd files are the headers and the pyx files the
>> definitions. You need to write a class like MSE. Criterion is an abstract
>> class or base class (I don't have the code in front of me).
>>
>> @Andy: if I recall the PR correctly, we made the classes public to enable
>> such custom criteria. However, it is not documented since we were not
>> officially supporting it. So this is a hidden feature. We could always
>> discuss making this feature more visible and documenting it.
>>
>>
>>
>


-- 

======================================================================

Dr Thomas Evangelidis

Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049,
62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr

          teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
