In the simplest case of simple linear regression, what you wrote holds true: the total variance is simply the sum of the variance explained by the model and the residual variability that cannot be explained, so R2 always lies between 0 and 1. See e.g. here: https://online.stat.psu.edu/stat500/lesson/9/9.3
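To make that concrete, here is a minimal sketch (toy data and variable names are mine, not from this thread) of the in-sample decomposition for OLS with an intercept: SS_tot = SS_reg + SS_res, so R2 = SS_reg / SS_tot = 1 - SS_res / SS_tot stays between 0 and 1.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * X.ravel() + rng.normal(scale=2.0, size=50)

# Fit OLS and predict on the same (training) data.
model = LinearRegression().fit(X, y)
y_hat = model.predict(X)

ss_tot = np.sum((y - y.mean()) ** 2)      # total variation in y
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the model
ss_res = np.sum((y - y_hat) ** 2)         # unexplained (residual) variation

print(ss_tot, ss_reg + ss_res)               # equal up to rounding for in-sample OLS
print(ss_reg / ss_tot, 1 - ss_res / ss_tot)  # both give R2, between 0 and 1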
However, this would be quite hard to do for more complex models (even for a
multivariate linear regression), hence the need for a more general definition
like here: https://en.wikipedia.org/wiki/Coefficient_of_determination or here:
https://www.investopedia.com/terms/r/r-squared.asp. I can easily envision a
situation where the data has enough outliers (i.e. the data is not clean
enough to be used in modeling) that the fitted model performs worse than a
baseline model that simply takes the average as the prediction for each
observation; the short sketch at the bottom of this message shows that case.

Cheers,
-Tom

On Thu, Aug 12, 2021 at 12:19 PM Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>
> Dear Christophe Pallier, Reshama Saikh and Tromek Drabas,
> Thank you for your kind response. Fair enough. I agree with you that R2 is
> not a square. However, if you open any book on econometrics, it says R2 is
> a ratio that lies between 0 and 1. *This is the constraint.* It measures
> the proportion or percentage of the total variation in the response
> variable (Y) explained by the regressors (Xs) in the model. The remaining
> proportion of variation in Y, if any, is explained by the residual term (u).
> Now, sklearn.metrics.r2_score gives me a negative value on a linear scale
> (-5.763335245921777). This negative value breaks the *constraint*. I just
> want to highlight that. I think it needs to be corrected. The rest is up
> to you.
>
> I find that Reshama Saikh was hurt by my email. I am really sorry for
> that. Please note that I never meant to undermine your capabilities and
> initiatives. You are great people doing great work. I realise that I
> should have been more sensible.
>
> My regards to all of you.
>
> Samir K Mahajan
>
>
> On Thu, Aug 12, 2021 at 12:02 PM Christophe Pallier <christo...@pallier.org> wrote:
>
>> Simple: despite its name, R2 is not a square. Look up its definition.
>>
>> On Wed, 11 Aug 2021, 21:17 Samir K Mahajan, <samirkmahajan1...@gmail.com> wrote:
>>
>>> Dear All,
>>> I am amazed to find negative values of sklearn.metrics.r2_score and
>>> sklearn.metrics.explained_variance_score in a model (cross-validation
>>> of an OLS regression model).
>>> However, what amuses me more is seeing you justify a negative
>>> sklearn.metrics.r2_score in your documentation. This does not make
>>> sense to me. Please justify to me how squared values are negative.
>>>
>>> Regards,
>>> Samir K Mahajan.
>>>
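PS: a minimal sketch (toy numbers are mine) of the point above: with the general definition R2 = 1 - SS_res / SS_tot, sklearn.metrics.r2_score returns 0.0 for the "always predict the mean of y" baseline and goes negative for any predictions that do worse than that baseline.

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Predicting the mean everywhere gives R2 = 0.0 (the baseline).
print(r2_score(y_true, np.full_like(y_true, y_true.mean())))

# Predictions worse than the mean baseline give a negative R2 (here -3.0).
print(r2_score(y_true, np.array([4.0, 3.0, 2.0, 1.0])))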
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn