Hi Samir, the following visualization might be useful for gaining intuition on the meaning of a negative r2: https://gist.github.com/WittmannF/02060b45ce3ec9239898a5b91df2564e
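The same point can also be reproduced in a few lines (a minimal sketch; the toy arrays below are mine, not from the gist):

```python
import numpy as np
from sklearn.metrics import r2_score

# True values follow an increasing trend.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A model predicting the opposite (decreasing) trend.
y_pred_opposite = np.array([5.0, 4.0, 3.0, 2.0, 1.0])

# Predicting the sample mean everywhere is the R^2 = 0 baseline.
y_pred_mean = np.full_like(y_true, y_true.mean())

print(r2_score(y_true, y_pred_opposite))  # -3.0: far worse than the mean
print(r2_score(y_true, y_pred_mean))      # 0.0
```

Anything below 0 simply means "worse than always predicting the sample mean"; an opposite-trend prediction is a clear-cut case.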
A negative r2 arises when the model fits worse than the mean, for example when it predicts the opposite trend of the data.

On Sat, Aug 14, 2021, 03:17 Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:

> Dear Christophe,
> I think you are oversimplifying by saying econometrics tools are for
> inference. Forecasting and prediction are integral parts of econometric
> analysis. Econometricians forecast by inferring the right conclusions
> about the model. I wish to convey to you that I teach both statistics
> and econometrics, and am now learning ML. There are fundamental
> differences among statistics, econometrics and machine learning.
> Regards,
>
> Samir K Mahajan
>
> On Fri, Aug 13, 2021 at 3:39 PM Christophe Pallier <christo...@pallier.org> wrote:
>
>> Indeed, this is basically what I told you (you do not need to copy
>> textbook stuff: I taught probas/stats): these are mostly problems for
>> *inference*.
>>
>> On Fri, 13 Aug 2021, 12:03 Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>>
>>> Dear Christophe Pallier,
>>>
>>> When we are doing prediction, we are relying on the values of the
>>> coefficients of the model created. We are feeding test data to the
>>> model for prediction. We may be interested to see whether the OLS
>>> estimators (coefficients) are BLUE or not. In the presence of
>>> autocorrelation (normally noticed in time-series data), residuals are
>>> not independent, and as such the OLS estimators are not BLUE in the
>>> sense that they do not have minimum variance, and thus are no longer
>>> efficient estimators. Statistical tests (t, F and χ2) may not be
>>> valid. We may reject the model for making predictions in such a
>>> situation and have to rely upon other, improved models. There may
>>> also be issues relating to multicollinearity (in the case of a
>>> multivariable regression model) and heteroscedasticity (mostly seen
>>> in cross-section data) in a model. Can we discard these tools while
>>> evaluating a model for prediction?
>>>
>>> Regards,
>>>
>>> Samir K Mahajan
>>>
>>> On Fri, Aug 13, 2021 at 1:07 PM Christophe Pallier <christo...@pallier.org> wrote:
>>>
>>>> Actually, multicollinearity and autocorrelation are problems for
>>>> *inference* more than for *prediction*. For example, if there is
>>>> autocorrelation, the residuals are not independent, and the degrees
>>>> of freedom are wrong for the tests in an OLS model (but you can use,
>>>> e.g., an AR1 model).
>>>>
>>>> On Thu, 12 Aug 2021, 22:32 Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>>>>
>>>>> A note please (to Sebastian Raschka, mrschots).
>>>>>
>>>>> The OLS model that I used (where the test score gave me a negative
>>>>> value) was not a good fit. Initial findings showed that the
>>>>> regression coefficients and the model as a whole were significant,
>>>>> yet, finally, it failed two econometrics tests: VIF (used for
>>>>> detecting multicollinearity) and the Durbin-Watson test (used for
>>>>> detecting autocorrelation). The presence of multicollinearity and
>>>>> autocorrelation problems in the model makes it unsuitable for
>>>>> prediction.
>>>>> Regards,
>>>>>
>>>>> Samir K Mahajan.
>>>>>
>>>>> On Fri, Aug 13, 2021 at 1:41 AM Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>>>>>
>>>>>> Thanks to all of you for your kind response. Indeed, it is a great
>>>>>> learning experience. Yes, econometrics books too create models for
>>>>>> prediction, and programming really makes things better in a
>>>>>> complex world. My understanding is that machine learning does
>>>>>> depend on econometrics too.
>>>>>>
>>>>>> My Regards,
>>>>>>
>>>>>> Samir K Mahajan
>>>>>>
>>>>>> On Fri, Aug 13, 2021 at 1:21 AM Sebastian Raschka <m...@sebastianraschka.com> wrote:
>>>>>>
>>>>>>> The R2 function in scikit-learn works fine. A negative value
>>>>>>> means that the regression model fits the data worse than a
>>>>>>> horizontal line representing the sample mean, e.g.
>>>>>>> you usually get that if you are overfitting the training set a
>>>>>>> lot and then apply that model to the test set. The econometrics
>>>>>>> book probably didn't cover applying a model to an independent
>>>>>>> data or test set, hence the [0, 1] suggestion.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Sebastian
>>>>>>>
>>>>>>> On Aug 12, 2021, 2:20 PM -0500, Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>>>>>>>
>>>>>>> Dear Christophe Pallier, Reshama Saikh and Tromek Drabas,
>>>>>>> Thank you for your kind response. Fair enough, I go with you: R2
>>>>>>> is not a square. However, if you open any book of econometrics,
>>>>>>> it says R2 is a ratio that lies between 0 and 1. This is the
>>>>>>> constraint. It measures the proportion or percentage of the total
>>>>>>> variation in the response variable (Y) explained by the
>>>>>>> regressors (Xs) in the model. The remaining proportion of
>>>>>>> variation in Y, if any, is explained by the residual term (u).
>>>>>>> Now, sklearn.metrics.r2_score gives me a negative value lying on
>>>>>>> a linear scale (-5.763335245921777). This negative value breaks
>>>>>>> the constraint. I just want to highlight that. I think it needs
>>>>>>> to be corrected. The rest is up to you.
>>>>>>>
>>>>>>> I find that Reshama Saikh is hurt by my email. I am really sorry
>>>>>>> for that. Please note I never undermine your capabilities and
>>>>>>> initiatives. You are great people doing great jobs. I realise
>>>>>>> that I should have been more sensible.
>>>>>>>
>>>>>>> My regards to all of you.
>>>>>>>
>>>>>>> Samir K Mahajan
>>>>>>>
>>>>>>> On Thu, Aug 12, 2021 at 12:02 PM Christophe Pallier <christo...@pallier.org> wrote:
>>>>>>>
>>>>>>>> Simple: despite its name, R2 is not a square. Look up its
>>>>>>>> definition.
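Writing that definition out makes the point concrete: R2 = 1 - SS_res/SS_tot is a ratio of two sums of squares, not a square itself, so nothing prevents it from dropping below zero whenever the residual sum of squares exceeds the total sum of squares. A quick check with made-up numbers:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([8.0, 1.0, 1.0, 12.0])  # a deliberately poor fit

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares

r2_manual = 1.0 - ss_res / ss_tot               # negative here: ss_res > ss_tot

print(r2_manual, r2_score(y_true, y_pred))      # the two agree
```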
>>>>>>>>
>>>>>>>> On Wed, 11 Aug 2021, 21:17 Samir K Mahajan <samirkmahajan1...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Dear All,
>>>>>>>>> I am amazed to find negative values of sklearn.metrics.r2_score
>>>>>>>>> and sklearn.metrics.explained_variance_score in a model
>>>>>>>>> (cross-validation of an OLS regression model).
>>>>>>>>> However, what amuses me more is seeing you justify a negative
>>>>>>>>> sklearn.metrics.r2_score in your documentation. This does not
>>>>>>>>> make sense to me. Please explain to me how squared values can
>>>>>>>>> be negative.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Samir K Mahajan.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> scikit-learn mailing list
>>>>>>>>> scikit-learn@python.org
>>>>>>>>> https://mail.python.org/mailman/listinfo/scikit-learn