Indeed, this is basically what I told you (you do not need to copy textbook material; I have taught probability and statistics): these are mostly problems for *inference*.
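To make the inference/prediction distinction concrete for multicollinearity, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (the simulated data, the helper names `simulate` and `r2`, the sample sizes, the noise levels); it is not the model discussed in this thread, and it shows one simulated case rather than a general proof. It refits OLS on many small training samples with two nearly identical predictors and records how much one coefficient moves versus how the held-out R2 behaves.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def simulate(n):
    """Two almost identical predictors; assumed true model: y = x1 + x2 + noise."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)        # nearly collinear with x1
    y = x1 + x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])  # intercept column first
    return X, y


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


X_test, y_test = simulate(1000)  # one fixed held-out set

coefs, scores = [], []
for _ in range(200):
    X_train, y_train = simulate(100)
    # Ordinary least squares fit on this training sample
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    coefs.append(beta[1])                      # coefficient of x1
    scores.append(r2(y_test, X_test @ beta))   # held-out R2

coef_std = float(np.std(coefs))
min_score = float(np.min(scores))
print(f"std of the x1 coefficient across fits: {coef_std:.2f}")
print(f"lowest held-out R2 across fits:        {min_score:.2f}")
```

In this setup the individual coefficient is poorly determined (so tests on it are unreliable), while the fitted combination of the two predictors, and therefore the prediction, stays stable.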
On Fri, 13 Aug 2021, 12:03 Samir K Mahajan, <samirkmahajan1...@gmail.com> wrote:

> Dear Christophe Pallier,
>
> When we are doing prediction, we are relying on the values of the
> coefficients of the model created. We are feeding test data to the model
> for prediction. We may be interested to see if the OLS
> estimators (coefficients) are BLUE or not. In the presence of
> autocorrelation (normally noticed in time series data), residuals are not
> independent, and as such the OLS estimators are not BLUE in the sense that
> they don't have minimum variance, and thus are no longer efficient estimators.
> Statistical tests (t, F and χ²) may not be valid. We may reject the
> model to make predictions in such a situation. We have to rely upon
> other improved models. There may be issues relating to multicollinearity
> (in case of a multivariable regression model) and heteroscedasticity (mostly
> seen in cross-section data) too in a model. Can we discard these tools
> while predicting with a model?
>
> Regards,
>
> Samir K Mahajan
>
> On Fri, Aug 13, 2021 at 1:07 PM Christophe Pallier <christo...@pallier.org>
> wrote:
>
>> Actually, multicollinearity and autocorrelation are problems for
>> *inference* more than for *prediction*. For example, if there is
>> autocorrelation, the residuals are not independent, and the degrees of
>> freedom are wrong for the tests in an OLS model (but you can use, e.g., an
>> AR1 model).
>>
>> On Thu, 12 Aug 2021, 22:32 Samir K Mahajan, <samirkmahajan1...@gmail.com>
>> wrote:
>>
>>> A note please (to Sebastian Raschka, mrschots).
>>>
>>> The OLS model that I used (where the test score gave me a negative
>>> value) was not a good fit. Initial findings showed that *the
>>> regression coefficients and the model as a whole were significant*, yet
>>> finally it failed two econometric tests: VIF (used for
>>> detecting multicollinearity) and the Durbin-Watson test (used for detecting
>>> autocorrelation).
>>> *The presence of multicollinearity and autocorrelation
>>> problems* in the model makes it unsuitable for prediction.
>>> Regards,
>>>
>>> Samir K Mahajan.
>>>
>>> On Fri, Aug 13, 2021 at 1:41 AM Samir K Mahajan <
>>> samirkmahajan1...@gmail.com> wrote:
>>>
>>>> Thanks to all of you for your kind response. Indeed, it is a
>>>> great learning experience. Yes, econometrics books too create models for
>>>> prediction, and programming really makes things better in a complex
>>>> world. My understanding is that machine learning does depend on
>>>> econometrics too.
>>>>
>>>> My Regards,
>>>>
>>>> Samir K Mahajan
>>>>
>>>> On Fri, Aug 13, 2021 at 1:21 AM Sebastian Raschka <
>>>> m...@sebastianraschka.com> wrote:
>>>>
>>>>> The R2 function in scikit-learn works fine. A negative value means that the
>>>>> regression model fits the data worse than a horizontal line representing
>>>>> the sample mean. E.g. you usually get that if you are overfitting the
>>>>> training set a lot and then apply that model to the test set. The
>>>>> econometrics book probably didn't cover applying a model to an independent
>>>>> data or test set, hence the [0, 1] suggestion.
>>>>>
>>>>> Cheers,
>>>>> Sebastian
>>>>>
>>>>> On Aug 12, 2021, 2:20 PM -0500, Samir K Mahajan <
>>>>> samirkmahajan1...@gmail.com>, wrote:
>>>>>
>>>>> Dear Christophe Pallier, Reshama Saikh and Tromek Drabas,
>>>>> Thank you for your kind response. Fair enough. I agree with you that R2 is
>>>>> not a square. However, if you open any book of econometrics, it says R2
>>>>> is a ratio that lies between 0 and 1. *This is the constraint.* It
>>>>> measures the proportion or percentage of the total variation in the response
>>>>> variable (Y) explained by the regressors (Xs) in the model. The remaining
>>>>> proportion of variation in Y, if any, is explained by the residual
>>>>> term (u).
>>>>> Now, sklearn.metrics.r2_score gives me a negative value lying on a
>>>>> linear scale (-5.763335245921777).
>>>>> This negative value breaks the
>>>>> *constraint.* I just want to highlight that. I think it needs to be
>>>>> corrected. The rest is up to you.
>>>>>
>>>>> I find that Reshama Saikh is hurt by my email. I am really sorry for
>>>>> that. Please note I never undermine your capabilities and initiatives.
>>>>> You are great people doing great jobs. I realise that I should have been more
>>>>> sensible.
>>>>>
>>>>> My regards to all of you.
>>>>>
>>>>> Samir K Mahajan
>>>>>
>>>>> On Thu, Aug 12, 2021 at 12:02 PM Christophe Pallier <
>>>>> christo...@pallier.org> wrote:
>>>>>
>>>>>> Simple: despite its name R2 is not a square. Look up its definition.
>>>>>>
>>>>>> On Wed, 11 Aug 2021, 21:17 Samir K Mahajan, <
>>>>>> samirkmahajan1...@gmail.com> wrote:
>>>>>>
>>>>>>> Dear All,
>>>>>>> I am amazed to find negative values of sklearn.metrics.r2_score
>>>>>>> and sklearn.metrics.explained_variance_score in a model (cross
>>>>>>> validation of an OLS regression model).
>>>>>>> However, what amuses me more is seeing you justifying negative
>>>>>>> 'sklearn.metrics.r2_score' in your documentation. This does not
>>>>>>> make sense to me. Please justify to me how squared values are negative.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Samir K Mahajan.
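
On the question at the root of this thread (how R2 can be negative), a small worked sketch may help. It assumes the usual definition R2 = 1 - SS_res / SS_tot, which is the one documented for sklearn.metrics.r2_score; the function name `r2` and the toy numbers below are made up for illustration. Nothing in that formula is a square of a single quantity, so the value drops below 0 as soon as the predictions are worse than always predicting the mean of the evaluated data.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


y_test = [1.0, 2.0, 3.0, 4.0]  # toy held-out targets (hypothetical)

perfect = r2(y_test, [1.0, 2.0, 3.0, 4.0])   # predictions match exactly
baseline = r2(y_test, [2.5, 2.5, 2.5, 2.5])  # always predict the mean of y_test
bad = r2(y_test, [4.0, 3.0, 2.0, 1.0])       # predictions go the wrong way

print(perfect, baseline, bad)  # prints: 1.0 0.0 -3.0
```

The [0, 1] range quoted from econometrics textbooks holds for OLS with an intercept evaluated on the same data it was fitted on; on held-out or cross-validation data no such bound applies.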
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn