Re: [scikit-learn] How is linear regression in scikit-learn done? Do you need train and test split?

2019-06-01 Thread Joel Nothman
You're right that you don't need to use CV for hyperparameter estimation in linear regression, but you may want it for model evaluation. As far as I understand: holding out a test set is recommended if you aren't entirely sure that the assumptions of the model hold (Gaussian error on a linear
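
To make the "CV for model evaluation" point concrete, here is a minimal sketch. The synthetic data from `make_regression`, the 5-fold setting, and the R^2 scoring are all assumptions chosen for illustration; nothing is being tuned here, the folds are only used to estimate out-of-sample performance.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption for illustration, not from the thread).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Plain OLS has no hyperparameter to tune, so CV here only estimates
# how well the fitted model generalizes to held-out folds.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")

print("R^2 per fold:", scores)
print("mean R^2:", scores.mean())
```

If the linear-model assumptions are badly violated, the held-out scores will typically be noticeably worse than the in-sample fit suggests.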

2019-06-01 Thread C W
Hi Nicolas, I don't get it. The coefficients are estimated through OLS. Essentially, you are just calculating a matrix pseudo-inverse, where beta = (X^T * X)^(-1) * X^T * y. Splitting the data does not improve the model. It only helps in something like LASSO, where you have a tuning parameter.
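
The closed form above can be checked numerically. This is a sketch under assumptions: the data are synthetic, the names `X1`, `beta_normal`, and `beta_pinv` are made up for illustration, and a column of ones is added by hand so the formula also covers the intercept that `LinearRegression` fits by default. It only shows that the estimates agree on well-conditioned data, not how scikit-learn computes them internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, well-conditioned data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.7 + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so the first coefficient is the intercept.
X1 = np.column_stack([np.ones(len(X)), X])

# beta = (X^T X)^(-1) X^T y, solved without forming the inverse explicitly.
beta_normal = np.linalg.solve(X1.T @ X1, X1.T @ y)

# Same estimate through the Moore-Penrose pseudo-inverse.
beta_pinv = np.linalg.pinv(X1) @ y

# scikit-learn's estimate, arranged as [intercept, coefficients...].
model = LinearRegression().fit(X, y)
beta_sklearn = np.concatenate([[model.intercept_], model.coef_])

print(np.allclose(beta_normal, beta_sklearn))
print(np.allclose(beta_pinv, beta_sklearn))
```

Note that the coefficients are the same whether or not you later hold out data; a split changes what you can say about the fit, not how the fit is computed.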

2019-06-01 Thread Nicolas Hug
Splitting the data into training and test sets is needed with any machine learning model (not just linear regression, with or without least squares). The idea is that you want to evaluate the performance of your model (prediction + scoring) on a portion of the data that you did not use for training
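
A minimal sketch of that workflow, assuming synthetic data from `make_regression` and a 75/25 split (both picked only for illustration): fit on the training portion, then score on the portion the model never saw.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration, not from the thread).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Hold out 25% of the rows; the model never sees them during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# score() returns R^2; the test value is the estimate of out-of-sample performance.
r2_train = model.score(X_train, y_train)
r2_test = model.score(X_test, y_test)

print("train R^2:", r2_train)
print("test  R^2:", r2_test)
```

The split does not change how the coefficients are estimated; it only gives you an evaluation on data that played no part in the fit.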