Thank you both for the paper references.

@ Andreas,
What is your take? And what are you implying?

The Breiman (2001) paper contrasts the black-box approach with the
statistical approach. I call them black box vs. open box. He advocates the
black-box approach in the paper.
Black box:
y <--- nature <--- x

Open box:
y <--- linear regression <---- x

Decision trees and neural nets are black-box models. They require large
amounts of data to train, and they skip the part where you try to
understand nature.

Because it is a black box, you can't open it up to see what's inside. Linear
regression is a very simple model that you can use to approximate nature,
but the key thing is that you need to know how the data are generated.
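
A minimal sketch of the open-box idea, assuming synthetic data whose
generating process is known to be linear (the coefficients, noise level,
and tree depth here are made up for illustration). When the assumed model
matches how the data were generated, the fitted coefficients can be read
directly; the tree fits the same data but exposes split structure rather
than two slopes and an intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Hypothetical "nature": y = 3*x1 - 1*x2 + 2 + small Gaussian noise.
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(500, 2))
true_coef = np.array([3.0, -1.0])
true_intercept = 2.0
y = X @ true_coef + true_intercept + rng.normal(scale=0.1, size=500)

# Open box: the fitted parameters approximate the generating process.
linear = LinearRegression().fit(X, y)
print("coefficients:", linear.coef_)
print("intercept:   ", linear.intercept_)

# The tree is fit to the same data; it is described by its nodes and
# split thresholds, not by a coefficient per feature.
tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)
print("tree nodes:  ", tree.tree_.node_count)
```

If the true relationship were not linear, the coefficients above would
still be printed, but they would no longer describe how the data were
generated.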

@ Brown,
I know nothing about molecular modeling. The paper you linked, "Beware of
q2!", raises some interesting points. As far as I can see, in sklearn's
linear regression, score is R^2.
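
A quick check of that, as a sketch on made-up synthetic data (the
coefficients and the 150/50 split are arbitrary assumptions):
LinearRegression.score should agree with sklearn.metrics.r2_score, and it
can be computed on held-out data as well as on the training data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic regression problem, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Simple split: first 150 rows for fitting, last 50 held out.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

model = LinearRegression().fit(X_train, y_train)

# score() returns the coefficient of determination R^2 on whatever
# data it is given.
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)

# The same number computed explicitly from predictions.
manual_test_r2 = r2_score(y_test, model.predict(X_test))

print("train R^2:", train_r2)
print("test R^2: ", test_r2)
print("r2_score: ", manual_test_r2)
```

The training R^2 is the in-sample fit; the held-out R^2 is the one that
says something about prediction on new data.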

On Wed, Jun 5, 2019 at 9:11 AM Andreas Mueller <t3k...@gmail.com> wrote:

>
> On 6/4/19 8:44 PM, C W wrote:
> > Thank you all for the replies.
> >
> > I agree that prediction accuracy is great for evaluating black-box ML
> > models. Especially advanced models like neural networks, or
> > not-so-black models like LASSO, because they are NP-hard to solve.
> >
> > Linear regression is not a black-box. I view prediction accuracy as an
> > overkill on interpretable models. Especially when you can use
> > R-squared, coefficient significance, etc.
> >
> > Prediction accuracy also does not tell you which feature is important.
> >
> > What do you guys think? Thank you!
> >
> Did you read the paper that I sent? ;)
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>