I've been playing around with Lasso and Lars, but there's something that
bothers me about standardization.

If I don't standardize the features (zero mean, unit variance), these
procedures indicate that a certain set of variables is the most important.
Yet if I standardize, I get a completely different set of variables. As
expected, the lars or lasso paths plotted over varying alpha look very
different. I know there's a good reason for this, but then what's the right
way to identify the important variables from a large set?
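
To make the effect concrete, here is a minimal sketch on made-up data (not
my real data). The feature scales, the coefficients and alpha=0.5 are all
arbitrary assumptions chosen only to show the selected set changing:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): six features on very different scales.
rng = np.random.RandomState(0)
n_samples = 200
scales = np.array([100.0, 10.0, 1.0, 1.0, 0.1, 0.01])
X = rng.randn(n_samples, scales.size) * scales

# Only features 2, 4 and 5 drive y; each contributes about equally.
true_coef = np.array([0.0, 0.0, 1.0, 0.0, 10.0, 100.0])
y = X @ true_coef + 0.1 * rng.randn(n_samples)


def selected(features, target, alpha=0.5):
    """Return the indices of the non-zero Lasso coefficients."""
    model = Lasso(alpha=alpha).fit(features, target)
    return {int(j) for j in np.flatnonzero(model.coef_)}


raw_set = selected(X, y)
std_set = selected(StandardScaler().fit_transform(X), y)

print("selected without standardization:", sorted(raw_set))
print("selected with standardization:   ", sorted(std_set))
```

Here the small-scale features need large coefficients, so the L1 penalty
drops them on the raw data, while after standardization all three
informative features are penalized equally.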

I could compare prediction quality on test data, but there's still a
conflict if the important variables are so different with and without
standardization.
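
The test-data comparison I have in mind would look roughly like this. It is
only a sketch on made-up data; the scales, coefficients, split and cv=5 are
assumptions, and LassoCV is just one way to pick alpha:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): features on very different scales.
rng = np.random.RandomState(0)
scales = np.array([100.0, 10.0, 1.0, 1.0, 0.1, 0.01])
X = rng.randn(300, scales.size) * scales
y = X @ np.array([0.0, 0.0, 1.0, 0.0, 10.0, 100.0]) + 0.1 * rng.randn(300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Lasso on the raw features, alpha chosen by cross-validation.
raw_model = LassoCV(cv=5).fit(X_train, y_train)

# Same model, but the scaler is fit on the training data only.
std_model = make_pipeline(StandardScaler(), LassoCV(cv=5)).fit(X_train, y_train)

raw_r2 = raw_model.score(X_test, y_test)
std_r2 = std_model.score(X_test, y_test)

print("held-out R^2 without standardization:", raw_r2)
print("held-out R^2 with standardization:   ", std_r2)
```

Putting the scaler inside the pipeline keeps the test data out of the
scaling step, so the two scores are comparable.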

Any help or pointers are appreciated.

Best Regards.