Hey all - I had a weird issue with my model fitting that I wanted to
share in case I'm doing something stupid.
I'm trying to fit a model with about 32 features x 20 time lags of those
features (so ~650 features total). I have about 12919 samples, all of which
are z-scored within features. Without a doubt, the data I'm dealing with is
quite noisy and there's a fair amount of correlation between features.
When I did a grid search for the best alpha, I noticed that it simply
returned whichever alpha was largest. That is, even if I included alphas up
to ~1e7, those would have the highest score. Looking at the score across
alphas confirmed this: it was basically a monotonically increasing
function of alpha (at least up to the point where I stopped looking).
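
For concreteness, here is a minimal sketch of the kind of grid search I
mean. It runs on synthetic data, not my real data: the sample count, feature
count, noise level, and alpha grid are all made up for illustration, and it
assumes scikit-learn's Ridge scored with the default R^2.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption: much smaller than the real problem).
rng = np.random.RandomState(0)
n_samples, n_features = 500, 40
X = rng.randn(n_samples, n_features)
w_true = rng.randn(n_features)
y = X @ w_true + 5.0 * rng.randn(n_samples)  # deliberately noisy target

# Score each alpha with 5-fold cross-validation (R^2 is Ridge's default score).
alphas = np.logspace(-4, 7, 12)
mean_scores = np.array([
    cross_val_score(Ridge(alpha=alpha), X, y, cv=5).mean()
    for alpha in alphas
])

best_alpha = alphas[np.argmax(mean_scores)]
# True when the search picks the largest alpha on the grid, which is the
# symptom described above.
hits_upper_edge = best_alpha == alphas[-1]

for alpha, score in zip(alphas, mean_scores):
    print(f"alpha={alpha:10.4g}  mean R^2={score: .4f}")
print("best alpha:", best_alpha, "| at upper edge of grid:", hits_upper_edge)
```

Printing the whole score curve, rather than only the winner, is what makes it
obvious when the best alpha is just the edge of the grid.
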
That's a bit weird to me, as I've usually found that the ridge parameter
settles somewhere between 1e-4 and .1.
Has anyone experienced this before, or does anyone have an intuition for why
it is happening? My first thought was that the model is basically a horrible
one, so the grid search just picks whatever gets closest to predicting the
mean. That said, if I use an alternative method like SGD with an l2
regularizer, it chooses alpha values around .01, and the weights look
more like what I'd expect.
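
The SGD comparison can be sketched roughly like this. Again this is
synthetic data with made-up sizes, and it assumes scikit-learn's
SGDRegressor with penalty="l2" searched via GridSearchCV; the solver
settings (max_iter, tol, random_state) and the alpha grid are illustrative
choices, not the ones from my actual run.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption: not the real features or targets).
rng = np.random.RandomState(0)
X = rng.randn(500, 40)
y = X @ rng.randn(40) + 5.0 * rng.randn(500)

# Grid over the l2 penalty strength of the SGD model.
param_grid = {"alpha": np.logspace(-5, 1, 7)}
sgd = SGDRegressor(penalty="l2", max_iter=1000, tol=1e-3, random_state=0)

search = GridSearchCV(sgd, param_grid, cv=5)  # scored with R^2 by default
search.fit(X, y)

sgd_alpha = search.best_params_["alpha"]
print("best SGD alpha:", sgd_alpha)
print("first few weights:", search.best_estimator_.coef_[:5])
```
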
Anyone have thoughts on what is going on?
Chris
--
_____________________________________
PhD Candidate in Neuroscience | UC Berkeley <http://hwni.org/>
Editor and Web Master | Berkeley Science Review
<http://sciencereview.berkeley.edu/>
_____________________________________
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general