Hi James,

For a given value of l1_ratio, the grid of alphas is chosen on a log scale, going from alpha_max down to alpha_max * eps (where eps is the ratio alpha_min / alpha_max, 1e-3 by default). Any value of alpha larger than alpha_max will lead to a coef_ full of zeros.
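For reference, here is a small sketch of how that grid can be computed by hand. It mirrors the formula scikit-learn uses internally (alpha_max = max|X^T y| / (n_samples * l1_ratio)); exact details may differ between versions, so treat it as an illustration rather than the library's code:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.RandomState(0)
X = rng.randn(50, 10)
y = rng.randn(50)

# Center the data so the formula matches a model fit without intercept
X = X - X.mean(axis=0)
y = y - y.mean()

n_samples = X.shape[0]
l1_ratio = 0.5
eps = 1e-3        # ratio alpha_min / alpha_max
n_alphas = 100

# Smallest alpha for which all coefficients are exactly zero
alpha_max = np.max(np.abs(X.T @ y)) / (n_samples * l1_ratio)

# Log-scale grid from alpha_max down to alpha_max * eps
alphas = np.logspace(np.log10(alpha_max), np.log10(alpha_max * eps), n_alphas)

# Any alpha above alpha_max yields an all-zero coef_
model = ElasticNet(alpha=2 * alpha_max, l1_ratio=l1_ratio,
                   fit_intercept=False).fit(X, y)
print(np.all(model.coef_ == 0))  # True
```

Note that alpha_max itself depends on l1_ratio, which is why the grid is recomputed per l1_ratio value.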
HTH,
Alex

On Fri, Oct 11, 2013 at 9:39 PM, James Jensen <jdjen...@eng.ucsd.edu> wrote:
> How is the default grid of alphas and L1 ratios chosen for scikit-learn's
> enet_cv, and what is the reasoning behind it? What other approaches exist
> for choosing this parameter grid, and what are they based on?
>
> I'm using elastic net to calculate regularized canonical correlation. Given
> data matrices X and Y, I find coefficient vectors a and b that maximize the
> correlation between Xa and Yb. This can be done by iteratively regressing X
> on Yb (to estimate a) and then Y on Xa (to estimate b), repeating these two
> regressions until convergence.
>
> This iterative approach means that I have to do the model selection a level
> up from the regression (i.e. I can't use enet_cv or the like directly). I
> know I can choose from grids of parameters by cross-validation or
> permutation, but I am unsure how to intelligently choose the sets of alpha
> and L1-ratio parameters to try. And since the parameters can be different
> for the two regressions, the number of parameter combinations grows
> quadratically, so I need to choose the grid carefully.
>
> Some ideas I've had:
>
> Perhaps the ratio of samples to features can rule out certain
> regularization parameter values, i.e. if there are many more samples than
> features, too weak a regularization would be inappropriate. Has this been
> formalized mathematically? Wouldn't it depend on how strong the signal is,
> too?
>
> If the solution with a particular regularization strength is a vector of
> zeros (i.e. the regularization was too strong), then I can discard all
> stronger regularization parameters. This is obvious with only an L1
> penalty: if alpha=0.1 is too strong, then alpha=0.5 will definitely also be
> too strong. I wonder about this in the case of elastic net. That is, if
> (alpha=0.1, l1_ratio=0.5) is too strong, does that mean (alpha=0.1,
> l1_ratio=0.9) will necessarily be too strong?
> And perhaps I could start with a coarse grid and then try again with more
> detail in a promising section of it. Any ideas on the best way of doing
> this?