The Shalev-Shwartz et al. Pegasos update rule for the learning rate
parameter is

nu_t = 1 / (lambda * t)

where lambda is the coefficient on the regularization term.  If this rule
is used, they show you converge to within epsilon of the optimum in
O(1/(lambda * epsilon)) iterations with high probability.  This differs
from SGDClassifier's 'optimal' learning_rate schedule by the extra
1/lambda factor.

In SGDClassifier's terms this corresponds to eta = nu, alpha = lambda, and
eta0 = 1/alpha.  Just to make it explicit, the learning rate would be

eta0 = 1/alpha
eta = eta0 / power(t,power_t)
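
To sanity-check the algebra, here is a tiny standalone snippet (plain
Python, nothing from sklearn needed) showing that this schedule with
eta0 = 1/alpha and power_t = 1 reproduces the Pegasos rate exactly:

alpha = 0.01                           # plays the role of Pegasos' lambda
eta0 = 1.0 / alpha

for t in range(1, 6):
    eta_invscaling = eta0 / t ** 1.0   # eta = eta0 / power(t, power_t)
    eta_pegasos = 1.0 / (alpha * t)    # nu_t = 1 / (lambda * t)
    assert abs(eta_invscaling - eta_pegasos) < 1e-12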

I can effectively apply Pegasos in a single SGDClassifier training run by
setting the following options, having chosen a value of alpha_this_step
(a full call is sketched after the list):

learning_rate='invscaling'
eta0=1/alpha_this_step
alpha=alpha_this_step
power_t=1
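
For concreteness, a minimal sketch of such a run (the toy data is just a
placeholder, and I'm assuming hinge loss since Pegasos solves the SVM
objective):

import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy data just so the snippet runs; substitute your own.
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = (X[:, 0] > 0).astype(int)

alpha_this_step = 0.01
clf = SGDClassifier(loss='hinge', penalty='l2',
                    alpha=alpha_this_step,
                    learning_rate='invscaling',
                    eta0=1.0 / alpha_this_step,  # eta0 = 1/alpha
                    power_t=1.0)                 # eta = eta0/t = 1/(alpha*t)
clf.fit(X, y)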

The problem is that I want to run GridSearchCV over a grid of alpha
values, applying the above learning rate rule at each grid point.  I can't
tie eta0 to alpha inside a grid search, so I propose the following option:

learning_rate='pegasos'
alpha=alpha_this_step

Under the hood, this would set eta0=1/alpha and power_t=1.
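
With that option, the grid search would read something like the sketch
below (note that learning_rate='pegasos' is the proposed value, not
something scikit-learn accepts today):

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in newer releases

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = (X[:, 0] > 0).astype(int)

# 'pegasos' (proposed) would internally set eta0 = 1/alpha and
# power_t = 1 for each alpha on the grid.
base = SGDClassifier(loss='hinge', learning_rate='pegasos')
search = GridSearchCV(base, param_grid={'alpha': [1e-4, 1e-3, 1e-2, 1e-1]})
search.fit(X, y)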

If this functionality already exists (grid search over SGD classification
using the Pegasos learning rate update at each grid point), I'd appreciate
it if you'd point it out to me.  Otherwise I'll submit a pull request.

Thanks!

Will