Re: [Scikit-learn-general] SGD learning rate heuristic

Andreas Müller Tue, 22 Nov 2011 04:47:57 -0800


----- Original Message -----
From: "Peter Prettenhofer" <[email protected]>
To: [email protected]
Sent: Tuesday, 22 November 2011 13:44:25
Subject: Re: [Scikit-learn-general] SGD learning rate heuristic


2011/11/22 Andreas Müller <[email protected]>:
> Hi Peter.
> Thanks for the quick answer.
>
>
> On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
>> Hi Andy,
>>
>> I adopted the heuristic from Leon Bottou's sgd implementation (version
>> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
>> Schedule". I'm not aware of any paper which describes the rationale in
>> more depth.
>>
>> Here's the quote from the slide: "Choose t_0 to make sure that the
>> expected initial updates are comparable with the expected size of the
>> weights."
>>
>> [1] http://istcolloq.gsfc.nasa.gov/fall2009/presentations/bottou.pdf
>>
> I think I get the general idea.
> I don't know what the expected size of the weights is.
> Do you know what is meant by that?

Not really, but I'll look into that and document it properly.

>>

Thinking about it, typw = sqrt(1.0 / sqrt(alpha)) is probably what he refers
to as the expected size of the weights w. I don't know why it has that form,
but if I accept it, I think I can figure out the rest.
Thanks again!
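
A minimal sketch of how typw could feed into the initial learning rate, to
make the quoted slide concrete. Only the typw formula comes from this thread.
Everything else is an assumption for illustration: the schedule
eta(t) = 1 / (alpha * (t0 + t)), the rule eta0 = typw / max(1, |dloss(-typw, 1)|),
the choice of hinge loss, and all function names are hypothetical and not
taken from the scikit-learn or Bottou sources.

```python
import math


def hinge_dloss_magnitude(p, y):
    """Absolute value of the hinge-loss derivative w.r.t. the prediction p.

    Hinge loss is max(0, 1 - y * p); its derivative is -y when y * p < 1
    and 0 otherwise. (Hinge loss is an assumption made for this example.)
    """
    return abs(y) if y * p < 1.0 else 0.0


def initial_schedule(alpha, dloss_magnitude):
    """Return (typw, eta0, t0) for regularization strength alpha."""
    # Typical weight size, as written in the message above.
    typw = math.sqrt(1.0 / math.sqrt(alpha))
    # Assumed rule: pick eta0 so that the first update is about as large
    # as a typical weight, using the gradient at a badly misclassified point.
    eta0 = typw / max(1.0, dloss_magnitude(-typw, 1.0))
    # Choose t0 so that the assumed schedule starts at eta0.
    t0 = 1.0 / (eta0 * alpha)
    return typw, eta0, t0


def learning_rate(t, alpha, t0):
    """Assumed schedule: eta(t) = 1 / (alpha * (t0 + t))."""
    return 1.0 / (alpha * (t0 + t))


if __name__ == "__main__":
    alpha = 1e-4
    typw, eta0, t0 = initial_schedule(alpha, hinge_dloss_magnitude)
    print("typw =", typw, "eta0 =", eta0, "t0 =", t0)
    print("eta(0) =", learning_rate(0, alpha, t0))
```

Under these assumptions, t0 is simply the offset that makes the very first
step size equal to eta0, which is how "initial updates comparable with the
expected size of the weights" would translate into a value for t_0.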

