2011/11/22 Andreas Müller <[email protected]>:
> Hi Peter.
> Thanks for the quick answer.
>
>
> On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
>> Hi Andy,
>>
>> I adopted the heuristic from Leon Bottou's sgd implementation (version
>> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
>> Schedule". I'm not aware of any paper which describes the rationale in
>> more depth.
>>
>> Here's the quote from the slide: "Choose t_0 to make sure that the
>> expected initial updates are comparable with the expected size of the
>> weights."
>>
>> [1] http://istcolloq.gsfc.nasa.gov/fall2009/presentations/bottou.pdf
>>
> I think I get the general idea.
> I don't know what the expected size of the weights is.
> Do you know what is meant by that?

Not really, but I'll look into it and document it properly.

>
>> 2011/11/22 Andreas Müller <[email protected]>:
>>> Hi everybody.
>>> [..]
>>> I thought the initial learning rate in sgd is chosen using
>>> a subset of the training set. This seems to be in
>>> contradiction to using the heuristic.
>>
>> That would be a great feature indeed. Any volunteers :-)
>>
> Maybe later ;)
>
> My question was really "what is Bottou actually using?"
> I didn't see anything in his code that performs the trial
> procedure he describes.
> In the slides he says he does it for CRF, so maybe it's
> only used there and not for SVM.

In version 1.3 he uses the probing procedure only for CRF (look for
`calibrating` in the code).

>
>>>
>>> Also, I think there is a typo in the doc where they
>>> explain the learning rate schedule.
>>> In 3.3.6.1 SGD, below the formula for the schedule,
>>> it says "t_0 is the time step [...], t_0 is chosen
>>> automatically". I think the first "t_0" should
>>> actually be "t". Is that right?
>>
>> You are absolutely right - the description is messed up in various
>> ways... it should be "eta^{(t)} is given by 1.0 / (\alpha * (t_0 + t))
>> where t_0 is determined using the following heuristic (insert
>> heuristic)"
>> I'll fix that ASAP!
>>
> Thanks! I would have fixed it too but maybe you can do it better ;)
>
> Cheers,
> Andy
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure
> contains a definitive record of customers, application performance,
> security threats, fraudulent activity, and more. Splunk takes this
> data and makes sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-novd2d
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
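
For reference, the corrected schedule from above, eta^{(t)} = 1.0 /
(\alpha * (t_0 + t)), can be sketched in a few lines of Python. This is
only an illustration: the function name `learning_rate` and the example
values for alpha and t_0 are made up here, and t_0 is passed in by hand
rather than computed with the heuristic, which is not spelled out in
this thread.

```python
def learning_rate(t, alpha, t0):
    """Return eta^(t) = 1.0 / (alpha * (t0 + t)).

    t     -- time step (number of updates performed so far)
    alpha -- regularization strength, must be positive
    t0    -- offset that controls the size of the first updates
             (supplied by the caller here; an assumption of this sketch)
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if t0 + t <= 0:
        raise ValueError("t0 + t must be positive")
    return 1.0 / (alpha * (t0 + t))


# Hypothetical example values, chosen only for illustration.
alpha = 1e-4
t0 = 1e4

# The step size decays monotonically as t grows.
etas = [learning_rate(t, alpha, t0) for t in range(5)]
```

A larger t_0 gives a smaller initial step eta^{(0)} = 1.0 / (\alpha * t_0),
which is why the choice of t_0 determines how large the first updates
are relative to the weights.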



-- 
Peter Prettenhofer
