2011/11/22 Andreas Müller <[email protected]>:
> Hi Peter.
> Thanks for the quick answer.
>
> On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
>> Hi Andy,
>>
>> I adopted the heuristic from Leon Bottou's sgd implementation (version
>> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
>> Schedule". I'm not aware of any paper which describes the rationale in
>> more depth.
>>
>> Here's the quote from the slide: "Choose t_0 to make sure that the
>> expected initial updates are comparable with the expected size of the
>> weights."
>>
>> [1] http://istcolloq.gsfc.nasa.gov/fall2009/presentations/bottou.pdf
>>
> I think I get the general idea.
> I don't know what the expected size of the weights is.
> Do you know what is meant by that?

Not really, but I'll look into that and document it properly.

>
>> 2011/11/22 Andreas Müller <[email protected]>:
>>> Hi everybody.
>>> [..]
>>> I thought the initial learning rate in sgd is chosen using
>>> a subset of the training set. This seems to be in
>>> contradiction to using the heuristic.
>>
>> That would be a great feature indeed. Any volunteers :-)
>>
> Maybe later ;)
>
> The question was a bit "what is Bottou actually using".
> I didn't see anything in his code that does the trying
> out procedure he describes.
> In the slides he says he does it for CRF so maybe it's
> just used there, not for SVM.

In version 1.3 he uses the probing procedure only for CRF (look for
`calibrating` in the code).

>
>>>
>>> Also, I think there is a typo in the doc where they
>>> explain the learning rate schedule.
>>> In 3.3.6.1 SGD, below the formula for the schedule,
>>> it says "t_0 is the time step [...], t_0 is chosen
>>> automatically". I think the first "t_0" should
>>> actually be "t". Is that right?
>>
>> You are absolutely right - the description is messed up in various
>> ways... it should be "eta^{(t)} is given by 1.0 / (\alpha * (t_0 + t))
>> where t_0 is determined using the following heuristic (insert
>> heuristic)"
>> I'll fix that ASAP!
>>
> Thanks! I would have fixed it too but maybe you can do it better ;)
>
> Cheers,
> Andy
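
To make the corrected schedule quoted above concrete, here is a minimal
Python sketch of eta^{(t)} = 1.0 / (\alpha * (t_0 + t)). The function name
`learning_rate` and the values chosen for alpha and t0 are assumptions for
illustration only. In particular, t0 is a hand-picked placeholder here; it
is not computed by the heuristic discussed in this thread.

```python
def learning_rate(t, alpha, t0):
    """Return eta^(t) = 1.0 / (alpha * (t0 + t)) for time step t."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 / (alpha * (t0 + t))


# Illustrative values only (assumptions, not taken from any implementation).
alpha = 1e-4  # regularization strength
t0 = 1e4      # placeholder offset; the real t_0 would come from the heuristic

# Learning rates for the first few updates; they shrink as t grows.
rates = [learning_rate(t, alpha, t0) for t in range(5)]
for t, eta in enumerate(rates):
    print(t, eta)
```

With this choice the first rate is 1 / (alpha * t0), so a larger t0 means
smaller initial updates, which is the knob the quoted slide refers to.
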
-- 
Peter Prettenhofer

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
