Hi Andy,

I adopted the heuristic from Leon Bottou's sgd implementation (version 1.3). He explains the heuristic in [1] - search for "Choosing the Gain Schedule". I'm not aware of any paper which describes the rationale in more depth.
Here's the quote from the slide:

  "Choose t_0 to make sure that the expected initial updates are
  comparable with the expected size of the weights."

[1] http://istcolloq.gsfc.nasa.gov/fall2009/presentations/bottou.pdf

2011/11/22 Andreas Müller <[email protected]>:
> Hi everybody.
> [..]
> I thought the initial learning rate in sgd is chosen using
> a subset of the training set. This seems to be in
> contradiction to using the heuristic.

That would be a great feature indeed. Any volunteers :-)

> Also, I think there is a typo in the doc where they
> explain the learning rate schedule.
> In 3.3.6.1 SGD, below the formula for the schedule,
> it says "t_0 is the time step [...], t_0 is chosen
> automatically". I think the first "t_0" should
> actually be "t". Is that right?

You are absolutely right - the description is messed up in various ways...
it should be

  "eta^{(t)} is given by 1.0 / (\alpha * (t_0 + t)) where t_0 is
  determined using the following heuristic (insert heuristic)"

I'll fix that ASAP!

thanks,
 Peter

> Any help would be appreciated!
>
> Thanks,
> Andy

--
Peter Prettenhofer
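P.S. In case it helps to see the schedule written out, here is a minimal sketch of eta^{(t)} = 1.0 / (\alpha * (t_0 + t)). It is only an illustration, not the code in scikit-learn or in Leon's sgd: the function names, the value of alpha, and the target initial step size eta0 are all assumptions made up for this example. It does not implement the heuristic itself; it only shows that once you have decided how large the first step should be, t_0 follows by solving eta^{(0)} = eta0.

```python
def learning_rate(t, alpha, t0):
    """Inverse-scaling schedule: eta^(t) = 1.0 / (alpha * (t0 + t))."""
    return 1.0 / (alpha * (t0 + t))


def t0_from_initial_rate(eta0, alpha):
    """Hypothetical helper: solve eta^(0) = eta0 for t0.

    eta^(0) = 1.0 / (alpha * t0)  =>  t0 = 1.0 / (alpha * eta0)
    """
    return 1.0 / (alpha * eta0)


# Assumed values for illustration only (not library defaults).
alpha = 1e-4   # regularization strength
eta0 = 0.1     # desired size of the first step

t0 = t0_from_initial_rate(eta0, alpha)

# The rate starts at eta0 and decays as t grows.
rates = [learning_rate(t, alpha, t0) for t in (0, 1, 10, 1000)]
print(t0, rates)
```

A larger t_0 means a smaller and more slowly decaying step size, which is why the choice of t_0 matters for the very first updates.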
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
