----- Original Message -----
From: "Peter Prettenhofer" <[email protected]>
To: [email protected]
Sent: Tuesday, 22 November 2011 13:44:25
Subject: Re: [Scikit-learn-general] SGD learning rate heuristic

2011/11/22 Andreas Müller <[email protected]>:
> Hi Peter.
> Thanks for the quick answer.
>
> On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
>> Hi Andy,
>>
>> I adopted the heuristic from Leon Bottou's sgd implementation (version
>> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
>> Schedule". I'm not aware of any paper which describes the rationale in
>> more depth.
>>
>> Here's the quote from the slide: "Choose t_0 to make sure that the
>> expected initial updates are comparable with the expected size of the
>> weights."
>>
>> [1] http://istcolloq.gsfc.nasa.gov/fall2009/presentations/bottou.pdf
>>
> I think I get the general idea.
> I don't know what the expected size of the weights is.
> Do you know what is meant by that?

Not really, but I'll look into that and document it properly.

Thinking about it, typw = sqrt(1.0 / sqrt(alpha)) is probably what he
refers to as the expected w. I don't know why that is, but if I accept
it, I think I can figure the rest out.

Thanks again!
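
For anyone reading this thread later, here is a minimal Python sketch of how
the heuristic might fit together. Only typw = sqrt(1.0 / sqrt(alpha)) and the
slide quote come from the messages above. Everything else is an assumption
made for illustration: the schedule eta_t = 1 / (alpha * (t0 + t)), the use of
the hinge loss, the rule for eta0, and the function names hinge_dloss,
initial_schedule and eta, none of which are taken from Bottou's code or from
scikit-learn.

```python
import math


def hinge_dloss(p, y):
    """Derivative of the hinge loss max(0, 1 - y * p) with respect to p."""
    return -y if y * p < 1.0 else 0.0


def initial_schedule(alpha, dloss=hinge_dloss):
    """Return (typw, eta0, t0) for a regularization strength alpha.

    Assumed reading of the heuristic, not a confirmed implementation.
    """
    # From the thread: the "expected size of the weights".
    typw = math.sqrt(1.0 / math.sqrt(alpha))
    # Assumption: pick the initial step so that one update is comparable
    # to typw, using the loss gradient at a badly misclassified point.
    eta0 = typw / max(1.0, abs(dloss(-typw, 1.0)))
    # Assumption: with eta_t = 1 / (alpha * (t0 + t)), choose t0 so that
    # the step size at t = 0 equals eta0.
    t0 = 1.0 / (eta0 * alpha)
    return typw, eta0, t0


def eta(t, alpha, t0):
    """Assumed gain schedule: eta_t = 1 / (alpha * (t0 + t))."""
    return 1.0 / (alpha * (t0 + t))


if __name__ == "__main__":
    alpha = 1e-4
    typw, eta0, t0 = initial_schedule(alpha)
    print(typw, eta0, t0, eta(0, alpha, t0))
```

Under these assumptions, t0 is simply whatever value makes the first step
size equal eta0, which is one way to read "choose t_0 so the initial updates
are comparable with the expected size of the weights".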
