I did not mean the parameters of the cost function; I only want to scale the input variables. Suppose one of the independent variables ranges from 10 to 1000 and another ranges from 0.1 to 1. Andrew Ng and others say in their machine learning lectures that one should rescale the input data to bring all variables to a similar range (http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=MachineLearning&video=03.1-LinearRegressionII-FeatureScaling&speed=100). This will affect how gradient descent behaves. For concreteness, let us take the cost function to be the squared loss.
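To make this concrete, here is a small self-contained sketch of the kind of effect I mean (the data, step sizes and tolerance are invented purely for illustration, and I am standardising by hand rather than through scikit-learn):

    # A toy comparison: two input variables on very different scales,
    # and batch gradient descent on the squared loss
    # J(w) = ||Xw - y||^2 / (2n), with and without rescaling.
    import numpy as np

    rng = np.random.RandomState(0)
    n = 200
    x1 = rng.uniform(10, 1000, n)   # range roughly 10 - 1000
    x2 = rng.uniform(0.1, 1, n)     # range roughly 0.1 - 1
    X = np.column_stack([x1, x2])
    y = 3.0 * x1 + 50.0 * x2 + rng.randn(n)
    y = y - y.mean()                # centre y so no intercept is needed

    # Standardise each input column to zero mean, unit variance.
    X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

    def gradient_descent(X, y, lr, max_iter=100000, tol=1e-10):
        """Batch gradient descent; return (weights, iterations used)."""
        w = np.zeros(X.shape[1])
        for it in range(max_iter):
            grad = np.dot(X.T, np.dot(X, w) - y) / len(y)
            w_new = w - lr * grad
            if np.max(np.abs(w_new - w)) < tol:
                return w_new, it
            w = w_new
        return w, max_iter

    # On the raw data the step size must be tiny (anything much larger
    # diverges, because the curvature along the x1 axis is enormous),
    # so the flat x2 direction never reaches the tolerance before the
    # iteration cap.  After scaling, one moderate step size suits both
    # directions and the descent converges in a few hundred steps.
    w_raw, it_raw = gradient_descent(X, y, lr=1e-6)
    w_scaled, it_scaled = gradient_descent(X_scaled, y, lr=0.1)
    print("raw data:    stopped after %d iterations" % it_raw)
    print("scaled data: stopped after %d iterations" % it_scaled)

As I understand it, the standardisation step is what sklearn.preprocessing.scale (or MinMaxScaler, for the min-max variant) would do for you; the point of the sketch is only that after rescaling, a single moderate step size works along every direction of the cost surface.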
On 26-04-2013 01:56, [email protected] wrote:
> Date: Thu, 25 Apr 2013 19:15:59 +0100
> From: Matthieu Brucher <[email protected]>
> Subject: Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent
>
> Hi,
>
> Do you mean scaling the parameters of the cost function? If so, scaling
> will change the surface of the cost function, of course. It is hard to
> say anything general about how the surface will behave; it completely
> depends on the cost function you are using. A cost function that is
> linear will have the same scale applied to the surface, but anything
> fancier (squared sum, robust cost...) will behave differently. This also
> means that the gradient descent will be different and may converge to a
> different location. As Gaël said, this is a generic optimization-related
> question, it is not machine-learning related.
>
> Matthieu
>
> 2013/4/25 Shishir Pandey <[email protected]>
>> Thanks Ronnie for pointing out the exact method in the scikit-learn
>> library. Yes, that is exactly what I was asking: how does the rescaling
>> of features affect the gradient descent algorithm? Since stochastic
>> gradient descent is an algorithm which is used in machine learning
>> quite a lot, it will be good to understand how its performance is
>> affected after rescaling features.
>>
>> Jaques, I am having some trouble running the example. But yes, it will
>> be good if we can have a GUI example.
>>
>> On 25-04-2013 19:12, [email protected] wrote:
>>> Date: Thu, 25 Apr 2013 09:10:35 -0400
>>> From: Ronnie Ghose <[email protected]>
>>> Subject: Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent
>>>
>>> I think he means what increases/benefits you get from rescaling
>>> features, e.g. minmax or preprocessing.scale.
>>>
>>> On Thu, Apr 25, 2013 at 02:09:13PM +0200, Jaques Grobler wrote:
>>>> I also think it will be great to have this example on the website.
>>>> Do you mean like an interactive example that works similarly to the
>>>> SVM GUI example, but for understanding the effects that shifting and
>>>> scaling of the data have on the rate of convergence of gradient
>>>> descent and on the surface of the cost function?
>>> This is out of scope for the project: scikit-learn is a machine
>>> learning toolkit. Gradient descent is a general class of optimization
>>> algorithms.
>>>
>>> Gaël

--
sp
