From what you are saying, the independent variables are the parameters of
the cost function. It is your search space, right?
If you change the scale, of course the gradient descent behavior will be
different. Also, if the input parameters are scaled properly (let's say
that the variables that had a range of 10 to 1000 now have a range from
0.01 to 1), the only thing that will change is the scaling of the cost
function's coordinates, but the minimum will remain identical (except for
the scale, of course). And a lot of algorithms behave better in that case
(one of the reasons being machine precision).
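
To make this concrete, here is a minimal NumPy sketch (illustrative only:
the data, step sizes, and iteration counts are made up, not something from
this thread). It runs batch gradient descent on a squared loss before and
after rescaling the inputs, and undoing the scaling recovers the same
minimum:

import numpy as np

def gradient_descent(X, y, lr, n_iter=1000):
    # Batch gradient descent on the squared loss 0.5 * mean((X @ w - y)**2).
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the squared loss
        w -= lr * grad
    return w

rng = np.random.RandomState(0)
# Two features on very different scales (roughly 10-1000 and 0.1-1).
X = np.column_stack([rng.uniform(10, 1000, 200), rng.uniform(0.1, 1, 200)])
y = X @ np.array([0.5, 2.0])  # true weights, noise-free for clarity

# Unscaled: the step size must be tiny to avoid divergence, so the
# direction of the small-range feature barely moves in 1000 iterations.
w_raw = gradient_descent(X, y, lr=1e-7)

# Rescaled to comparable ranges: a much larger step size is stable, and
# dividing by the scales afterwards recovers the same minimum.
scale = X.max(axis=0)
w_scaled = gradient_descent(X / scale, y, lr=0.5) / scale

print(w_raw)     # second weight still far from 2.0
print(w_scaled)  # close to [0.5, 2.0]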
Matthieu
2013/4/25 Shishir Pandey <[email protected]>
> I did not mean parameters of the cost function. I only want to scale the
> input variables. Suppose one of the independent variables has a range of
> 10-1000 and another has a range of 0.1-1. Then Andrew Ng and others say
> in their machine learning lectures that one should rescale the input data
> to bring all variables to a similar range
> (http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=MachineLearning&video=03.1-LinearRegressionII-FeatureScaling&speed=100).
> This will affect how gradient descent behaves.
>
> We can choose the cost function right now to be the squared loss function.
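
For reference, this kind of min-max rescaling is a one-liner in
scikit-learn; a minimal sketch with made-up numbers:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data: first feature in 10-1000, second in 0.1-1.
X = np.array([[10.0, 0.1],
              [500.0, 0.5],
              [1000.0, 1.0]])

# Min-max scaling maps each column to [0, 1] via (x - min) / (max - min).
print(MinMaxScaler().fit_transform(X))
# [[ 0.     0.   ]
#  [ 0.495  0.444]
#  [ 1.     1.   ]]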
>
> On 26-04-2013 01:56, [email protected]
> wrote:
> > Date: Thu, 25 Apr 2013 19:15:59 +0100
> > From: Matthieu Brucher <[email protected]>
> > Subject: Re: [Scikit-learn-general] Effects of shifting and scaling on
> >     Gradient Descent
> > To: [email protected]
> >
> > Hi,
> >
> > Do you mean scaling the parameters of the cost function? If so, scaling
> > will change the surface of the cost function, of course. It's kind of
> > complicated to say anything about how the surface will behave; it
> > completely depends on the cost function you are using. A cost function
> > that is linear will have the same scale applied to the surface, but
> > anything fancier will behave differently (squared sum, robust cost...).
> > This also means that the gradient descent will be different and may
> > converge to a different location. As Gaël said, this is a generic
> > optimization-related question; it is not machine-learning related.
> >
> > Matthieu
> >
> > 2013/4/25 Shishir Pandey <[email protected]>
> >> > Thanks, Ronnie, for pointing out the exact method in the scikit-learn
> >> > library. Yes, that is exactly what I was asking: how does the rescaling
> >> > of features affect the gradient descent algorithm? Since stochastic
> >> > gradient descent is an algorithm that is used in machine learning quite
> >> > a lot, it will be good to understand how its performance is affected
> >> > after rescaling features.
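
A quick way to see this effect is to fit scikit-learn's SGDRegressor with
and without standardization; a rough sketch (data and settings made up):

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = np.column_stack([rng.uniform(10, 1000, 500), rng.uniform(0.1, 1, 500)])
y = 0.5 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.1, 500)

# Identical SGD settings, with and without standardizing the features.
raw = SGDRegressor(random_state=0).fit(X, y)
X_std = StandardScaler().fit_transform(X)
std = SGDRegressor(random_state=0).fit(X_std, y)

# On the unscaled data the default step size is badly matched to the
# large-range feature, so the fit is usually much worse or diverges.
print(raw.score(X, y), std.score(X_std, y))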
> >> >
> >> > Jaques, I am having some trouble running the example. But yes, it
> >> > would be good if we could have a GUI example.
> >> >
> >> > On 25-04-2013 19:12, [email protected] wrote:
> >>> > > Date: Thu, 25 Apr 2013 09:10:35 -0400
> >>> > > From: Ronnie Ghose <[email protected]>
> >>> > > Subject: Re: [Scikit-learn-general] Effects of shifting and scaling
> >>> > >     on Gradient Descent
> >>> > > To: [email protected]
> >>> > >
> >>> > > I think he means: what benefits do you get from rescaling features,
> >>> > > e.g. min-max scaling or preprocessing.scale?
> >>> > > On Thu, Apr 25, 2013 at 02:09:13PM +0200, Jaques Grobler wrote:
> >>>>> > >> > I also think it will be great to have this example on the
> >>>>> > >> > website. Do you mean like an interactive example that works
> >>>>> > >> > similar to the SVM GUI example, but for understanding the
> >>>>> > >> > effects that shifting and scaling of data have on the rate of
> >>>>> > >> > convergence of gradient descent and the surface of the cost
> >>>>> > >> > function?
> >>> > > This is out of scope for the project: scikit-learn is a machine
> >>> > > learning toolkit. Gradient descent is a general class of
> >>> > > optimization algorithms.
> >>> > >
> >>> > > Gaël
> >> >
> >> >--
> >> >sp
> >> >
> >> >
>
> --
> sp
>
>
>
--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/