Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-27 Thread Andreas Mueller
I don't think you can make any statements about the optimization method wrt the data when you don't specify the loss function you want to minimize. On 04/25/2013 03:10 PM, Ronnie Ghose wrote: I think he means what increases/benefits do you get from rescaling features e.g. minmax or

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Shishir Pandey
Even scikit-learn mentions on its stochastic gradient descent page (http://scikit-learn.org/dev/modules/sgd.html#tips-on-practical-use) that one should scale the data. An example which shows what really happens to a cost function (say squared loss) when the data is scaled would be great. On 26-04-2013
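
In the spirit of that tip, here is a minimal sketch (mine, not from the page; the two-column toy data and the rule used to build y are made up) of the pattern the page recommends - put a scaler in front of the SGD estimator so it only ever sees zero-mean, unit-variance features:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X = np.c_[rng.uniform(10, 1000, 200),   # one feature on a 10-1000 scale
          rng.uniform(0.1, 1, 200)]     # one feature on a 0.1-1 scale
y = (X[:, 0] / 1000 + X[:, 1] > 1.0).astype(int)  # arbitrary linear rule for the labels

# the scaler is fit on the training data; the SGD step then only sees standardized features
clf = Pipeline([("scale", StandardScaler()),
                ("sgd", SGDClassifier(random_state=0))])
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy of the scaled pipeline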

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Ronnie Ghose
AFAIK fits tend to work better, and so do classifiers. It's much easier to have a classifier try to fit between -1 and 1 than between 0 and 1, so it also helps convergence. http://stats.stackexchange.com/questions/41704/how-and-why-do-normalization-and-feature-scaling-work and then
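
As a tiny sketch of the two rescalings mentioned here (not from the thread; the three-row matrix is made up): min-max scaling, here mapped to [-1, 1], and standardization via preprocessing.scale.

import numpy as np
from sklearn import preprocessing
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0, 0.1],
              [500.0, 0.5],
              [1000.0, 1.0]])

print(MinMaxScaler(feature_range=(-1, 1)).fit_transform(X))  # each column mapped into [-1, 1]
print(preprocessing.scale(X))                                 # each column: zero mean, unit variance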

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Peter Prettenhofer
(first-order) GD uses a single learning rate for all features - if features have different variability it's difficult to find a one-size-fits-all learning rate - the parameters of high-variability features will tend to oscillate whereas the parameters of low-variability features will converge too slowly
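
A sketch of my own illustrating this on a plain batch-gradient-descent least-squares problem (synthetic data and made-up step sizes): with raw features one shared step size is either unstable for the large-scale column or far too small overall, while the same number of steps on standardized features gets close to the optimum.

import numpy as np

rng = np.random.RandomState(0)
X = np.c_[rng.uniform(10, 1000, 200),   # high-variability feature
          rng.uniform(0.1, 1, 200)]     # low-variability feature
y = X.dot([0.5, 2.0])

def excess_loss_after_gd(X, y, eta, n_steps=50):
    """Squared-loss excess over the least-squares optimum after plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        w -= eta * X.T.dot(X.dot(w) - y) / len(y)   # one full-batch gradient step
    w_opt = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.mean((X.dot(w) - y) ** 2) - np.mean((X.dot(w_opt) - y) ** 2)

# raw features, one shared step size: either too large for the big-scale column
# (its weight oscillates and blows up) or stable but crawling for the other one
print(excess_loss_after_gd(X, y, eta=1e-5))   # diverges: huge excess loss
print(excess_loss_after_gd(X, y, eta=1e-7))   # stable, but still far from the optimum
# after standardizing X (and centering y), one ordinary step size suits both weights
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(excess_loss_after_gd(Xs, y - y.mean(), eta=0.1))   # close to the optimum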

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Jaques Grobler
@Shishir Pandey, on a slight tangent, what problems are you having with running the Libsvm GUI? I wonder if an interactive GUI example would really be necessary - we could just have an example illustrating the difference, with plots, between unscaled and scaled data... if people find that useful. But the

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Shishir Pandey
@Jaques Grobler: I ran the libsvm GUI code on sklearn version 13.1; it was giving an error on the import - from sklearn.externals.six.move import xrange. But I commented out that line and it is working just fine. As you have suggested, a GUI example might not really be that necessary. Illustrating

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Gael Varoquaux
On Fri, Apr 26, 2013 at 04:17:36PM +0530, Shishir Pandey wrote: @Jaques Grobler: I ran the libsvm GUI code on sklearn version 13.1; it was giving an error on the import - from sklearn.externals.six.move import xrange. Which error? Could you copy/paste it here? G

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-26 Thread Matthieu Brucher
From what you are saying, the independent variables are the parameters of the cost function. It is your search space, right? If you change the scale, of course the gradient descent behavior will be different. Also, if the input parameters are scaled properly (let's say that the variables that had
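
A short worked note of my own, assuming the squared loss that this thread keeps using as its running example: the curvature of the loss in the direction of weight w_j is set directly by the scale of the j-th input column,

L(w) = \frac{1}{2n}\sum_i (w^\top x_i - y_i)^2, \qquad \frac{\partial^2 L}{\partial w_j^2} = \frac{1}{n}\sum_i x_{ij}^2,

so rescaling column j by a factor c multiplies that curvature by c^2. A single gradient-descent step size has to stay roughly below 2 / \lambda_{\max} of the Hessian, which is why columns on very different scales cannot share one good step size unless the data is rescaled first.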

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Jaques Grobler
I also think it would be great to have this example on the website. Do you mean an interactive example that works similarly to the SVM GUI example (http://scikit-learn.org/dev/auto_examples/applications/svm_gui.html#example-applications-svm-gui-py), but for understanding the effects shifting and

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Gael Varoquaux
On Thu, Apr 25, 2013 at 02:09:13PM +0200, Jaques Grobler wrote: I also think it would be great to have this example on the website. Do you mean an interactive example that works similarly to the SVM GUI example, but for understanding the effects shifting and scaling of data has on the rate

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Ronnie Ghose
I think he means: what increases/benefits do you get from rescaling features, e.g. minmax or preprocessing.scale? On Thu, Apr 25, 2013 at 02:09:13PM +0200, Jaques Grobler wrote: I also think it would be great to have this example on the website. Do you mean an interactive example that works

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Shishir Pandey
Thanks Ronnie for pointing out the exact method in the scikit-learn library. Yes, that is exactly what I was asking: how does the rescaling of features affect the gradient descent algorithm? Since stochastic gradient descent is an algorithm which is used quite a lot in machine learning, it

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Matthieu Brucher
Hi, Do you mean scaling the parameters of the cost function? If so, scaling will change the surface of the cost function, of course. It's kind of complicated to say anything about how the surface will behave; it completely depends on the cost function you are using. A cost function that is linear
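
For the squared loss in particular this can be made concrete: the surface is a quadratic bowl whose Hessian is X^T X / n, so rescaling the inputs reshapes the bowl directly. A small sketch of my own (made-up data) that just compares the condition number of that Hessian before and after standardizing:

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = np.c_[rng.uniform(10, 1000, 500),   # wide-range feature
          rng.uniform(0.1, 1, 500)]     # narrow-range feature

def hessian_condition(X):
    H = X.T.dot(X) / len(X)      # Hessian of the squared-loss surface in w
    return np.linalg.cond(H)

print(hessian_condition(X))                                   # very elongated bowl
print(hessian_condition(StandardScaler().fit_transform(X)))   # nearly round bowl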

Re: [Scikit-learn-general] Effects of shifting and scaling on Gradient Descent

2013-04-25 Thread Shishir Pandey
I did not mean the parameters of the cost function; I only want to scale the input variables. Suppose one of the independent variables has a range of 10 - 1000 and another has a range of 0.1 - 1. Then Andrew Ng and others say in their machine learning lectures that one should rescale the