Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-07 Thread Eustache DIEMERT
I tried adjusting stepSize between 1e-4 and 1; it doesn't seem to be the problem. Actually the problem is that the model doesn't use the intercept. So what happens is that it tries to compensate with super heavy weights (on the order of 1e40) and ends up overflowing the model coefficients. MSE is exploding too,
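For reference, a minimal sketch of the class-based API that exposes the intercept flag. This is an editorial illustration, not the poster's code; note that in the Spark 1.0.x line discussed in this thread the regularized SGD regressors may reject setIntercept(true), which would be consistent with the behaviour reported above.

{code}
import org.apache.spark.mllib.regression.RidgeRegressionWithSGD

// Class-based API instead of the static train() helper. setIntercept(true)
// asks the algorithm to fit an intercept; in Spark 1.0.x this may not be
// supported for the regularized SGD regressors.
val ridge = new RidgeRegressionWithSGD().setIntercept(true)
val model = ridge.run(trainingData)  // trainingData: RDD[LabeledPoint]
println(s"intercept = ${model.intercept}, weights = ${model.weights}")
{code}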

Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-07 Thread Eustache DIEMERT
Well, why not, but IMHO MLlib Logistic Regression is unusable right now. The inability to use an intercept is just a no-go. I could hack a column of ones to inject the intercept into the data, but frankly it's a pity to have to do so. 2014-07-05 23:04 GMT+02:00 DB Tsai dbt...@dbtsai.com: You may

Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-07 Thread Eustache DIEMERT
Ok, I've tried to add the intercept term myself (code here [1]), but with no luck. It seems that adding a column of ones doesn't help with convergence either. I may have missed something in the code as I'm quite a noob in Scala, but printing the data seems to indicate I succeeded in adding the
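For what it's worth, a hypothetical sketch of the column-of-ones hack (this is not the code from [1]; the addBias helper is made up for illustration):

{code}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper: append a constant 1.0 feature so that the weight
// learned for it plays the role of the intercept.
def addBias(data: RDD[LabeledPoint]): RDD[LabeledPoint] =
  data.map { lp =>
    LabeledPoint(lp.label, Vectors.dense(lp.features.toArray :+ 1.0))
  }
{code}

One caveat with this workaround: ridge regularization penalizes every weight, including the one standing in for the intercept, so the fitted bias gets shrunk towards zero.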

Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-05 Thread DB Tsai
You may try LBFGS to have more stable convergence. In Spark 1.1, we will be able to use LBFGS instead of GD in the training process. On Jul 4, 2014 1:23 PM, Thomas Robert tho...@creativedata.fr wrote: Hi all, I too am having some issues with *RegressionWithSGD algorithms. Concerning your issue
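As a rough illustration of that suggestion, a sketch of the low-level LBFGS optimizer that appeared around Spark 1.1, paired with a squared-error gradient and an L2 updater to mimic ridge regression. All parameter values are illustrative, numFeatures is assumed to be known, and the exact runLBFGS signature should be checked against the release being used.

{code}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SquaredL2Updater}

// trainingData: RDD[LabeledPoint]; LBFGS works on (label, features) pairs.
val data = trainingData.map(lp => (lp.label, lp.features))

val (weights, lossHistory) = LBFGS.runLBFGS(
  data,
  new LeastSquaresGradient(),  // squared-error loss, as in ridge regression
  new SquaredL2Updater(),      // L2 regularization
  10,                          // numCorrections
  1e-4,                        // convergence tolerance
  100,                         // max iterations
  0.1,                         // regParam (illustrative)
  Vectors.zeros(numFeatures))  // initial weights

println(s"final loss = ${lossHistory.last}")
{code}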

Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-04 Thread Thomas Robert
Hi all, I too am having some issues with *RegressionWithSGD algorithms. Concerning your issue, Eustache, this could be due to the fact that these regression algorithms use a fixed step size (which is divided by sqrt(iteration)). During my tests, quite often, the algorithm diverged to an infinite cost, I
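A sketch of how those knobs can be set explicitly through the exposed optimizer, instead of relying on the train() defaults (the values below are illustrative only):

{code}
import org.apache.spark.mllib.regression.RidgeRegressionWithSGD

// The underlying GradientDescent optimizer is public, so step size,
// iteration count, regularization and mini-batch fraction can be tuned.
// The effective step at iteration i is stepSize / sqrt(i).
val ridge = new RidgeRegressionWithSGD()
ridge.optimizer
  .setStepSize(0.01)
  .setNumIterations(1000)
  .setRegParam(0.1)
  .setMiniBatchFraction(1.0)
val model = ridge.run(trainingData)  // trainingData: RDD[LabeledPoint]
{code}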

Re: [mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-03 Thread Eustache DIEMERT
Printing the model shows the intercept is always 0 :( Should I open a bug for that? 2014-07-02 16:11 GMT+02:00 Eustache DIEMERT eusta...@diemert.fr: Hi list, I'm benchmarking MLlib for a regression task [1] and get strange results. Namely, using RidgeRegressionWithSGD it seems the

[mllib] strange/buggy results with RidgeRegressionWithSGD

2014-07-02 Thread Eustache DIEMERT
Hi list, I'm benchmarking MLlib for a regression task [1] and get strange results. Namely, using RidgeRegressionWithSGD it seems the predicted points miss the intercept: {code} val trainedModel = RidgeRegressionWithSGD.train(trainingData, 1000) ... valuesAndPreds.take(10).map(t => println(t))
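For completeness, the usual way to turn those (label, prediction) pairs into an error metric, assuming valuesAndPreds: RDD[(Double, Double)] as in the MLlib examples (a sketch, not the benchmark code from [1]):

{code}
// Mean squared error over the evaluated points.
val mse = valuesAndPreds.map { case (v, p) => math.pow(v - p, 2) }.mean()
println(s"training MSE = $mse")
{code}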