Re: [MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights.

2014-10-09 Thread DB Tsai
Nice to hear that your experiment is consistent with my assumption. The
current L1/L2 regularization penalizes the intercept as well, which is not
ideal. I'm working on GLMNET in Spark using OWLQN, and I can get exactly the
same solution as R, but with scalability in the number of rows and columns.
Stay tuned!
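
For reference, the glmnet-style elastic-net objective leaves the intercept
beta_0 out of the penalty (standard formulation, written here for labels
y_i in {-1, +1}):

  \min_{\beta_0,\, \beta} \; \frac{1}{n} \sum_{i=1}^{n}
      \log\!\left(1 + e^{-y_i (\beta_0 + \beta^\top x_i)}\right)
      + \lambda \left[ \alpha \|\beta\|_1 + \tfrac{1-\alpha}{2} \|\beta\|_2^2 \right]

whereas the L1/L2 updaters mentioned above apply the penalty to every weight,
including the appended intercept.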

Sincerely,

DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Mon, Sep 29, 2014 at 11:45 AM, Yanbo Liang yanboha...@gmail.com wrote:
 Thank you for all your patient responses.

 I can conclude that if the data is totally separable or over-fitting occurs,
 the weights may be different. This is also consistent with my experiment.

 I have evaluated two different datasets; the results are as follows:
 Loss function: LogisticGradient
 Regularizer: L2
 regParam: 1.0
 numIterations: 1 (SGD)

 Dataset 1: spark-1.1.0/data/mllib/sample_binary_classification_data.txt
 # of classes: 2
 # of samples: 100
 # of features: 692
 areaUnderROC of both SGD and LBFGS can reach nearly 1.0.
 The loss of both optimization methods converges to nearly
 1.7147811767900675E-5 (very small).
 The weights of the two optimization methods are different, but they look
 roughly like scalar multiples of each other, just as DB Tsai mentioned
 above. It might be that the dataset is totally separable.

 Dataset 2:
 http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#german.numer
 # of classes: 2
 # of samples: 1000
 # of features: 24
 areaUnderROC of both SGD and LBFGS is nearly 0.8.
 The loss of both optimization methods converges to nearly 0.5367041390107519.
 The weights of the two optimization methods are the same.
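
For concreteness, a minimal Scala sketch of this kind of comparison against
the MLlib 1.1 API (the existing SparkContext sc and the iteration count are
illustrative assumptions, not the exact settings used above):

import org.apache.spark.mllib.classification.{LogisticRegressionWithLBFGS, LogisticRegressionWithSGD}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.optimization.SquaredL2Updater
import org.apache.spark.mllib.util.MLUtils

// Load the libsvm-format dataset (assumes an existing SparkContext sc).
val data = MLUtils.loadLibSVMFile(sc,
  "spark-1.1.0/data/mllib/sample_binary_classification_data.txt").cache()

// SGD with L2 regularization; miniBatchFraction = 1.0 means full-batch gradients.
val sgd = new LogisticRegressionWithSGD()
sgd.optimizer
  .setNumIterations(100)            // illustrative value
  .setRegParam(1.0)
  .setMiniBatchFraction(1.0)
  .setUpdater(new SquaredL2Updater)
val sgdModel = sgd.run(data)

// L-BFGS with the same L2 regularization.
val lbfgs = new LogisticRegressionWithLBFGS()
lbfgs.optimizer
  .setRegParam(1.0)
  .setUpdater(new SquaredL2Updater)
val lbfgsModel = lbfgs.run(data)

// Compare weights and areaUnderROC; clearThreshold() makes predict() return raw scores.
for ((name, model) <- Seq("SGD" -> sgdModel, "LBFGS" -> lbfgsModel)) {
  model.clearThreshold()
  val scoreAndLabels = data.map(p => (model.predict(p.features), p.label))
  val auc = new BinaryClassificationMetrics(scoreAndLabels).areaUnderROC()
  println(s"$name weights: ${model.weights}  AUC: $auc")
}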



 2014-09-29 16:05 GMT+08:00 DB Tsai dbt...@dbtsai.com:

 Can you check the loss of both the LBFGS and SGD implementations? One
 reason may be that SGD doesn't converge well; you can see that by
 comparing the log-likelihoods. Another potential reason may be that the
 labels of your training data are totally separable, so you can always
 increase the log-likelihood by multiplying the weights by a constant.
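
 In symbols: if some weight vector w separates the training data, i.e.
 y_i w^T x_i > 0 for every example i (with labels y_i in {-1, +1}), then for
 any scaling factor c > 1 the logistic loss strictly decreases,

   \ell(c\,w) \;=\; \sum_i \log\!\left(1 + e^{-c\, y_i w^\top x_i}\right)
             \;<\; \sum_i \log\!\left(1 + e^{-y_i w^\top x_i}\right) \;=\; \ell(w),

 so without regularization the objective can always be improved by scaling
 the weights up: the direction of the solution is determined, but its
 magnitude is not, and different optimizers can stop at weights that differ
 roughly by a constant factor.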

 Sincerely,

 DB Tsai
 ---
 My Blog: https://www.dbtsai.com
 LinkedIn: https://www.linkedin.com/in/dbtsai


 On Sun, Sep 28, 2014 at 11:48 AM, Yanbo Liang yanboha...@gmail.com
 wrote:
  Hi,

  We have used LogisticRegression with two different optimization methods,
  SGD and LBFGS, in MLlib.
  With the same dataset and the same training/test split, we get different
  weight vectors.

  For example, we use
  spark-1.1.0/data/mllib/sample_binary_classification_data.txt as our
  training and test dataset, with LogisticRegressionWithSGD and
  LogisticRegressionWithLBFGS as the training methods and all other
  parameters the same.

  The precision of both methods is nearly 100% and the AUCs are also near
  1.0.
  As far as I know, a convex optimization problem should converge to the
  global minimum. (We use SGD with a mini-batch fraction of 1.0.)
  But we got two different weight vectors. Is this expected, and does it
  make sense?



-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights.

2014-09-29 Thread Xiangrui Meng
The test accuracy doesn't determine the total loss. Any decision boundary
between -1 and +1 separates the points -1 and +1 and gives you 1.0 accuracy,
but the corresponding losses are different. -Xiangrui
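
A tiny self-contained Scala toy (made-up 1-D data, not from the thread) that
illustrates this: both weights below classify every point correctly, yet
their average logistic losses differ.

// Toy 1-D data as (label, feature): positives at x > 0, negatives at x < 0.
val points = Seq((1.0, 2.0), (1.0, 3.0), (0.0, -2.0), (0.0, -3.0))

// Average logistic loss for a scalar weight w (labels mapped {0,1} -> {-1,+1}).
def logisticLoss(w: Double): Double =
  points.map { case (y, x) =>
    val margin = (2 * y - 1) * w * x
    math.log1p(math.exp(-margin))
  }.sum / points.size

// Fraction of points classified correctly by the rule sign(w * x).
def accuracy(w: Double): Double =
  points.count { case (y, x) => (if (w * x > 0) 1.0 else 0.0) == y }.toDouble / points.size

println((accuracy(0.5), logisticLoss(0.5)))  // (1.0, ~0.26)
println((accuracy(5.0), logisticLoss(5.0)))  // (1.0, ~2e-5)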

On Sun, Sep 28, 2014 at 2:48 AM, Yanbo Liang yanboha...@gmail.com wrote:
 Hi,

 We have used LogisticRegression with two different optimization methods,
 SGD and LBFGS, in MLlib.
 With the same dataset and the same training/test split, we get different
 weight vectors.

 For example, we use
 spark-1.1.0/data/mllib/sample_binary_classification_data.txt as our training
 and test dataset, with LogisticRegressionWithSGD and
 LogisticRegressionWithLBFGS as the training methods and all other parameters
 the same.

 The precision of both methods is nearly 100% and the AUCs are also near 1.0.
 As far as I know, a convex optimization problem should converge to the
 global minimum. (We use SGD with a mini-batch fraction of 1.0.)
 But we got two different weight vectors. Is this expected, and does it make
 sense?

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights.

2014-09-29 Thread Yanbo Liang
Thank you for all your patient responses.

I can conclude that if the data is totally separable or over-fitting occurs,
the weights may be different. This is also consistent with my experiment.

I have evaluated two different datasets; the results are as follows:
Loss function: LogisticGradient
Regularizer: L2
regParam: 1.0
numIterations: 1 (SGD)

Dataset 1: spark-1.1.0/data/mllib/sample_binary_classification_data.txt
# of classes: 2
# of samples: 100
# of features: 692
areaUnderROC of both SGD and LBFGS can reach nearly 1.0.
The loss of both optimization methods converges to nearly
1.7147811767900675E-5 (very small).
The weights of the two optimization methods are different, but they look
roughly like scalar multiples of each other, just as DB Tsai mentioned above.
It might be that the dataset is totally separable.

Dataset 2:
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#german.numer
# of classes: 2
# of samples: 1000
# of features: 24
areaUnderROC of both SGD and LBFGS is nearly 0.8.
The loss of both optimization methods converges to nearly 0.5367041390107519.
The weights of the two optimization methods are the same.



2014-09-29 16:05 GMT+08:00 DB Tsai dbt...@dbtsai.com:

 Can you check the loss of both the LBFGS and SGD implementations? One
 reason may be that SGD doesn't converge well; you can see that by
 comparing the log-likelihoods. Another potential reason may be that the
 labels of your training data are totally separable, so you can always
 increase the log-likelihood by multiplying the weights by a constant.

 Sincerely,

 DB Tsai
 ---
 My Blog: https://www.dbtsai.com
 LinkedIn: https://www.linkedin.com/in/dbtsai


 On Sun, Sep 28, 2014 at 11:48 AM, Yanbo Liang yanboha...@gmail.com
 wrote:
  Hi,

  We have used LogisticRegression with two different optimization methods,
  SGD and LBFGS, in MLlib.
  With the same dataset and the same training/test split, we get different
  weight vectors.

  For example, we use
  spark-1.1.0/data/mllib/sample_binary_classification_data.txt as our
  training and test dataset, with LogisticRegressionWithSGD and
  LogisticRegressionWithLBFGS as the training methods and all other
  parameters the same.

  The precision of both methods is nearly 100% and the AUCs are also near
  1.0.
  As far as I know, a convex optimization problem should converge to the
  global minimum. (We use SGD with a mini-batch fraction of 1.0.)
  But we got two different weight vectors. Is this expected, and does it
  make sense?