[
https://issues.apache.org/jira/browse/SPARK-20810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017300#comment-16017300
]
Yanbo Liang commented on SPARK-20810:
-------------------------------------
[~srowen] Thanks for your comments. I verified that both converged: ML LinearSVC
converged after 143 iterations, and MLlib SVMWithSGD converged after 1794
iterations. It seems we should spend some effort investigating the correctness
of the old MLlib implementation.
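Regarding the hinge vs. squared hinge question in the description below: the hinge loss max(0, 1 - m) has a non-differentiable kink at margin m = 1, while the squared hinge max(0, 1 - m)^2 is differentiable everywhere with a continuous gradient. A minimal illustrative sketch (plain Python, not Spark's actual gradient code):

```python
# Hinge loss and a subgradient, as functions of the margin m = y * f(x).
def hinge(m):
    return max(0.0, 1.0 - m)

def hinge_grad(m):
    # The derivative is undefined at m == 1; a subgradient-based solver must
    # pick one value there (any value in [-1, 0] is a valid subgradient).
    # This sketch picks 0, so the gradient jumps from -1 to 0 at m = 1.
    return -1.0 if m < 1.0 else 0.0

# Squared hinge loss: differentiable everywhere, gradient is continuous.
def sq_hinge(m):
    return max(0.0, 1.0 - m) ** 2

def sq_hinge_grad(m):
    return -2.0 * (1.0 - m) if m < 1.0 else 0.0

# Hinge gradient jumps across the kink...
print(hinge_grad(0.999), hinge_grad(1.001))      # -1.0 0.0
# ...while the squared-hinge gradient approaches 0 continuously.
print(sq_hinge_grad(0.999), sq_hinge_grad(1.001))  # ~ -0.002 0.0
```

This discontinuity is one reason plain SGD on the hinge loss can behave poorly near the optimum, which may be relevant to the slow SVMWithSGD convergence above.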
> ML LinearSVC vs MLlib SVMWithSGD output different solution
> ----------------------------------------------------------
>
> Key: SPARK-20810
> URL: https://issues.apache.org/jira/browse/SPARK-20810
> Project: Spark
> Issue Type: Question
> Components: ML, MLlib
> Affects Versions: 2.2.0
> Reporter: Yanbo Liang
>
> Fitting an SVM classification model on the same dataset, ML {{LinearSVC}}
> produces a different solution than MLlib {{SVMWithSGD}}. I understand
> they use different optimization solvers (OWLQN vs. SGD), but does it make
> sense for them to converge to different solutions? Since we use
> {{sklearn.svm.LinearSVC}} and R e1071 SVM as references in
> {{LinearSVCSuite}}, it seems like {{SVMWithSGD}} produces the wrong
> solution. Is that the case here?
> AFAIK, both of them use {{hinge loss}}, which is convex but not
> differentiable. Since the derivative of the hinge loss is undefined at the
> kink (margin = 1), should we switch to {{squared hinge loss}}, which is
> the default loss function of {{sklearn.svm.LinearSVC}} and more robust than
> {{hinge loss}}?
> This issue is very easy to reproduce: paste the following code snippet into
> {{LinearSVCSuite}} and run it in the IntelliJ IDE.
> {code}
> test("LinearSVC vs SVMWithSGD") {
>   import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
>   import org.apache.spark.mllib.classification.SVMWithSGD
>   import org.apache.spark.mllib.regression.{LabeledPoint => OldLabeledPoint}
>
>   val trainer1 = new LinearSVC()
>     .setRegParam(0.00002)
>     .setMaxIter(200)
>     .setTol(1e-4)
>   val model1 = trainer1.fit(binaryDataset)
>   println(model1.coefficients)
>   println(model1.intercept)
>
>   val oldData = binaryDataset.rdd.map { case Row(label: Double, features: Vector) =>
>     OldLabeledPoint(label, OldVectors.fromML(features))
>   }
>   val trainer2 = new SVMWithSGD().setIntercept(true)
>   trainer2.optimizer
>     .setRegParam(0.00002)
>     .setNumIterations(2000)
>     .setConvergenceTol(1e-4)
>   val model2 = trainer2.run(oldData)
>   println(model2.weights)
>   println(model2.intercept)
> }
> {code}
> The output is:
> {code}
> [7.24661385022775,14.774484832179743,22.00945617480461,29.558498069476084]
> 7.373454363024084
> [0.9257083966837497,1.8567843250728242,2.7381537413979595,3.7434319370941265]
> 0.9656577947867953
> {code}
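For what it's worth, the two printed solutions point in roughly the same direction: each LinearSVC coefficient is about 7.8 to 8.0 times the corresponding SVMWithSGD weight, so the disagreement looks more like a scale difference than two unrelated separating hyperplanes. A quick check using the numbers printed above:

```python
# Element-wise ratios of the two reported coefficient vectors (values copied
# verbatim from the test output above).
ml = [7.24661385022775, 14.774484832179743,
      22.00945617480461, 29.558498069476084]
mllib = [0.9257083966837497, 1.8567843250728242,
         2.7381537413979595, 3.7434319370941265]
ratios = [a / b for a, b in zip(ml, mllib)]
print([round(r, 2) for r in ratios])  # roughly [7.83, 7.96, 8.04, 7.9]
```

The ratios are close but not identical, so the SVMWithSGD solution is not simply a rescaled copy of the LinearSVC one either.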
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)