[ https://issues.apache.org/jira/browse/SPARK-20810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yanbo Liang updated SPARK-20810:
--------------------------------
Description:
When fitting an SVM classification model on the same dataset, ML {{LinearSVC}}
produces a different solution from MLlib {{SVMWithSGD}}. I understand that
they use different optimization solvers (OWLQN vs. SGD), but does it make sense
for them to converge to different solutions?
AFAIK, both of them use the hinge loss, which is convex but not differentiable
everywhere. Since the derivative of the hinge loss is not uniquely defined at
the kink (where the margin equals 1), should we switch to the squared hinge
loss, which is the default loss function of {{sklearn.svm.LinearSVC}}?
This issue is very easy to reproduce: paste the following code snippet into
{{LinearSVCSuite}} and run it in the IntelliJ IDE.
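To make the difference between the two losses concrete, here is a minimal sketch written in terms of the margin m = y * f(x), assuming labels y in {-1, +1}. The object and method names are hypothetical and are not taken from Spark or scikit-learn. The value returned at the kink is only one of many valid subgradients, chosen here for illustration.

```scala
// Illustrative only: losses as functions of the margin m = y * f(x), y in {-1, +1}.
// HingeSketch and its methods are hypothetical names, not Spark APIs.
object HingeSketch {
  // Hinge loss: max(0, 1 - m). Convex, but has a kink at m = 1.
  def hinge(m: Double): Double = math.max(0.0, 1.0 - m)

  // One valid subgradient of the hinge loss with respect to m.
  // At m = 1 any value in [-1, 0] is valid; this sketch arbitrarily picks 0.
  def hingeSubgradient(m: Double): Double = if (m < 1.0) -1.0 else 0.0

  // Squared hinge loss: max(0, 1 - m)^2. Convex and differentiable everywhere.
  def squaredHinge(m: Double): Double = {
    val h = math.max(0.0, 1.0 - m)
    h * h
  }

  // Gradient of the squared hinge loss with respect to m.
  // It is continuous and goes to 0 as m approaches 1 from below.
  def squaredHingeGradient(m: Double): Double = -2.0 * math.max(0.0, 1.0 - m)
}
```

The hinge subgradient jumps from -1 to 0 around m = 1, whereas the squared hinge gradient shrinks smoothly to 0 there, so two solvers have less room to disagree about the gradient at that point.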
{code}
test("LinearSVC vs SVMWithSGD") {
  // Row, Vector and binaryDataset are assumed to be in scope in LinearSVCSuite.
  import org.apache.spark.mllib.classification.SVMWithSGD
  import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
  import org.apache.spark.mllib.regression.{LabeledPoint => OldLabeledPoint}

  // ML LinearSVC (OWLQN solver).
  val trainer1 = new LinearSVC()
    .setRegParam(0.00002)
    .setMaxIter(200)
    .setTol(1e-4)
  val model1 = trainer1.fit(binaryDataset)
  println(model1.coefficients)
  println(model1.intercept)

  // Convert the DataFrame rows to the old MLlib LabeledPoint format.
  val oldData = binaryDataset.rdd.map { case Row(label: Double, features: Vector) =>
    OldLabeledPoint(label, OldVectors.fromML(features))
  }

  // MLlib SVMWithSGD with the same regularization, iteration count and tolerance.
  val trainer2 = new SVMWithSGD().setIntercept(true)
  trainer2.optimizer
    .setRegParam(0.00002)
    .setNumIterations(200)
    .setConvergenceTol(1e-4)
  val model2 = trainer2.run(oldData)
  println(model2.weights)
  println(model2.intercept)
}
{code}
The output is:
{code}
[7.24661385022775,14.774484832179743,22.00945617480461,29.558498069476084]
7.373454363024084
[0.58166680313823,1.1938960150473041,1.7940106824589588,2.4884300611292165]
0.667790514894194
{code}
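One way to read the output above is to separate direction from scale. The sketch below is not part of the original test: `CompareSolutions` and `cosine` are hypothetical names, and the arrays are simply the two coefficient vectors copied from the output. It computes the cosine of the angle between them and their element-wise ratios.

```scala
// Illustrative only: compares the two printed coefficient vectors.
// CompareSolutions and cosine are hypothetical helpers, not Spark APIs.
object CompareSolutions {
  // Cosine of the angle between two vectors of equal length.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    dot / (normA * normB)
  }

  // Coefficients copied from the output above.
  val linearSvc: Array[Double] =
    Array(7.24661385022775, 14.774484832179743, 22.00945617480461, 29.558498069476084)
  val svmWithSgd: Array[Double] =
    Array(0.58166680313823, 1.1938960150473041, 1.7940106824589588, 2.4884300611292165)

  def main(args: Array[String]): Unit = {
    println(f"cosine = ${cosine(linearSvc, svmWithSgd)}%.6f")
    val ratios = linearSvc.zip(svmWithSgd).map { case (x, y) => x / y }
    println(ratios.mkString("ratios: ", ", ", ""))
  }
}
```

A cosine close to 1 would suggest that the two models mostly agree on the direction of the separating hyperplane and differ mainly in the norm of the weights, which may help narrow down whether the gap comes from the solver or from the loss.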
> ML LinearSVC vs MLlib SVMWithSGD output different solution
> ----------------------------------------------------------
>
> Key: SPARK-20810
> URL: https://issues.apache.org/jira/browse/SPARK-20810
> Project: Spark
> Issue Type: Question
> Components: ML, MLlib
> Affects Versions: 2.2.0
> Reporter: Yanbo Liang