[
https://issues.apache.org/jira/browse/SPARK-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-21643.
-------------------------------
Resolution: Invalid
This isn't narrowed down nearly enough to be a JIRA. It's not even clear
there's a problem as you just get a different number of iterations.
> LR dataset worked in Spark 1.6.3, 2.0.2 stopped working in 2.1.0 onward
> -----------------------------------------------------------------------
>
> Key: SPARK-21643
> URL: https://issues.apache.org/jira/browse/SPARK-21643
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.1.0, 2.1.1, 2.2.0
> Environment: CentOS 7, 256G memory, and 52 CPUs VM
> Reporter: Thomas Kwan
>
> Training on this dataset converges on 1.6.x and 2.0.x, but it does not
> converge with 2.1+. Steps to reproduce:
> a) Download the data set
> (https://s3.amazonaws.com/manage-partners/pipeline/di873-train.json.gz) and
> uncompress it; I placed it at /tmp/di873-train.json
> b) Download the spark package to /usr/lib/spark/spark-*
> c) cd sbin
> d) start-master.sh
> e) start-slave.sh <master-url>
> f) cd ../bin
> g) Start spark-shell <master-url>
> h) I pasted in the following Scala code:
> import org.apache.spark.sql.types._
> val VT = org.apache.spark.ml.linalg.SQLDataTypes.VectorType
> val schema = StructType(Array(
>   StructField("features", VT, true),
>   StructField("label", DoubleType, true)))
> val df = spark.read.schema(schema).json("file:///tmp/di873-train.json")
> // Trailing dots keep the chain intact when pasted line by line into spark-shell.
> val trainer = new org.apache.spark.ml.classification.LogisticRegression().
>   setMaxIter(500).
>   setElasticNetParam(1.0).
>   setRegParam(0.00001).
>   setTol(0.00001).
>   setFitIntercept(true)
> val model = trainer.fit(df)
> i) Then I monitored the progress in the Spark UI under the Jobs tab.
> With Spark 1.6.1 and Spark 2.0.2, the training (treeAggregate jobs)
> finished after around 25-30 jobs. But with 2.1+, the training did not
> converge and finished only because it hit the maximum number of
> iterations (i.e. 500).
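
One possible way to narrow the report down is to compare the optimizer's own numbers across versions instead of counting jobs in the UI. The sketch below is only an illustration: ConvergenceCheck and its two methods are hypothetical helpers, not part of Spark, and the relative change they compute is a rough proxy that is not claimed to match the optimizer's actual stopping rule. It assumes the values come from model.summary (totalIterations and objectiveHistory) and trainer.getMaxIter.

```scala
// Hypothetical helper, not part of Spark: summarises how a training run ended.
object ConvergenceCheck {
  /** True when training used every allowed iteration instead of stopping early. */
  def hitIterationCap(totalIterations: Int, maxIter: Int): Boolean =
    totalIterations >= maxIter

  /** Relative change between the last two objective values, if both exist. */
  def lastRelativeChange(history: Seq[Double]): Option[Double] =
    history.takeRight(2) match {
      case Seq(prev, last) if prev != 0.0 =>
        Some(math.abs(prev - last) / math.abs(prev))
      case _ => None
    }
}

// In spark-shell, after `val model = trainer.fit(df)`:
//   val s = model.summary
//   ConvergenceCheck.hitIterationCap(s.totalIterations, trainer.getMaxIter)
//   ConvergenceCheck.lastRelativeChange(s.objectiveHistory)
```

Running the same check on 2.0.2 and on 2.1+ and posting the final objective values side by side would show whether the later versions reach a comparable loss in more iterations or stall at a worse one.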