zhengruifeng commented on issue #27396: [SPARK-30660][ML][PYSPARK] LinearRegression blockify input vectors
URL: https://github.com/apache/spark/pull/27396#issuecomment-580120260

testCode:
```scala
import org.apache.spark.ml.regression._
import org.apache.spark.storage.StorageLevel

val df = spark.read.format("libsvm").load("/data1/Datasets/a9a/a9a")
df.persist(StorageLevel.MEMORY_AND_DISK)
df.count

new LinearRegression().setMaxIter(10).fit(df)

val lr1 = new LinearRegression().setSolver("l-bfgs").setLoss("squaredError").setMaxIter(100)
val start = System.currentTimeMillis; val model1 = lr1.fit(df); val end = System.currentTimeMillis; end - start

val lr2 = new LinearRegression().setSolver("l-bfgs").setLoss("huber").setMaxIter(100)
val start = System.currentTimeMillis; val model2 = lr2.fit(df); val end = System.currentTimeMillis; end - start

Seq(model1, model2).map(_.summary.totalIterations)
Seq(model1, model2).map(_.summary.objectiveHistory.last)
```

Result:

This PR:
Duration: 1598, 2473
last objective: 0.30658886019384274, 0.5991272847846535
numIteration: 40, 101

Master:
Duration: 2060, 5015
last objective: 0.306588860193839, 0.5985319305006285
numIteration: 40, 101
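
For readers who want to repeat the comparison, the timing pattern used in the test code can be wrapped in a small helper, and the reported durations turned into speedup ratios. This is only a sketch: `timed`, `prMs`, `masterMs`, and `speedups` are hypothetical names that do not appear in the PR, and the numbers are copied from the results above (milliseconds, since they come from `System.currentTimeMillis`). It can be pasted into `spark-shell` or a plain Scala REPL.

```scala
// Hypothetical helper (not part of the PR): evaluates a block once and
// returns its result together with the elapsed wall-clock time in ms.
def timed[T](block: => T): (T, Long) = {
  val start = System.nanoTime()
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1000000L
  (result, elapsedMs)
}

// Durations in ms copied from the comment, in the order
// (squaredError, huber).
val prMs = Seq(1598L, 2473L)
val masterMs = Seq(2060L, 5015L)

// Speedup of the PR over master for each loss: master time / PR time.
val speedups = masterMs.zip(prMs).map { case (m, p) => m.toDouble / p }
```

With a Spark session available, a fit could then be timed as `val (model1, ms1) = timed(lr1.fit(df))`. On the reported numbers the ratios come out at roughly 1.3x for `squaredError` and 2.0x for `huber`; these are single runs, so they are best read as indicative rather than as a rigorous benchmark.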
