GitHub user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5767#discussion_r29374063
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
    @@ -110,59 +110,70 @@ class LinearRegression extends Regressor[Vector, LinearRegression, LinearRegress
         val yMean = statCounter.mean
         val yStd = math.sqrt(statCounter.variance)
     
    -    val featuresMean = summarizer.mean.toArray
    -    val featuresStd = summarizer.variance.toArray.map(math.sqrt)
    -
    -    // Since we implicitly do the feature scaling when we compute the cost function
    -    // to improve the convergence, the effective regParam will be changed.
    -    val effectiveRegParam = paramMap(regParam) / yStd
    -    val effectiveL1RegParam = paramMap(elasticNetParam) * effectiveRegParam
    -    val effectiveL2RegParam = (1.0 - paramMap(elasticNetParam)) * effectiveRegParam
    -
    -    val costFun = new LeastSquaresCostFun(instances, yStd, yMean,
    -      featuresStd, featuresMean, effectiveL2RegParam)
    -
    -    val optimizer = if (paramMap(elasticNetParam) == 0.0 || effectiveRegParam == 0.0) {
    -      new BreezeLBFGS[BDV[Double]](paramMap(maxIter), 10, paramMap(tol))
    -    } else {
    -      new BreezeOWLQN[Int, BDV[Double]](paramMap(maxIter), 10, effectiveL1RegParam, paramMap(tol))
    -    }
    +    // If the yStd is zero, then the intercept is yMean with zero weights;
    +    // as a result, training is not needed.
    +    val model = if (yStd != 0.0) {
    --- End diff --
    
    Is it really rare? If `yStd` is `0.0`, then the optimal model would be
    empty with intercept `yMean`. In this case, a warning would be proper.
    Having this giant `if ... else` block makes the code hard to read;
    returning early keeps the main path flat:
    
    ~~~scala
    if (yStd == 0.0) {
      logWarning(...)
      if (handlePersistence) ...
      return new LinearRegressionModel(...)
    }
    
    // actual implementation
    ~~~
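
    For illustration only, here is a minimal, self-contained sketch of the
    same early-return shape. It is not the PR's code: `SimpleModel` and
    `EarlyReturnSketch.fit` are hypothetical names, a single-feature
    closed-form least-squares fit stands in for the real optimizer, and
    `Console.err.println` stands in for `logWarning`.

    ~~~scala
    // Hypothetical stand-in for illustration; not Spark's LinearRegressionModel.
    final case class SimpleModel(weight: Double, intercept: Double)

    object EarlyReturnSketch {
      // Fits y = weight * x + intercept on a single feature by least squares.
      def fit(xs: Array[Double], ys: Array[Double]): SimpleModel = {
        require(xs.length == ys.length && xs.nonEmpty,
          "need equally sized, non-empty inputs")
        val n = ys.length
        val yMean = ys.sum / n
        val yVar = ys.map(y => (y - yMean) * (y - yMean)).sum / n

        // Degenerate case first: a constant label needs no training, so
        // warn and return a zero weight with intercept yMean.
        if (yVar == 0.0) {
          Console.err.println(
            s"WARN: label is constant; returning zero weight with intercept $yMean")
          return SimpleModel(0.0, yMean)
        }

        // Main path, kept at the top indentation level.
        val xMean = xs.sum / n
        val sxx = xs.map(x => (x - xMean) * (x - xMean)).sum
        val sxy = xs.zip(ys).map { case (x, y) => (x - xMean) * (y - yMean) }.sum
        val weight = if (sxx == 0.0) 0.0 else sxy / sxx
        SimpleModel(weight, yMean - weight * xMean)
      }
    }
    ~~~

    Calling `EarlyReturnSketch.fit(Array(1.0, 2.0, 3.0), Array(3.0, 3.0, 3.0))`
    takes the early-return branch; any non-constant label falls through to
    the main path without an enclosing `if ... else`.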

