Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7245#discussion_r34087727
  
    --- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 ---
    @@ -186,39 +186,49 @@ class LogisticRegression(override val uid: String)
         val states = optimizer.iterations(new CachedDiffFunction(costFun),
           initialWeightsWithIntercept.toBreeze.toDenseVector)
     
    -    var state = states.next()
    -    val lossHistory = mutable.ArrayBuilder.make[Double]
    +    val (weights, intercept, lossHistory) = {
    +      /*
    +         Note that in Logistic Regression, the loss is the log-likelihood, 
which is invariant
    +         under feature standardization. As a result, the loss returned 
from the optimizer is
    +         the same as the one in the original space.
    +       */
    +      val arrayBuilder = mutable.ArrayBuilder.make[Double]
    +      var state: optimizer.State = null
    +      while (states.hasNext) {
    +        state = states.next()
    +        arrayBuilder += state.adjustedValue
    +      }
     
    -    while (states.hasNext) {
    -      lossHistory += state.value
    -      state = states.next()
    -    }
    -    lossHistory += state.value
    +      if (state == null) {
    +        val msg = s"${optimizer.getClass.getName} failed."
    +        logError(msg)
    +        throw new SparkException(msg)
    +      }
     
    -    // The weights are trained in the scaled space; we're converting them 
back to
    -    // the original space.
    -    val weightsWithIntercept = {
    +      /*
    +         The weights are trained in the scaled space; we're converting 
them back to
    +         the original space.
    +         Note that the intercept in scaled space and original space is the 
same;
    +         as a result, no scaling is needed.
    +       */
           val rawWeights = state.x.toArray.clone()
           var i = 0
    -      // Note that the intercept in scaled space and original space is the 
same;
    -      // as a result, no scaling is needed.
           while (i < numFeatures) {
             rawWeights(i) *= { if (featuresStd(i) != 0.0) 1.0 / featuresStd(i) 
else 0.0 }
             i += 1
           }
    -      Vectors.dense(rawWeights)
    +
    +      if ($(fitIntercept)) {
    +        (Vectors.dense(rawWeights.slice(0, rawWeights.length - 
1)).compressed,
    +          rawWeights(rawWeights.length - 1), arrayBuilder.result())
    --- End diff --
    
    simpler: ```rawWeights.last```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to