DB Tsai created SPARK-5127:
------------------------------

             Summary: Fixed overflow when there are outliers in data in Logistic Regression
                 Key: SPARK-5127
                 URL: https://issues.apache.org/jira/browse/SPARK-5127
             Project: Spark
          Issue Type: Bug
          Components: MLlib
            Reporter: DB Tsai


The gradient multiplier in logistic regression is computed as

    gradientMultiplier = (1.0 / (1.0 + math.exp(margin))) - label

However, the first term of gradientMultiplier can overflow when samples lie 
far from the hyperplane, which happens when the data contains outliers. We 
therefore use an equivalent but more numerically stable formula:

    val gradientMultiplier =
      if (margin > 0.0) {
        val temp = math.exp(-margin)
        temp / (1.0 + temp) - label
      } else {
        1.0 / (1.0 + math.exp(margin)) - label
      }
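
The equivalence follows from multiplying the numerator and denominator of
1.0 / (1.0 + exp(margin)) by exp(-margin), which gives
exp(-margin) / (1.0 + exp(-margin)). A minimal sketch comparing the two forms
is below; the object name StableGradientMultiplier and the helper names naive
and stable are hypothetical and chosen only for illustration, and the margin
of 1000.0 is an assumed stand-in for an outlier.

```scala
object StableGradientMultiplier {
  // Direct form quoted in this issue (hypothetical helper name).
  def naive(margin: Double, label: Double): Double =
    1.0 / (1.0 + math.exp(margin)) - label

  // Rewritten form from this issue: math.exp is only ever called
  // on a non-positive argument, so it cannot overflow.
  def stable(margin: Double, label: Double): Double =
    if (margin > 0.0) {
      val temp = math.exp(-margin)
      temp / (1.0 + temp) - label
    } else {
      1.0 / (1.0 + math.exp(margin)) - label
    }

  def main(args: Array[String]): Unit = {
    // For moderate margins the two forms agree to rounding error.
    for (m <- Seq(-30.0, -1.0, 0.0, 1.0, 30.0)) {
      assert(math.abs(naive(m, 1.0) - stable(m, 1.0)) < 1e-12)
    }

    // Assumed outlier margin: the intermediate exp in the direct form
    // overflows to Infinity, while the rewritten form only sees exp(-1000.0).
    assert(math.exp(1000.0).isInfinite)
    assert(!math.exp(-1000.0).isInfinite)
    println(stable(1000.0, 0.0))
  }
}
```

The branch on the sign of margin keeps the argument of math.exp at or below
zero in both cases, so the intermediate value stays within (0.0, 1.0].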



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
