Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/11247#issuecomment-188428112
@yanboliang I share the same concern with you. However, users may set
`standardization = false` but still want good convergence when the feature
scales are quite different. For example, you can verify that if you scale one
column by 100x, the corresponding coefficient should shrink by 100x; this
property cannot be achieved without this trick. R's GLMNET applies the same
trick to guarantee this property. Although it may sound confusing, the trick
is transparent to users, so I think it's still okay.
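The scale-invariance property described above can be sketched outside of Spark. This is a hypothetical numpy illustration using ordinary least squares rather than the actual `LogisticRegression` code, but the same invariance is what the standardization trick preserves for an unregularized linear model:

```python
import numpy as np

# Hypothetical illustration (not Spark code): in an unregularized linear
# model, scaling a feature column by 100x shrinks its fitted coefficient
# by 100x, leaving predictions unchanged.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Fit on the raw features.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Scale the first column by 100x and refit.
X_scaled = X.copy()
X_scaled[:, 0] *= 100.0
coef_scaled, *_ = np.linalg.lstsq(X_scaled, y, rcond=None)

# The first coefficient shrinks by exactly 100x; the second is unchanged.
print(coef[0] / coef_scaled[0])
```

With regularization in the picture, this invariance only holds if the penalty is applied in the standardized space, which is exactly what the trick does even when `standardization = false`.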
As for your second point, in `LogisticRegressionWithLBFGS`, the solution with
`standardization = false` and `regParam = 0.0` will be identical to the one
with `standardization = true` and `regParam = 0.0`, so users still get the
correct answer. The breaking change in `LogisticRegressionWithLBFGS` mainly
addresses the issue of regularizing the intercept, and
https://github.com/apache/spark/pull/10788/files#diff-c78e117e05337bd8f7151ddf9450047dL402
is just a side effect of handling standardization better, which improves
convergence for problems with different feature scales when
`standardization = false`.
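The claim that the unregularized solutions coincide can also be sketched with a small numpy example. This is a hypothetical illustration, not the Spark implementation: it fits unregularized logistic regression by Newton's method (which converges tightly regardless of scaling) once on raw features and once on standardized features, then maps the standardized coefficients back:

```python
import numpy as np

def fit_logreg(X, y, iters=25):
    """Unregularized logistic regression via Newton's method (illustration only)."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (y - p)                 # gradient of log-likelihood
        W = p * (1.0 - p)                    # IRLS weights
        H = X.T @ (X * W[:, None])           # Hessian
        w += np.linalg.solve(H, grad)
    return w

# Features with very different scales (second column ~50x larger).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2)) * np.array([1.0, 50.0])
z = X[:, 0] + 0.02 * X[:, 1]
y = (rng.random(300) < 1.0 / (1.0 + np.exp(-z))).astype(float)

# Fit on raw features.
w_raw = fit_logreg(X, y)

# Fit on standardized features (scale only, no centering), then map back.
sigma = X.std(axis=0)
w_back = fit_logreg(X / sigma, y) / sigma

print(np.allclose(w_raw, w_back, rtol=1e-6))
```

With `regParam = 0.0` the penalty term vanishes, so standardization is just a change of variables and both fits minimize the same loss; the coefficients agree after mapping back. The practical difference Spark cares about is that a first-order solver like L-BFGS converges much faster in the standardized space.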