ykerzhner commented on a change in pull request #31693:
URL: https://github.com/apache/spark/pull/31693#discussion_r587041625
##########
File path:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
##########
@@ -971,6 +975,10 @@ class LogisticRegression @Since("1.2.0") (
}
if (fitWithMean) {
+ // orginal initialCoefWithInterceptArray is for problem:
Review comment:
As I mentioned in my review, I think adjusting the starting intercept
here is a bad idea. Look at the current code that creates the initial
intercept:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala#L893-L907
It sets the intercept to the log(odds) of the labels. That is already the
optimal starting intercept for centered data (and not a great starting
point for non-centered data), so making the adjustment here is actually
counterproductive.
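
To make the argument concrete, here is a minimal sketch with toy data. It is
not Spark's code: `InitialInterceptSketch`, its helpers, and the data are all
made up for illustration, and it uses a single feature. It checks two things:
with all coefficients at zero, the intercept gradient of the mean logistic
loss vanishes at log(odds); and once a coefficient moves away from zero (0.5
here, an arbitrary choice), that same intercept stays close to stationary on
centered features but is far off on the raw, non-centered ones.

```scala
object InitialInterceptSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // log(odds) of the positive class, computed from 0/1 labels.
  def logOdds(ys: Seq[Double]): Double = {
    val p = ys.sum / ys.size
    math.log(p / (1.0 - p))
  }

  // Derivative of the mean logistic loss with respect to the intercept b,
  // for a single feature with coefficient w.
  def interceptGradient(
      xs: Seq[Double], ys: Seq[Double], w: Double, b: Double): Double =
    xs.zip(ys).map { case (x, y) => sigmoid(w * x + b) - y }.sum / xs.size

  def main(args: Array[String]): Unit = {
    // Toy data (assumption): one feature that is far from zero-mean.
    val raw = Seq(10.0, 11.0, 12.0, 13.0, 14.0, 15.0)
    val ys = Seq(0.0, 0.0, 1.0, 0.0, 0.0, 1.0)
    val mean = raw.sum / raw.size
    val centered = raw.map(_ - mean)

    val b0 = logOdds(ys)

    // Zero coefficients: the features drop out, so b0 is stationary.
    println(s"w = 0.0          : ${interceptGradient(centered, ys, 0.0, b0)}")

    // Nonzero coefficient: compare centered and raw features at the same b0.
    println(s"w = 0.5, centered: ${interceptGradient(centered, ys, 0.5, b0)}")
    println(s"w = 0.5, raw     : ${interceptGradient(raw, ys, 0.5, b0)}")
  }
}
```

On centered features the optimizer barely has to move the intercept while it
fits the coefficients, which is why shifting the starting value away from
log(odds) looks like a step in the wrong direction to me.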