Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/10940#issuecomment-190553302
  
    @coderxiang @dbtsai Sorry for the late response! I actually thought this PR 
already got merged ... Anyway, I tested `glmnet` and found that `glmnet` 
outputs zero coefficients for constant columns regardless of intercept, 
regularization, and standardization settings. I thought about it today and I 
feel it actually makes sense. If we have a constant column in our training 
data, do we expect it to change or stay constant in test data? If its value 
might change, we should set its coefficient to zero because we cannot estimate 
how big the change would be. If its value stays constant (or maybe users 
created this column to add bias manually), it shouldn't be regularized and 
users should really turn on `fitIntercept` instead. So my suggestion is to 
follow glmnet and set the coefficients of constant columns to zero regardless 
of other settings. If there are constant columns and `fitIntercept` is false, 
we should output a warning message. Does that sound good to you?
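
A rough sketch of the proposed behavior, in plain Python rather than the 
actual Spark code. The helper names `constant_column_indices` and 
`zero_constant_coefficients` are hypothetical and exist only for 
illustration; the sketch assumes the coefficients have already been fitted 
and only shows the post-processing step (zero out constant columns, warn when 
`fitIntercept` is false).

```python
import warnings


def constant_column_indices(rows):
    """Return indices of columns whose value is identical in every row."""
    if not rows:
        return []
    first = rows[0]
    return [j for j in range(len(first))
            if all(row[j] == first[j] for row in rows)]


def zero_constant_coefficients(rows, coefficients, fit_intercept):
    """Return a copy of coefficients with constant columns forced to zero.

    Hypothetical helper: warns when constant columns are present and no
    intercept is being fitted, as suggested in the comment above.
    """
    constant = constant_column_indices(rows)
    if constant and not fit_intercept:
        warnings.warn(
            "Constant columns %s found while fitIntercept is false; "
            "their coefficients are set to zero." % constant)
    adjusted = list(coefficients)
    for j in constant:
        adjusted[j] = 0.0
    return adjusted


# Column 0 is constant (always 1.0), so its coefficient is dropped.
rows = [[1.0, 2.0, 5.0], [1.0, 3.0, 7.0], [1.0, 4.0, 6.0]]
result = zero_constant_coefficients(rows, [0.8, 1.5, -0.3], fit_intercept=True)
# result == [0.0, 1.5, -0.3]
```

With `fit_intercept=False` the same call returns the same coefficients but 
also emits the warning, so users who added a manual bias column are told to 
turn on `fitIntercept` instead.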

