Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/5055#issuecomment-89003631
  
    @tanyinyan  "Optimization behavior" means convergence rate, yes.
    
    If we scale features internally and also adjust the regularization, then:
    * The optimal solution will not change.  (I agree with you on this.)
    * The optimization behavior will change.  This is because we use a single 
step size for all features.
      * E.g., suppose we have 2 features a and b, where norm(column b) = 1000 * 
norm(column a).  The step size needs to be chosen based on the norms of the 
feature columns; since column b has a very large norm, we have to use a very 
small step size.  That means we make progress very slowly, especially if a is 
the useful feature.  (See the sketch after this list.)
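    
    Here is a minimal sketch of that effect, in plain Scala with made-up 
least-squares data (this is not MLlib code; the data, the objective, and the 
1 / norm(column b)^2 step choice are just illustrative assumptions):
    
    ```scala
    object SingleStepSizeDemo {
      def main(args: Array[String]): Unit = {
        val n = 200
        // Feature a is the "useful" one; column b has ~1000x larger norm.
        val a = Array.tabulate(n)(i => math.sin(i.toDouble))
        val b = Array.tabulate(n)(i => 1000.0 * math.cos(i.toDouble))
        val y = Array.tabulate(n)(i => 2.0 * a(i))   // target depends only on a
    
        // A single shared step size must be small enough for the large-norm
        // column, roughly 1 / norm(column b)^2 for plain least squares.
        val step = 1.0 / b.map(x => x * x).sum
    
        var (wa, wb) = (0.0, 0.0)
        for (iter <- 1 to 1000) {
          var (ga, gb) = (0.0, 0.0)
          for (i <- 0 until n) {
            val err = wa * a(i) + wb * b(i) - y(i)
            ga += err * a(i)
            gb += err * b(i)
          }
          wa -= step * ga
          wb -= step * gb
        }
        // wa has barely moved toward its optimum (about 2.0), because the
        // shared step is sized for column b.
        println(s"wa = $wa, wb = $wb")
      }
    }
    ```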
    
    Does that make sense?
    
    We could actually adjust the step size for each feature, rather than 
scaling the data.  Come to think of it, that might be a more efficient solution 
since it's cheaper than creating a new copy of the data.  I'll make a JIRA for 
that since it belongs in another PR.
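    
    For reference, a sketch of the per-feature step-size idea on the same 
made-up data (again just an illustration, not the actual change; in general 
per-coordinate steps like these may need an extra global damping factor, but 
here each coordinate's step is 1 / norm(that column)^2):
    
    ```scala
    object PerFeatureStepDemo {
      def main(args: Array[String]): Unit = {
        val n = 200
        val a = Array.tabulate(n)(i => math.sin(i.toDouble))
        val b = Array.tabulate(n)(i => 1000.0 * math.cos(i.toDouble))
        val y = Array.tabulate(n)(i => 2.0 * a(i))
    
        // Per-feature steps: each coordinate is scaled by its own column norm,
        // which mimics scaling the data without making a copy of it.
        val stepA = 1.0 / a.map(x => x * x).sum
        val stepB = 1.0 / b.map(x => x * x).sum
    
        var (wa, wb) = (0.0, 0.0)
        for (iter <- 1 to 20) {
          var (ga, gb) = (0.0, 0.0)
          for (i <- 0 until n) {
            val err = wa * a(i) + wb * b(i) - y(i)
            ga += err * a(i)
            gb += err * b(i)
          }
          wa -= stepA * ga
          wb -= stepB * gb
        }
        // wa is already close to 2.0 after just 20 iterations, despite the
        // huge norm of column b.
        println(s"wa = $wa, wb = $wb")
      }
    }
    ```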

