GitHub user dbtsai opened a pull request:

    https://github.com/apache/spark/pull/7080

    [SPARK-8700][ML] Disable feature scaling in Logistic Regression

    All compressed sensing applications, and some regression use cases, 
get better results with feature scaling turned off. However, if we implement 
this naively by training on the dataset without any standardization, the rate 
of convergence will be poor. Instead, we can still standardize the training 
dataset but penalize each component differently, which gives effectively the 
same objective function but a better-conditioned numerical problem. As a 
result, columns with high variance are penalized less, and vice versa. 
Without this, all features are standardized and are therefore penalized 
equally.
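
A minimal sketch of the reweighting idea, not the code in this pull request. It assumes an L2 penalty, scaling by the sample standard deviation without centering, and a tiny made-up dataset; all function and variable names here are hypothetical. It checks that the L2-regularized logistic objective on the raw features equals the objective on the scaled features once each coefficient's penalty is divided by the square of its column's standard deviation.

```python
import math

def sample_std(column):
    # Sample standard deviation (n - 1 in the denominator).
    mean = sum(column) / len(column)
    return math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))

def log_loss(rows, labels, weights):
    # Mean logistic loss for labels in {0, 1}, no intercept.
    total = 0.0
    for x, y in zip(rows, labels):
        margin = sum(w * v for w, v in zip(weights, x))
        z = margin if y == 1 else -margin
        total += math.log1p(math.exp(-z))
    return total / len(rows)

def objective_original(rows, labels, w, reg):
    # Unscaled features, uniform L2 penalty on the original coefficients.
    return log_loss(rows, labels, w) + 0.5 * reg * sum(wj * wj for wj in w)

def objective_scaled(rows, labels, w_scaled, sigmas, reg):
    # Scaled features x_j / sigma_j, with a per-component penalty
    # reg / sigma_j^2 applied to the scaled coefficients.
    scaled_rows = [[v / s for v, s in zip(x, sigmas)] for x in rows]
    penalty = 0.5 * reg * sum((wj / s) ** 2 for wj, s in zip(w_scaled, sigmas))
    return log_loss(scaled_rows, labels, w_scaled) + penalty

# Made-up data: the second column has a much larger variance than the first.
rows = [[1.0, 200.0], [2.0, -100.0], [3.0, 400.0], [4.0, 50.0]]
labels = [1, 0, 1, 0]
reg = 0.1

sigmas = [sample_std([r[j] for r in rows]) for j in range(2)]
w = [0.3, -0.002]                                # coefficients on the original scale
w_scaled = [wj * s for wj, s in zip(w, sigmas)]  # same model on the scaled features

obj_original = objective_original(rows, labels, w, reg)
obj_scaled = objective_scaled(rows, labels, w_scaled, sigmas, reg)
penalty_weights = [reg / (s * s) for s in sigmas]

print(obj_original, obj_scaled)
print(penalty_weights)
```

The two objective values agree up to floating-point rounding, and the high-variance column gets the smaller penalty weight, which is the behavior described above. An optimizer can then work in the scaled space, where the problem is better conditioned, and map the coefficients back by dividing by each column's standard deviation.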
    
    In R's glmnet package, there is an option for this:
    `standardize`
    Logical flag for x variable standardization, prior to fitting the model 
sequence. The coefficients are always returned on the original scale. Default 
is standardize=TRUE. If variables are in the same units already, you might not 
wish to standardize. See details below for y standardization with 
family="gaussian".
    
    +cc @holdenk @mengxr @jkbradley 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dbtsai/spark lors

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7080.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7080
    
----
commit 588c75f714372b6da4dd20fa7d006afe399fa8e2
Author: DB Tsai <[email protected]>
Date:   2015-06-24T01:06:03Z

    first commit

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
