[ https://issues.apache.org/jira/browse/SPARK-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DB Tsai reassigned SPARK-8522:
------------------------------

    Assignee: DB Tsai  (was: holdenk)

> Disable feature scaling in Linear and Logistic Regression
> ---------------------------------------------------------
>
>                 Key: SPARK-8522
>                 URL: https://issues.apache.org/jira/browse/SPARK-8522
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: DB Tsai
>            Assignee: DB Tsai
>             Fix For: 1.5.0
>
>
> All compressed sensing applications, and some regression use cases, give 
> better results with feature scaling turned off. However, implementing this 
> naively by training on the dataset without any standardization leads to a 
> poor rate of convergence. Instead, we can still standardize the training 
> dataset but penalize each component differently, obtaining effectively the 
> same objective function as the unstandardized problem while keeping a 
> better-conditioned numerical problem. As a result, columns with high 
> variance are penalized less, and vice versa; without this adjustment, all 
> features are standardized and therefore penalized equally. (A sketch of the 
> idea follows the quoted description below.)
> In R, there is an option for this:
> `standardize`
> Logical flag for x variable standardization, prior to fitting the model 
> sequence. The coefficients are always returned on the original scale. Default 
> is standardize=TRUE. If variables are in the same units already, you might 
> not wish to standardize. See details below for y standardization with 
> family="gaussian".
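
A minimal Scala sketch of the trick described above, on a toy squared-loss 
problem with L2 regularization (illustration only, not Spark's actual 
implementation; all names below are invented for the sketch). Training runs 
on the standardized features, but feature j is penalized by lambda / sigma_j^2 
instead of a uniform lambda, so the objective is effectively the same as the 
unstandardized problem while the optimization stays well conditioned:

object PerFeaturePenaltySketch {
  def main(args: Array[String]): Unit = {
    // Toy data: two features on very different scales.
    val xs = Array(Array(1.0, 100.0), Array(2.0, 300.0), Array(3.0, 200.0), Array(4.0, 400.0))
    val ys = Array(1.0, 2.0, 2.0, 3.0)
    val lambda = 0.1
    val n = xs.length
    val d = xs.head.length

    // Column standard deviations.
    val sigma = Array.tabulate(d) { j =>
      val col = xs.map(_(j))
      val mean = col.sum / n
      math.sqrt(col.map(v => (v - mean) * (v - mean)).sum / n)
    }

    // Standardized design matrix: xHat(i)(j) = xs(i)(j) / sigma(j).
    val xHat = xs.map(row => Array.tabulate(d)(j => row(j) / sigma(j)))

    // Gradient descent on
    //   (1/n) * sum_i (wHat . xHat_i - y_i)^2  +  sum_j (lambda / sigma_j^2) * wHat_j^2,
    // which equals the unstandardized objective with a uniform penalty lambda,
    // since w_j = wHat_j / sigma_j. High-variance columns get a smaller penalty.
    val wHat = Array.fill(d)(0.0)
    val step = 0.01
    for (_ <- 0 until 5000) {
      val grad = Array.fill(d)(0.0)
      for (i <- 0 until n) {
        val err = (0 until d).map(j => wHat(j) * xHat(i)(j)).sum - ys(i)
        for (j <- 0 until d) grad(j) += 2.0 * err * xHat(i)(j) / n
      }
      for (j <- 0 until d) grad(j) += 2.0 * (lambda / (sigma(j) * sigma(j))) * wHat(j)
      for (j <- 0 until d) wHat(j) -= step * grad(j)
    }

    // Report coefficients back on the original (unstandardized) feature scale.
    val w = Array.tabulate(d)(j => wHat(j) / sigma(j))
    println(s"coefficients on the original scale: ${w.mkString(", ")}")
  }
}

(If memory serves, the parameter that eventually shipped in ML's 
LinearRegression/LogisticRegression for this is called standardization, e.g. 
setStandardization(false); that name is not stated in this ticket, so treat 
it as an assumption here.)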



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
