[
https://issues.apache.org/jira/browse/SPARK-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
DB Tsai resolved SPARK-8522.
----------------------------
Resolution: Fixed
Fix Version/s: 1.5.0
Issue resolved by pull request 7024
[https://github.com/apache/spark/pull/7024]
> Disable feature scaling in Linear and Logistic Regression
> ---------------------------------------------------------
>
> Key: SPARK-8522
> URL: https://issues.apache.org/jira/browse/SPARK-8522
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Reporter: DB Tsai
> Assignee: holdenk
> Fix For: 1.5.0
>
>
> All compressed sensing applications, and some regression use cases,
> produce better results with feature scaling turned off. However, a naive
> implementation that trains on the dataset without any standardization
> will have a poor rate of convergence. Instead, we can still standardize
> the training dataset but penalize each component differently, which
> yields effectively the same objective function as the unstandardized
> problem while keeping the numerical problem well-conditioned. As a
> result, columns with high variance are penalized less, and vice versa.
> Without this, all features are standardized and therefore penalized
> equally.
> In R's glmnet, there is an option for this:
> `standardize`
> Logical flag for x variable standardization, prior to fitting the model
> sequence. The coefficients are always returned on the original scale. Default
> is standardize=TRUE. If variables are in the same units already, you might
> not wish to standardize. See details below for y standardization with
> family="gaussian".
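As a minimal sketch of the trick described above (hypothetical, plain NumPy, ridge regression with no intercept): standardize the columns, but divide feature j's penalty by sigma_j**2 so high-variance columns are penalized less. Mapping the coefficients back to the original scale then reproduces the unstandardized solution exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Features on very different scales.
X = rng.normal(size=(100, 3)) * np.array([1.0, 5.0, 0.2])
y = X @ np.array([2.0, -1.0, 3.0]) + rng.normal(size=100)
lam = 0.5

# Naive approach: solve the ridge problem on the raw, unstandardized data.
w_raw = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Better-conditioned approach: standardize the columns, but penalize
# component j by lam / sigma_j**2 instead of a uniform lam.
sigma = X.std(axis=0)
Xs = X / sigma
D = np.diag(lam / sigma**2)
w_std = np.linalg.solve(Xs.T @ Xs + D, Xs.T @ y)

# Return coefficients on the original scale, as glmnet does.
w_back = w_std / sigma
print(np.allclose(w_raw, w_back))  # the two objectives coincide
```

Algebraically, with S = diag(sigma), the standardized normal equations (S⁻¹XᵀXS⁻¹ + λS⁻²)Sw = S⁻¹Xᵀy reduce to (XᵀX + λI)w = Xᵀy, so only the conditioning changes, not the solution.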
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]