GitHub user dbtsai opened a pull request:
https://github.com/apache/spark/pull/7080
[SPARK-8700][ML] Disable feature scaling in Logistic Regression
All compressed sensing applications, and some regression use cases, get
better results when feature scaling is turned off. However, if we implement
this naively by training on the dataset without any standardization, the
rate of convergence will be poor. Instead, we can still standardize the
training dataset but penalize each component differently, which gives
effectively the same objective function but a better-conditioned numerical
problem. As a result, columns with high variance are penalized less, and
vice versa. Without this, all features are standardized and are therefore
penalized equally.
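A minimal sketch of why the two formulations agree, in plain Python rather than the actual Spark code. It assumes an L2 penalty, no intercept, and scaling by the column standard deviation without centering. All function names and the toy data are hypothetical. If w is the coefficient vector on the original features, then beta_j = w_j * sigma_j is the coefficient on the scaled feature x_j / sigma_j, and penalizing (beta_j / sigma_j)^2 reproduces the original penalty w_j^2.

```python
import math


def logistic_loss(rows, labels, coefs):
    """Mean logistic loss for labels in {0, 1}, without an intercept."""
    total = 0.0
    for x, y in zip(rows, labels):
        margin = sum(c * v for c, v in zip(coefs, x))
        # log(1 + exp(-margin)) when y == 1, log(1 + exp(margin)) when y == 0
        z = -margin if y == 1 else margin
        total += math.log1p(math.exp(z))
    return total / len(rows)


def column_std(rows):
    """Sample standard deviation of each column."""
    n = len(rows)
    stds = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1)))
    return stds


def objective_original(rows, labels, w, reg):
    """L2-regularized objective on the raw, unscaled features."""
    return logistic_loss(rows, labels, w) + 0.5 * reg * sum(c * c for c in w)


def objective_scaled(rows, labels, beta, stds, reg):
    """Same objective on scaled features, with a per-column penalty weight.

    Column j is penalized with weight 1 / stds[j]**2, so a high-variance
    column is penalized less in the scaled space.
    """
    scaled_rows = [[v / s for v, s in zip(r, stds)] for r in rows]
    penalty = 0.5 * reg * sum((b / s) ** 2 for b, s in zip(beta, stds))
    return logistic_loss(scaled_rows, labels, beta) + penalty


# Toy data (made up for illustration): the second column has a much larger scale.
rows = [[1.0, 200.0], [2.0, 100.0], [3.0, 400.0], [4.0, 300.0]]
labels = [0, 0, 1, 1]
w = [0.5, -0.01]

stds = column_std(rows)
beta = [c * s for c, s in zip(w, stds)]  # coefficients in the scaled space

value_original = objective_original(rows, labels, w, reg=0.1)
value_scaled = objective_scaled(rows, labels, beta, stds, reg=0.1)
penalty_weights = [1.0 / (s * s) for s in stds]
```

The two objective values match up to floating-point rounding, so an optimizer working in the scaled space minimizes the same function while seeing better-conditioned inputs. The per-column weights show the effect described above: the high-variance second column gets the smaller penalty weight.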
In R, the glmnet package has an option for this:
`standardize`
Logical flag for x variable standardization, prior to fitting the model
sequence. The coefficients are always returned on the original scale. Default
is standardize=TRUE. If variables are in the same units already, you might not
wish to standardize. See details below for y standardization with
family="gaussian".
+cc @holdenk @mengxr @jkbradley
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dbtsai/spark lors
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7080.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7080
----
commit 588c75f714372b6da4dd20fa7d006afe399fa8e2
Author: DB Tsai <[email protected]>
Date: 2015-06-24T01:06:03Z
first commit
----