[jira] [Resolved] (SPARK-22555) Possibly incorrect scaling of L2 regularization strength in LinearRegression

Hyukjin Kwon (JIRA) Mon, 20 May 2019 21:41:49 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Hyukjin Kwon resolved SPARK-22555.
----------------------------------
    Resolution: Incomplete

> Possibly incorrect scaling of L2 regularization strength in LinearRegression
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-22555
>                 URL: https://issues.apache.org/jira/browse/SPARK-22555
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Andrew Crosby
>            Priority: Minor
>              Labels: bulk-closed
>
> According to the Spark documentation, the linear regression estimator 
> minimizes the regularized sum of squares:
> 1/N Sum(y - w x)^2^ + λ( (1-α) |w|~2~ + α |w|~1~ )
> Under the hood, in order to improve convergence, the optimization algorithms 
> actually work in scaled space using the variables y' = y / σ ~y~, x' = x / σ 
> ~x~ and w' = w σ ~x~ / σ ~y~ (so that y - w x = σ ~y~ (y' - w' x')). In terms 
> of these scaled variables, the above expression becomes:
> σ ~y~^2^ ( 1/N  Sum(y' - w' x')^2^ + λ( (1-α) / σ ~x~^2^ |w'|~2~ + α / (σ ~x~ 
> σ ~y~) |w'|~1~ ) )
> The solution in scaled space is equivalent to the original problem, provided 
> that the regularization strengths are suitably adjusted. The effective L1 
> regularization strength should be λ α / (σ ~x~ σ ~y~) and the effective L2 
> regularization strength should be λ (1-α) / σ ~x~^2^.
> However, this doesn't quite match the regularization strengths that are 
> actually used. While the factors of σ ~x~ are correctly included (or 
> correctly omitted if the standardization parameter is set), it appears that 
> the 1 / σ ~y~ scaling is applied to both the L1 and L2 regularization 
> parameters instead of just to the L1 regularization parameter. Both 
> LinearRegression.scala and WeightedLeastSquares.scala contain code along the 
> following lines:
> {code}
> val effectiveRegParam = $(regParam) / yStd
> val effectiveL1RegParam = $(elasticNetParam) * effectiveRegParam
> val effectiveL2RegParam = (1.0 - $(elasticNetParam)) * effectiveRegParam
> {code}
> Admittedly, the unit tests confirm that the current behaviour matches that of 
> R's glmnet; it just doesn't seem to match the behaviour claimed in the 
> documentation.
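
The scaling argument in the description can be checked numerically. The following is a minimal single-feature sketch, not Spark code: the object name, the data, and the helper functions are all made up for illustration, and the L2 penalty is taken to be the squared weight, which is what the 1 / σ_x^2 factor in the description implies. The "asCoded" value assumes the 1 / yStd factor from the quoted snippet is combined with the σ_x factors exactly as the report describes, which has not been verified against the Spark source here.

```scala
// Numeric check of the scaled-space identity for a single feature.
// All names are illustrative; none of this is taken from Spark.
object ScaledObjectiveCheck {
  // Population standard deviation. The identity holds for any positive
  // scale factors, so the choice of estimator does not matter here.
  def std(v: Seq[Double]): Double = {
    val mean = v.sum / v.size
    math.sqrt(v.map(a => (a - mean) * (a - mean)).sum / v.size)
  }

  // 1/N Sum(y - w x)^2 + l2 * w^2 + l1 * |w|
  def objective(x: Seq[Double], y: Seq[Double], w: Double,
                l1: Double, l2: Double): Double = {
    val mse = x.zip(y).map { case (xi, yi) =>
      val r = yi - w * xi
      r * r
    }.sum / x.size
    mse + l2 * w * w + l1 * math.abs(w)
  }

  // Made-up data and parameters (lambda = regParam, alpha = elasticNetParam).
  val x: Seq[Double] = Seq(1.0, 2.0, 4.0, 7.0)
  val y: Seq[Double] = Seq(3.0, 5.0, 8.0, 15.0)
  val lambda = 0.3
  val alpha = 0.4
  val w = 1.7

  val sx: Double = std(x)
  val sy: Double = std(y)

  // Objective in the original space, as stated in the documentation.
  val original: Double =
    objective(x, y, w, l1 = lambda * alpha, l2 = lambda * (1 - alpha))

  // Scaled variables: y' = y / sy, x' = x / sx, w' = w * sx / sy.
  val xs: Seq[Double] = x.map(_ / sx)
  val ys: Seq[Double] = y.map(_ / sy)
  val ws: Double = w * sx / sy

  // Effective strengths derived in the description.
  val derived: Double = sy * sy * objective(xs, ys, ws,
    l1 = lambda * alpha / (sx * sy),
    l2 = lambda * (1 - alpha) / (sx * sx))

  // Assumption: 1 / sy applied to both penalties, as the report claims.
  val asCoded: Double = sy * sy * objective(xs, ys, ws,
    l1 = lambda * alpha / (sx * sy),
    l2 = lambda * (1 - alpha) / (sx * sx * sy))

  def main(args: Array[String]): Unit = {
    println(s"original = $original")
    println(s"derived  = $derived")
    println(s"asCoded  = $asCoded")
  }
}
```

Under these assumptions, "derived" agrees with "original" up to floating-point error, while "asCoded" differs whenever σ_y is not 1, because its L2 term is smaller by a factor of 1 / σ_y. This only illustrates the discrepancy the report describes; it does not show which of the two behaviours is intended.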



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

