[
https://issues.apache.org/jira/browse/SPARK-26173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Facundo Bellosi updated SPARK-26173:
------------------------------------
Description:
This feature enables Maximum A Posteriori (MAP) optimization for Logistic
Regression based on a Gaussian prior. In practice, it implements a more
general form of L2 regularization, parameterized by a vector of prior means
and a vector of prior precisions (inverse variances), one entry per coefficient.
Prior regularization is calculated through the following formula:
!Prior regularization.png!
where:
* λ: regularization parameter ({{regParam}})
* K: number of coefficients (weights vector length)
* w~i~ ~ Normal(μ~i~, β~i~^2^)
_Reference: Bishop, Christopher M. (2006). Pattern Recognition and Machine
Learning (section 4.5). Berlin, Heidelberg: Springer-Verlag._
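The attached formula image is not reproduced in this message, so the following is only a minimal sketch of what such a term could look like. It assumes the standard Gaussian-prior penalty (λ/2) · Σ p~i~ (w~i~ - μ~i~)^2^, where p~i~ is the precision of coefficient i. The function name {{prior_regularization}} and its argument names are hypothetical and are not taken from the patch.

```python
from typing import List, Sequence, Tuple


def prior_regularization(
    weights: Sequence[float],
    prior_mean: Sequence[float],
    prior_precisions: Sequence[float],
    reg_param: float,
) -> Tuple[float, List[float]]:
    """Return (value, gradient) of an assumed Gaussian-prior penalty.

    Assumed form (not confirmed by the issue text):
        value       = (reg_param / 2) * sum_i p_i * (w_i - mu_i)^2
        gradient_i  = reg_param * p_i * (w_i - mu_i)
    """
    if not (len(weights) == len(prior_mean) == len(prior_precisions)):
        raise ValueError("weights, prior_mean and prior_precisions must have equal length")

    value = 0.0
    gradient: List[float] = []
    for w, mu, p in zip(weights, prior_mean, prior_precisions):
        diff = w - mu
        value += 0.5 * reg_param * p * diff * diff
        gradient.append(reg_param * p * diff)
    return value, gradient


if __name__ == "__main__":
    # Zero means and unit precisions: the penalty is (reg_param / 2) * sum(w_i^2).
    value, gradient = prior_regularization([1.0, -2.0], [0.0, 0.0], [1.0, 1.0], 0.5)
    print(value, gradient)
```

Under this assumed form, a zero mean vector and unit precisions reduce the penalty to the plain L2 term (λ/2) · Σ w~i~^2^, which is the kind of equivalence the "equivalent to L2" tests appear to check.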
h2. Implementation
* 2 new parameters added to {{LogisticRegression}}: {{priorMean}} and
{{priorPrecisions}}.
* 1 new class ({{PriorRegularization}}) computes the value and gradient of the
prior regularization term.
* Prior regularization is enabled when both vectors are provided,
{{regParam}} > 0, and {{elasticNetParam}} < 1.
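The enabling conditions above, together with the constraints suggested by the test names listed below, can be sketched as a single check. This is an illustration only: the function {{prior_regularization_enabled}} is hypothetical, and raising an error on an inconsistent configuration (rather than silently disabling the feature) is an assumption inferred from the test names, not something the issue states.

```python
from typing import Optional, Sequence


def prior_regularization_enabled(
    prior_mean: Optional[Sequence[float]],
    prior_precisions: Optional[Sequence[float]],
    reg_param: float,
    elastic_net_param: float,
    num_features: int,
) -> bool:
    """Return True if prior regularization applies; raise on inconsistent settings."""
    # Neither vector set: the feature is simply not in use.
    if prior_mean is None and prior_precisions is None:
        return False
    # Exactly one vector set: assumed to be a configuration error.
    if prior_mean is None or prior_precisions is None:
        raise ValueError("priorMean and priorPrecisions must be set together")
    if reg_param <= 0.0:
        raise ValueError("regParam must be positive when using prior regularization")
    if elastic_net_param >= 1.0:
        raise ValueError("elasticNetParam must be less than 1.0 when using prior regularization")
    if len(prior_mean) != len(prior_precisions):
        raise ValueError("priorMean and priorPrecisions must have equal length")
    if len(prior_mean) != num_features:
        raise ValueError("prior vectors must match the number of features")
    return True


if __name__ == "__main__":
    print(prior_regularization_enabled([0.0, 0.0], [1.0, 1.0], 0.1, 0.0, 2))
    print(prior_regularization_enabled(None, None, 0.1, 0.0, 2))
```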
h2. Tests
* {{DifferentiableRegularizationSuite}}
** {{Prior regularization}}
* {{LogisticRegressionSuite}}
** {{prior precisions should be required when prior mean is set}}
** {{prior mean should be required when prior precisions is set}}
** {{`regParam` should be positive when using prior regularization}}
** {{`elasticNetParam` should be less than 1.0 when using prior
regularization}}
** {{prior mean and precisions should have equal length}}
** {{priors' length should match number of features}}
** {{binary logistic regression with prior regularization equivalent to L2}}
** {{binary logistic regression with prior regularization equivalent to L2
(bis)}}
** {{binary logistic regression with prior regularization}}
was:
This feature enables Maximum A Posteriori (MAP) optimization for Logistic
Regression based on a Gaussian prior. In practice, this is just implementing a
more general form of L2 regularization parameterized by a (multivariate) mean
and precisions vectors.
_Reference: Bishop, Christopher M. (2006). Pattern Recognition and Machine
Learning (section 4.5). Berlin, Heidelberg: Springer-Verlag._
h2. Implementation
* 2 new parameters added to {{LogisticRegression}}: {{priorMean}} and
{{priorPrecisions}}.
* 1 new class ({{PriorRegularization}}) implements the calculations of the
value and gradient of the prior regularization term.
* Prior regularization is enabled when both vectors are provided and
{{regParam}} > 0 and {{elasticNetParam}} < 1.
h2. Tests
* {{DifferentiableRegularizationSuite}}
** {{Prior regularization}}
* {{LogisticRegressionSuite}}
** {{prior precisions should be required when prior mean is set}}
** {{prior mean should be required when prior precisions is set}}
** {{`regParam` should be positive when using prior regularization}}
** {{`elasticNetParam` should be less than 1.0 when using prior regularization}}
** {{prior mean and precisions should have equal length}}
** {{priors' length should match number of features}}
** {{binary logistic regression with prior regularization equivalent to L2}}
** {{binary logistic regression with prior regularization equivalent to L2
(bis)}}
** {{binary logistic regression with prior regularization}}
> Prior regularization for Logistic Regression
> --------------------------------------------
>
> Key: SPARK-26173
> URL: https://issues.apache.org/jira/browse/SPARK-26173
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 2.4.0
> Reporter: Facundo Bellosi
> Priority: Minor
> Attachments: Prior regularization.png