AFAIK, we can guarantee that, with or without standardization, the models
always converge to the same solution if there is no regularization.
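You can verify it directly. Here is a minimal sketch (it assumes you already
have a training DataFrame named `training` with the usual "label" and
"features" columns; those names are just for illustration):

import org.apache.spark.ml.classification.LogisticRegression

// No regularization, standardization on (the default).
val modelStd = new LogisticRegression()
  .setRegParam(0.0)
  .setStandardization(true)
  .fit(training)

// No regularization, standardization off.
val modelNoStd = new LogisticRegression()
  .setRegParam(0.0)
  .setStandardization(false)
  .fit(training)

// The two solutions should match up to the solver's tolerance.
println(modelStd.coefficients)
println(modelNoStd.coefficients)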
You can also refer to the test cases at:
https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala#L551
https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala#L588

Thanks
Yanbo

On Mon, Oct 10, 2016 at 7:27 AM, Sean Owen <so...@cloudera.com> wrote:

> (BTW I think it means "when no standardization is applied", which is how
> you interpreted it, yes.) I think it just means that if feature i is
> divided by s_i, then its coefficients in the resulting model will end up
> larger by a factor of s_i. They have to be divided by s_i to put them back
> on the same scale as the unnormalized inputs. I don't think that in
> general it will result in exactly the same model, because part of the
> point of standardizing is to improve convergence. You could propose a
> rewording of the two occurrences of this paragraph if you like.
>
> On Mon, Oct 10, 2016 at 3:15 PM Cesar <ces...@gmail.com> wrote:
>
>> I have a question regarding how the default standardization in the ML
>> version of the Logistic Regression (Spark 1.6) works.
>>
>> Specifically, about the following comments in the Spark code:
>>
>> /**
>>  * Whether to standardize the training features before fitting the model.
>>  * The coefficients of models will be always returned on the original
>>  * scale, so it will be transparent for users. Note that with/without
>>  * standardization, the models should be always converged to the same
>>  * solution when no regularization is applied. In R's GLMNET package,
>>  * the default behavior is true as well.
>>  * Default is true.
>>  *
>>  * @group setParam
>>  */
>>
>> Specifically, I am having issues with understanding why the solution
>> should converge to the same weight values with/without standardization?
>>
>> Thanks!
>> --
>> Cesar Flores
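P.S. To make the s_i rescaling that Sean describes concrete: if feature i is
divided by s_i before training, the coefficient learned in the scaled space
comes out s_i times larger, so dividing it by s_i puts it back on the
original scale. A toy sketch (plain Scala, all values made up; this is not
the actual Spark internal code):

// Coefficients learned on standardized features x_i / s_i.
val scaledCoefficients = Array(1.2, -0.3, 4.5)
// Per-feature standard deviations s_i used for the scaling.
val featureStd = Array(2.0, 0.5, 10.0)

// Divide by s_i to recover the original-scale coefficients;
// constant (zero-variance) features get a zero coefficient.
val originalScaleCoefficients = scaledCoefficients.zip(featureStd).map {
  case (w, s) => if (s != 0.0) w / s else 0.0
}
// -> Array(0.6, -0.6, 0.45)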