Re: [scikit-learn] logistic regression results are not stable between solvers

Guillaume Lemaître Wed, 09 Oct 2019 14:41:04 -0700

Uhm, actually increasing to 10000 samples solves the convergence issue.
SAGA is most probably not designed to work with such a small sample size.
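
For readers who want to try this themselves, here is a minimal sketch of the
kind of comparison discussed in this thread. It is not the code from the gist
or from the original benchmark. Assumptions: the data is synthetic (from
make_classification with 20 features), C=1e9 stands in for a mostly
unpenalized fit, max_iter=1000 is an arbitrary cap, and compare_solvers is a
hypothetical helper name. It fits lbfgs and saga on the same data and reports
the largest coefficient difference and the number of iterations each solver
used.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_solvers(n_samples, C=1e9, max_iter=1000, random_state=0):
    """Fit lbfgs and saga on the same synthetic data (hypothetical helper)."""
    # Synthetic stand-in for the data used in the thread (an assumption).
    X, y = make_classification(
        n_samples=n_samples,
        n_features=20,
        n_informative=5,
        random_state=random_state,
    )
    results = {}
    for solver in ("lbfgs", "saga"):
        # Scaling matters for SAG/SAGA, hence the pipeline.
        model = make_pipeline(
            StandardScaler(),
            LogisticRegression(
                C=C, solver=solver, max_iter=max_iter, random_state=random_state
            ),
        )
        with warnings.catch_warnings():
            # Non-convergence is expected in some settings; inspect n_iter_ instead.
            warnings.simplefilter("ignore", ConvergenceWarning)
            model.fit(X, y)
        clf = model.named_steps["logisticregression"]
        results[solver] = {
            "coef": clf.coef_.ravel(),
            "n_iter": int(clf.n_iter_[0]),
        }
    diff = np.abs(results["lbfgs"]["coef"] - results["saga"]["coef"])
    results["max_abs_coef_diff"] = float(diff.max())
    return results


for n_samples in (100, 10_000):
    res = compare_solvers(n_samples)
    print(
        f"n_samples={n_samples}: "
        f"max |coef diff|={res['max_abs_coef_diff']:.3g}, "
        f"lbfgs n_iter={res['lbfgs']['n_iter']}, "
        f"saga n_iter={res['saga']['n_iter']}"
    )
```

If saga reports n_iter equal to max_iter, it stopped at the cap rather than
converging, so a large coefficient difference in that case says little about
the solvers themselves.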


On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître <[email protected]>
wrote:

> I slightly changed the bench so that it uses a pipeline, and plotted the
> coefficients:
>
> https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386
>
> I only see one of the 10 splits where SAGA does not converge; otherwise
> the coefficients look very close (I don't attach the figure here, but they
> can be plotted using the snippet).
> So apart from this second split, the other differences seem to be due to
> numerical instability.
>
> My remaining concern is the convergence rate of SAGA, but I have no
> intuition as to whether this is normal or not.
>
> On Wed, 9 Oct 2019 at 23:22, Roman Yurchak <[email protected]> wrote:
>
>> Ben,
>>
>> I can confirm your results with penalty='none' and C=1e9. In both cases,
>> you are running a mostly unpenalized logistic regression. Usually
>> that's less numerically stable than with a small regularization,
>> depending on the data collinearity.
>>
>> Running that same code with
>>   - a larger penalty (smaller C values)
>>   - or a larger number of samples
>> yields for me the same coefficients (up to some tolerance).
>>
>> You can also see that SAGA convergence is not good by the fact that it
>> needs 196000 epochs/iterations to converge.
>>
>> Actually, I have often seen convergence issues with SAG on small
>> datasets (in unit tests), not fully sure why.
>>
>> --
>> Roman
>>
>> On 09/10/2019 22:10, serafim loukas wrote:
>> > The predictions across solvers are exactly the same when I run the code.
>> > I am using version 0.21.3. What is yours?
>> >
>> >
>> > In [13]: import sklearn
>> >
>> > In [14]: sklearn.__version__
>> > Out[14]: '0.21.3'
>> >
>> >
>> > Serafeim
>> >
>> >
>> >
>> >> On 9 Oct 2019, at 21:44, Benoît Presles <[email protected]> wrote:
>> >>
>> >> (y_pred_lbfgs==y_pred_saga).all() == False
>> >
>> >
>> > _______________________________________________
>> > scikit-learn mailing list
>> > [email protected]
>> > https://mail.python.org/mailman/listinfo/scikit-learn
>> >
>>
>
>
> --
> Guillaume Lemaitre
> Scikit-learn @ Inria Foundation
> https://glemaitre.github.io/
>


-- 
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

