Re: [scikit-learn] logistic regression results are not stable between solvers

Benoît Presles Wed, 08 Jan 2020 11:53:58 -0800

Dear sklearn users,

I still have some issues concerning logistic regression.

On the same simulated data, I compared sklearn with three different solvers (lbfgs, saga, liblinear) against statsmodels.

When everything goes well, I get the same results from lbfgs, saga, liblinear and statsmodels. When everything goes wrong, all the results are different.

In fact, when everything goes wrong, statsmodels gives me a convergence warning (Warning: Maximum number of iterations has been exceeded. Current function value: inf Iterations: 20000) plus an error (numpy.linalg.LinAlgError: Singular matrix).

Why does sklearn not tell me anything? How can I know that I have convergence issues with sklearn?



Thanks for your help,
Best regards,
Ben

--------------------------------------------

Here is the code I used to generate synthetic data:

from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import statsmodels.api as sm
#
RANDOM_SEED = 2
#
X_sim, y_sim = make_classification(n_samples=200,
                           n_features=20,
                           n_informative=10,
                           n_redundant=0,
                           n_repeated=0,
                           n_classes=2,
                           n_clusters_per_class=1,
                           random_state=RANDOM_SEED,
                           shuffle=False)
#

sss = StratifiedShuffleSplit(n_splits=10, test_size=0.2, random_state=RANDOM_SEED)

for train_index_split, test_index_split in sss.split(X_sim, y_sim):

    X_split_train, X_split_test = X_sim[train_index_split], X_sim[test_index_split]
    y_split_train, y_split_test = y_sim[train_index_split], y_sim[test_index_split]

    ss = StandardScaler()
    X_split_train = ss.fit_transform(X_split_train)
    X_split_test = ss.transform(X_split_test)
    #

    # lbfgs, unpenalized
    classifier_lbfgs = LogisticRegression(fit_intercept=True, max_iter=20000000,
                                          verbose=0, random_state=RANDOM_SEED,
                                          C=1e9, solver='lbfgs', penalty='none',
                                          tol=1e-6)

    classifier_lbfgs.fit(X_split_train, y_split_train)
    print('classifier lbfgs iter:',  classifier_lbfgs.n_iter_)
    print(classifier_lbfgs.intercept_)
    print(classifier_lbfgs.coef_)
    #

    # saga, unpenalized
    classifier_saga = LogisticRegression(fit_intercept=True, max_iter=20000000,
                                         verbose=0, random_state=RANDOM_SEED,
                                         C=1e9, solver='saga', penalty='none',
                                         tol=1e-6)

    classifier_saga.fit(X_split_train, y_split_train)
    print('classifier saga iter:', classifier_saga.n_iter_)
    print(classifier_saga.intercept_)
    print(classifier_saga.coef_)
    #

    # liblinear does not support penalty='none', so use l2 with a very large C
    classifier_liblinear = LogisticRegression(fit_intercept=True, max_iter=20000000,
                                              verbose=0, random_state=RANDOM_SEED,
                                              C=1e9, solver='liblinear',
                                              penalty='l2', tol=1e-6)

    classifier_liblinear.fit(X_split_train, y_split_train)
    print('classifier liblinear iter:', classifier_liblinear.n_iter_)
    print(classifier_liblinear.intercept_)
    print(classifier_liblinear.coef_)
    # statsmodels
    logit = sm.Logit(y_split_train, sm.tools.add_constant(X_split_train))
    logit_res = logit.fit(maxiter=20000)
    print("Coef statsmodels")
    print(logit_res.params)



On 11/10/2019 15:42, Andreas Mueller wrote:



On 10/10/19 1:14 PM, Benoît Presles wrote:

Thanks for your answers.
On my real data, I do not have so many samples. I have a bit more than 200 samples in total and I would also like to get some results with unpenalized logistic regression.
What do you suggest? Should I switch to the lbfgs solver?

Yes.

Can I be sure that with this solver I will not have any convergence issues and will always get the right result? Indeed, I did not get any convergence warning with saga, so I thought everything was fine. I noticed some issues only when I decided to test several solvers. Without comparing the results across solvers, how can I be sure that the optimisation goes well? Shouldn't scikit-learn warn the user somehow if it is not the case?

We should attempt to warn in the SAGA solver if it doesn't converge. That it doesn't raise a convergence warning should probably be considered a bug.

It uses the maximum weight change as a stopping criterion right now.

We could probably compute the dual objective once at the end to see if we converged, right? Or is that not possible with SAGA? If not, we might want to caution that no convergence warning will be raised.
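
As an illustration of the question above, here is a minimal sketch of a check that does not depend on noticing a printed warning: it records any `ConvergenceWarning` raised during `fit` and also compares `n_iter_` with `max_iter`. The helper name `fit_and_check` and the deliberately small `max_iter=5` are assumptions made for the example, and a very large `C` stands in for the unpenalized fit.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def fit_and_check(clf, X, y):
    """Hypothetical helper: fit clf and return (warned, hit_max_iter)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is filtered out
        clf.fit(X, y)
    # Did any recorded warning belong to the ConvergenceWarning family?
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    # Did the solver use up its whole iteration budget?
    hit_max_iter = bool(np.any(clf.n_iter_ >= clf.max_iter))
    return warned, hit_max_iter


# Same synthetic data as in the script at the top of the thread.
X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=2,
                           shuffle=False)
X = StandardScaler().fit_transform(X)

# Tiny iteration budget so that the solver is stopped early on purpose.
clf = LogisticRegression(solver="saga", C=1e9, tol=1e-10, max_iter=5,
                         random_state=2)
warned, hit_max_iter = fit_and_check(clf, X, y)
print("ConvergenceWarning raised:", warned)
print("n_iter_ reached max_iter:", hit_max_iter)
```

Note that both checks only detect an exhausted iteration budget. With `max_iter=20000000`, as in the original script, the solver stops on its own weight-change criterion, so neither check would fire even if the coefficients are still far from the optimum.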

Lastly, I was using saga because I also wanted to do some feature selection using the l1 penalty, which is not supported by lbfgs...

You can use liblinear then.


Best regards,
Ben


On 09/10/2019 at 23:39, Guillaume Lemaître wrote:

Oops, I did not see Roman's answer. Sorry about that. It comes back to the same conclusion :)

On Wed, 9 Oct 2019 at 23:37, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:


    Uhm, actually increasing to 10000 samples solves the convergence
    issue.
    SAGA is most probably not designed to work with such a small
    sample size.

    On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître
    <g.lemaitr...@gmail.com> wrote:

        I slightly changed the benchmark so that it uses a pipeline
        and plotted the coefficients:

        https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386

        I only see one of the 10 splits where SAGA is not
        converging, otherwise the coefficients
        look very close (I don't attach the figure here but they can
        be plotted using the snippet).
        So apart from this second split, the other differences seem
        to be numerical instability.

        My one concern is the convergence rate of SAGA, but I have
        no intuition as to whether this is normal or not.

        On Wed, 9 Oct 2019 at 23:22, Roman Yurchak
        <rth.yurc...@gmail.com> wrote:

            Ben,

            I can confirm your results with penalty='none' and
            C=1e9. In both cases,
            you are running a mostly unpenalized logistic
            regression. Usually
            that's less numerically stable than with a small
            regularization,
            depending on the data collinearity.
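
One situation in which an unpenalized fit has no finite solution at all is perfectly separable training data; whether that is what happens in the failing splits here is an assumption, the thread does not confirm it. The sketch below uses a small separable toy dataset (made up for this example, not the data from the thread) and shows the coefficient norm growing as `C` increases, which is a hint that the unpenalized optimum does not exist.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (an assumption for illustration): two classes that are
# perfectly separated along feature 0, plus one pure-noise feature.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
X[:, 0] += np.where(y == 1, 2.0, -2.0)

norms = []
for C in (1.0, 1e2, 1e4):
    # Weaker and weaker l2 regularization (larger C).
    clf = LogisticRegression(solver="lbfgs", C=C, tol=1e-10, max_iter=10000)
    clf.fit(X, y)
    norms.append(float(np.linalg.norm(clf.coef_)))
    print(f"C={C:g}  ||coef||={norms[-1]:.2f}  train acc={clf.score(X, y):.3f}")

# Training accuracy of the last (least regularized) model.
train_acc = clf.score(X, y)
```

If the training accuracy is 1.0 and the norm keeps growing with `C`, different solvers will simply stop at different points along a direction in which the loss keeps decreasing, so their coefficients need not agree.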

            Running that same code with
              - larger penalty (smaller C values)
              - or larger number of samples
            yields for me the same coefficients (up to some
            tolerance).
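
The first option in the list above can be sketched as follows, reusing the synthetic data from the script at the top of the thread. The value `C=1.0`, the tolerances and the helper name `fit_coefs` are assumptions chosen for the example. liblinear is left out because it also penalizes the intercept, so its coefficients are not expected to match the other two exactly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Same synthetic data as in the script at the top of the thread.
X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=2,
                           shuffle=False)
X = StandardScaler().fit_transform(X)


def fit_coefs(solver, C):
    """Hypothetical helper: return intercept and coefficients as one vector."""
    clf = LogisticRegression(solver=solver, C=C, tol=1e-8, max_iter=10000,
                             random_state=2)
    clf.fit(X, y)
    return np.concatenate([clf.intercept_, clf.coef_.ravel()])


C = 1.0  # moderate l2 regularization (assumed value)
coefs_lbfgs = fit_coefs("lbfgs", C)
coefs_saga = fit_coefs("saga", C)

# Largest absolute disagreement between the two solvers.
max_diff = float(np.max(np.abs(coefs_lbfgs - coefs_saga)))
print("max |lbfgs - saga| coefficient difference:", max_diff)
```

With a penalty of this size the objective is strongly convex, so both solvers are heading for the same unique minimum and the remaining difference reflects only their stopping tolerances.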

            You can also see that SAGA convergence is not good by
            the fact that it
            needs 196000 epochs/iterations to converge.

            Actually, I have often seen convergence issues with SAG
            on small
            datasets (in unit tests), not fully sure why.

--Roman


            On 09/10/2019 22:10, serafim loukas wrote:
            > The predictions across solvers are exactly the same
            when I run the code.
            > I am using 0.21.3 version. What is yours?
            >
            >
            > In [13]: import sklearn
            >
            > In [14]: sklearn.__version__
            > Out[14]: '0.21.3'
            >
            >
            > Serafeim
            >
            >
            >
            >> On 9 Oct 2019, at 21:44, Benoît Presles
            >> <benoit.pres...@u-bourgogne.fr> wrote:
            >>
            >> (y_pred_lbfgs==y_pred_saga).all() == False
            >
            >
            > _______________________________________________
            > scikit-learn mailing list
            > scikit-learn@python.org <mailto:scikit-learn@python.org>
            > https://mail.python.org/mailman/listinfo/scikit-learn
            >


--Guillaume Lemaitre

        Scikit-learn @ Inria Foundation
        https://glemaitre.github.io/


