On Sat, Oct 3, 2015 at 11:54 PM, George Bezerra <gbeze...@gmail.com> wrote:
> Thanks a lot Josef. I guess it is possible to do what I wanted, though
> maybe not in scikit. Does the statsmodels version allow l1 or l2
> regularization? I'm planning to use a lot of features and let the model
> decide what is good.

statsmodels has had L1 regularization for discrete models, including Logit,
for a while. But I don't have much experience with it, and it uses an
interior point algorithm.

Elastic net for maximum likelihood models using coordinate descent, and
other penalized maximum likelihood methods like SCAD and structured L2, are
in PRs and will be merged over the next months.

statsmodels, in contrast to scikit-learn, doesn't have much support for
large sparse features.

Josef

> Thanks again.
>
> On Sat, Oct 3, 2015 at 11:20 PM, <josef.p...@gmail.com> wrote:
>
>> Just to come in here as an econometrician and statsmodels maintainer.
>>
>> statsmodels intentionally doesn't enforce binary data for Logit or
>> similar models; any data between 0 and 1 is fine.
>>
>> Logistic Regression/Logit or similar Binomial/Bernoulli models can
>> consistently estimate the expected value (predicted mean) for a continuous
>> variable that is between 0 and 1, like a proportion. (Binomial belongs to
>> the exponential family, where the quasi-maximum likelihood method works well.)
>> Inference has to be adjusted because a logit model cannot be "true" if
>> the data is not binary.
>>
>> I have references and examples for this use case somewhere.
>>
>> statsmodels doesn't do "classification", i.e. hard thresholding; users
>> can do it themselves if they need to.
>> Which means we leave classification to scikit-learn and only do
>> regression, even for funny data, and statsmodels doesn't have methods that
>> take advantage of the classification structure of a model.
>>
>> Josef
>>
>>
>> On Sat, Oct 3, 2015 at 10:50 PM, Sebastian Raschka <se.rasc...@gmail.com>
>> wrote:
>>
>>> Hi, George,
>>> logistic regression is a binary classifier by nature (class labels 0 and
>>> 1). Scikit-learn supports multi-class classification via One-vs-One or
>>> One-vs-All though; and there is a generalization (softmax) that gives you
>>> meaningful probabilities for multiple classes (i.e., class probabilities
>>> sum up to 1). In any case, logistic regression works with nominal class
>>> labels - categorical class labels with no order implied.
>>>
>>> To keep a long story short: Logistic regression is a classifier, not a
>>> regressor; the name is misleading, I agree. I think you may want to look
>>> into regression analysis for your continuous target variable.
>>>
>>> Best,
>>> Sebastian
>>>
>>> > On Oct 3, 2015, at 9:58 PM, George Bezerra <gbeze...@gmail.com> wrote:
>>> >
>>> > Hi there,
>>> >
>>> > I would like to train a logistic regression model on a continuous
>>> > (i.e., not categorical) target variable. The target is a probability,
>>> > which is why I am using a logistic regression for this problem. However,
>>> > the sklearn function tries to find the class labels by running a
>>> > unique() on the target values, which is disastrous if y is continuous.
>>> >
>>> > Is there a way to train logistic regression on a continuous target
>>> > variable in sklearn?
>>> >
>>> > Any help is highly appreciated.
>>> >
>>> > Best,
>>> >
>>> > George.
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general