Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Sean Violante
Hi Stuart, the underlying logistic regression code in scikit-learn (at least for the non-liblinear implementation) allows sample weights, which would allow you to do what you want. [pass in sample weight Total_Service_Points_Won and target 1 and ( Total_Service_Points_Played-Total_Service_Points_Won
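A minimal sketch of the sample-weight idea described above, assuming each observation carries a count of points won and points played. The helper names `expand_counts` and `fit_weighted_logit` are hypothetical, and the `lbfgs` solver is an assumption chosen because it is one of the non-liblinear solvers. Each observation becomes up to two rows: one with target 1 weighted by points won, and one with target 0 weighted by points lost.

```python
def expand_counts(features, won, played):
    """Turn per-row (won, played) counts into weighted binary rows.

    Returns (X, y, w): each input row appears once with target 1 and
    weight `won`, and once with target 0 and weight `played - won`.
    Rows whose weight would be zero are skipped.
    """
    X, y, w = [], [], []
    for row, n_won, n_played in zip(features, won, played):
        n_lost = n_played - n_won
        if n_won > 0:
            X.append(list(row))
            y.append(1)
            w.append(n_won)
        if n_lost > 0:
            X.append(list(row))
            y.append(0)
            w.append(n_lost)
    return X, y, w


def fit_weighted_logit(features, won, played):
    """Fit a logistic regression on the expanded, weighted rows."""
    # Imported here so expand_counts stays usable without scikit-learn.
    from sklearn.linear_model import LogisticRegression

    X, y, w = expand_counts(features, won, played)
    model = LogisticRegression(solver="lbfgs")  # assumed non-liblinear solver
    model.fit(X, y, sample_weight=w)
    return model
```

Here `won` and `played` would correspond to Total_Service_Points_Won and Total_Service_Points_Played; note that scikit-learn's default L2 penalty still applies to this fit.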

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Stuart Reynolds
Thanks Josef. Was very useful. result.remove_data() reduces a 5 parameter Logit result object from megabytes to 5Kb (as compared to a minimum uncompressed size of the parameters of ~320 bytes). Is big improvement. I'll experiment with what you suggest -- since this is still >10x larger than possib
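One way to approach the lower bound mentioned above is to keep only the fitted coefficients rather than the result object. The sketch below is an illustration, not statsmodels API: `pack_params`, `unpack_params`, and `predict_proba` are hypothetical helpers, and it assumes the coefficients were copied out of the fitted Logit result as plain floats. Five float64 values pack into 40 bytes (320 bits).

```python
import math
import struct


def pack_params(params):
    """Serialise coefficients as little-endian float64 values."""
    return struct.pack("<%dd" % len(params), *params)


def unpack_params(blob):
    """Inverse of pack_params: recover the list of coefficients."""
    n = len(blob) // 8  # 8 bytes per float64
    return list(struct.unpack("<%dd" % n, blob))


def predict_proba(params, row):
    """Logistic prediction for one row; `row` must include any constant column."""
    z = sum(p * x for p, x in zip(params, row))
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

This drops everything except the point estimates, so standard errors and summary tables are no longer available.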

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread josef . pktd
On Thu, Oct 5, 2017 at 12:34 PM, Stuart Reynolds wrote:
> Thanks Josef. Was very useful.
>
> result.remove_data() reduces a 5 parameter Logit result object from
> megabytes to 5Kb (as compared to a minimum uncompressed size of the
> parameters of ~320 bytes). Is big improvement. I'll experiment w

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Sean Violante
Stuart, have you tried glmnet (in R)? There is a Python version: https://web.stanford.edu/~hastie/glmnet_python/
On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds wrote:
> Thanks Josef. Was very useful.
>
> result.remove_data() reduces a 5 parameter Logit result object from
> megabytes to 5K

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Stuart Reynolds
Turns out sm.Logit does allow setting the tolerance.
Some quick and dirty time profiling of different methods on a 100k * 30 features dataset, with different solvers and losses:
sklearn.LogisticRegression: l1 1.13864398003 (seconds)
sklearn.LogisticRegression: l2 0.0538778305054
sm.Logit l1 0.
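A quick timing comparison of this kind can be sketched with a small helper. `time_call` is a hypothetical name, and this is not the code that produced the numbers above; it simply reports the best wall-clock time over a few repeated calls.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best
```

Assuming arrays `X` and `y` already exist, a call such as `time_call(lambda: LogisticRegression(penalty="l1", solver="liblinear").fit(X, y))` would time one of the configurations; comparisons are only meaningful if the solvers use comparable tolerances.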

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Stuart Reynolds
Hi Sean,
I'll have a look at glmnet (looks like it's compiled from Fortran!). Does it offer much over statsmodels' GLM? This looks great for researchy stuff, although a little less performant.
- Stu
On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante wrote:
> Stuart
> have you tried glmnet ( in R) the

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread josef . pktd
On Thu, Oct 5, 2017 at 3:00 PM, Stuart Reynolds wrote:
> Hi Sean,
>
> I'll have a look at glmnet (looks like it's compiled from Fortran!). Does
> it offer much over statsmodels' GLM? This looks great for researchy
> stuff, although a little less performant.
>
GLMNet is/wraps the original Fortran imp

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread josef . pktd
On Thu, Oct 5, 2017 at 2:52 PM, Stuart Reynolds wrote:
> Turns out sm.Logit does allow setting the tolerance.
> Some quick and dirty time profiling of different methods on a 100k
> * 30 features dataset, with different solvers and losses:
>
> sklearn.LogisticRegression: l1 1.13864398003 (seco

[scikit-learn] question for using GridSearchCV on LocalOutlierFactor

2017-10-05 Thread Lifan Xu
Hi, I was trying to train a model for anomaly detection. I only have the normal data, which are all labeled as 1. Here is my code:
clf = sklearn.model_selection.GridSearchCV(sklearn.neighbors.LocalOutlierFactor(), parameters, scoring="acc

Re: [scikit-learn] Can fit a model with a target array of probabilities?

2017-10-05 Thread Sean Violante
Stuart, I've only used the R implementation. Glmnet does the warm starts; in fact, they recommend against trying a single regularisation value. It also supports passing a 2d array of positive and negative counts (or the multinomial generalisation). My experience is that it is much more accurate than libl