[Scikit-learn-general] predict_proba for LinearSVC and platt scaling

2014-02-19 Thread Joseph Perla
LinearSVC does not have predict_proba() and predict_log_proba() implementations, but SVC does. This is because liblinear does not offer calibrated probabilities as an option but libsvm does. Would it be okay if I add a classifier mixin to core that adds Platt scaling to LinearSVC? I already
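
For readers unfamiliar with the technique mentioned above, here is a minimal sketch of Platt scaling in plain Python. It is not the mixin proposed in the message. It assumes you already have held-out decision scores (for example from LinearSVC.decision_function) and their true labels; the data, the function names fit_platt and predict_proba, and the gradient-descent settings are all illustrative assumptions. It also omits the target smoothing used in Platt's original formulation.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p(y=1 | s) = sigmoid(a * s + b) by gradient descent on log loss.

    Hypothetical helper: a simplified stand-in for Platt scaling.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_proba(scores, a, b):
    # Map raw decision scores to calibrated probabilities of the positive class.
    return [sigmoid(a * s + b) for s in scores]


# Made-up held-out decision scores and labels (assumption, not real data).
scores = [-2.1, -1.3, -0.4, 0.2, 0.3, 0.9, 1.7, 2.5]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = predict_proba(scores, a, b)
```

The sigmoid should be fitted on scores from data the classifier was not trained on, otherwise the probabilities tend to be overconfident. Later scikit-learn releases provide sklearn.calibration.CalibratedClassifierCV with method='sigmoid' for the same purpose.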

Re: [Scikit-learn-general] Logistic regression coefficients analysis

2014-02-19 Thread Tobias Günther
Sounds like you're on the right path. Looking at the misclassified documents and the feature coefficients is a common way to debug a classifier, especially if you use boolean features. If you're using a sklearn vectorizer this might be of interest to you: http://stackoverflow.com/questions/669

Re: [Scikit-learn-general] Logistic regression coefficients analysis

2014-02-19 Thread Joel Nothman
It is correct to assume that a positive coefficient contributes positively to a decision. However, because the features are interdependent, the raw strength of a feature isn't always straightforward to interpret. For example, it might give a big positive coefficient to "Tel" and a similar negative
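
A small sketch of the point about interdependent features, under made-up numbers: if two tokens almost always occur together and receive large coefficients of opposite sign, their contributions nearly cancel, so neither raw coefficient reflects the feature's real influence on the decision. The coefficient values, the intercept, and the second token name "other_token" are hypothetical and are not taken from the thread.

```python
# Hypothetical coefficients from a fitted binary logistic regression.
coef = {"Tel": 2.0, "other_token": -1.9, "free": 0.8}
intercept = -0.1


def contributions(doc_features):
    # Per-feature contribution to the decision value: coefficient * feature value.
    return {f: coef[f] * v for f, v in doc_features.items() if f in coef}


def decision(doc_features):
    # Linear decision value; positive means the positive class is predicted.
    return intercept + sum(contributions(doc_features).values())


doc_both = {"Tel": 1, "other_token": 1}  # the two tokens co-occur
doc_tel_only = {"Tel": 1}                # rare case: only one of them appears

both_value = decision(doc_both)
tel_only_value = decision(doc_tel_only)
```

With these numbers the document containing both tokens lands almost exactly on the decision boundary, while the one containing only "Tel" is pushed strongly positive, which is why looking at per-document contributions can be more informative than ranking coefficients alone.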

[Scikit-learn-general] Logistic regression coefficients analysis

2014-02-19 Thread Pavel Soriano
Hello scikit! I need some insight into what I am doing. Currently I am building a text classifier (2 classes) using unigrams (word level) and some writing style features. I am using a Logistic Regression model with L1 regularization. I get decent performance (around .70 f-measure) for the given

Re: [Scikit-learn-general] python 3.x

2014-02-19 Thread Olivier Grisel
I confirm Python 3.3 is fully supported and that this is a numpy-related warning. PRs to fix such warnings are always welcome. -- Olivier