Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread adnan rajper
Thanks millions to Paolo, Gael and everybody. Adnan

Re: [Scikit-learn-general] Plotting training to evaluate the bias / variance regime

2011-12-30 Thread Olivier Grisel
2011/12/30 Gilles Louppe:
>> It seems to be an interesting tool to me. We need to find a
>> non-trivial overfitting example that would run in an acceptable time
>> with the datasets available in the scikit.
>
> Actually, those curves can be plot with respect to any parameter, not
> only the traini

Re: [Scikit-learn-general] Plotting training to evaluate the bias / variance regime

2011-12-30 Thread Gilles Louppe
> It seems to be an interesting tool to me. We need to find a
> non-trivial overfitting example that would run in an acceptable time
> with the datasets available in the scikit.

Actually, those curves can be plotted with respect to any parameter, not only the training set size. What comes to me is t

Re: [Scikit-learn-general] Plotting training to evaluate the bias / variance regime

2011-12-30 Thread Olivier Grisel
2011/12/28 Nick Wilson:
> On Tue, Dec 27, 2011 at 6:23 PM, Olivier Grisel wrote:
>> Hi all,
>>
>> I came across the following blog post about Andrew Ng's ML class and I
>> like the training / validation errors plots to find out whether the
>> model is too biased (underfitting) or too lax (high

Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread Paolo Losi
Hi Gael,

On Thu, Dec 29, 2011 at 11:06 PM, Gael Varoquaux wrote:
>
> To the other developers: is there a reason/difficulty for not having
> Platt's method (implemented for SVC, AFAIK) for LinearSVC?
>

I've got a "draft" Platt's calibration implementation on a bran

[Scikit-learn-general] Parallel forest: call for review

2011-12-30 Thread Gilles Louppe
Hi list,

This is a call to get an additional person (or more) to review the pending PR #491 on parallel forest of trees. It has already been reviewed by @ogrisel and looks ready to be merged to both of us, but an additional review would be more than welcome!

https://github.com/scikit-learn/scik

Re: [Scikit-learn-general] using string features for classification

2011-12-30 Thread Olivier Grisel
In the previous mail variable `X` should be replaced by `data`. -- Olivier

Re: [Scikit-learn-general] using string features for classification

2011-12-30 Thread Olivier Grisel
2011/12/30 Bronco Zaurus:
> Thank you for all the answers. Yes, I'm not dealing with arbitrary strings,
> just a set of possible values, so the binary representation seems OK.

Alright, then the name for this kind of feature is "categorical features" in machine learning jargon: the string is used
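
The "binary representation" discussed in this message can be sketched as follows. This is a minimal, standard-library-only illustration, not the code from the thread: the helper name `one_hot_encode` and the sample `colors` values are assumptions made for the example. Each distinct string gets its own 0/1 column.

```python
def one_hot_encode(values):
    """Map each categorical string to a binary indicator vector.

    Hypothetical helper for illustration; columns follow the sorted
    order of the distinct values.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one active column per sample
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature with three possible values.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
```

Every row contains a single 1, so a linear model learns one weight per possible value instead of treating the strings as ordered numbers.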

Re: [Scikit-learn-general] using string features for classification

2011-12-30 Thread Bronco Zaurus
Thank you for all the answers. Yes, I'm not dealing with arbitrary strings, just a set of possible values, so the binary representation seems OK. One more way would be computing classification probability for each value and plugging the resulting number back into data. For example, let's say there
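
One possible reading of "computing classification probability for each value and plugging the resulting number back into data" is sketched below, assuming binary 0/1 labels. The function name `target_rate_encode` and the toy data are hypothetical, and the message is truncated, so this may differ from what the author had in mind.

```python
from collections import defaultdict


def target_rate_encode(values, labels):
    """Replace each categorical value by the observed rate of label 1.

    Hypothetical helper: assumes labels are 0 or 1.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for value, label in zip(values, labels):
        counts[value] += 1
        positives[value] += label
    rates = {value: positives[value] / counts[value] for value in counts}
    return rates, [rates[value] for value in values]


# Toy data, invented for the example.
values = ["a", "b", "a", "c", "b", "a"]
labels = [1, 0, 1, 0, 1, 0]
rates, encoded = target_rate_encode(values, labels)
```

The rates should be estimated on training data only; computing them on the samples that are later scored leaks the labels into the feature.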

Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread Gael Varoquaux
On Fri, Dec 30, 2011 at 11:28:39AM +0100, Andreas Mueller wrote:
> It might be that I haven't really understood the meaning of ROC
> curves, but I thought it worked like @ogrisel said.
> Whatever the correct method to produce a ROC curve
> from a linear classifier, I'm pretty sure that using the de

Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread Andreas Mueller
On 12/30/2011 10:15 AM, Gael Varoquaux wrote:
> On Fri, Dec 30, 2011 at 10:09:59AM +0100, Olivier Grisel wrote:
>>> * You could use the decision function, (decision_function method of the
>>>   LinearSVC) although this is clearly a hack.
>> Why is this a hack? ROC is only concerned with the rela

Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread Gael Varoquaux
On Fri, Dec 30, 2011 at 10:09:59AM +0100, Olivier Grisel wrote:
> > * You could use the decision function, (decision_function method of the
> >   LinearSVC) although this is clearly a hack.
> Why is this a hack? ROC is only concerned with the relative positions
> of the decision threshold, not th
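
The point that a ROC curve depends only on how samples are ranked can be checked with a small standard-library sketch. It does not call scikit-learn: `scores` is an invented stand-in for the kind of values a `decision_function` method returns, and `roc_points` is a hypothetical helper written for this example.

```python
import math


def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold over scores.

    Hypothetical helper: assumes 0/1 labels with at least one of each.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


# Invented signed distances to a separating hyperplane, with true labels.
scores = [2.1, 0.4, -0.3, -1.7, 1.2, -0.9]
labels = [1, 1, 0, 0, 0, 1]
curve = roc_points(scores, labels)

# Squashing the scores through a sigmoid changes their scale but not their order.
squashed = [1.0 / (1.0 + math.exp(-s)) for s in scores]
same_curve = roc_points(squashed, labels)
```

Both calls produce the same list of points, because a strictly increasing transformation of the scores leaves the ranking, and therefore every threshold split, unchanged.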

Re: [Scikit-learn-general] ROC curve

2011-12-30 Thread Olivier Grisel
2011/12/29 Gael Varoquaux:
> On Thu, Dec 29, 2011 at 12:46:36PM -0800, adnan rajper wrote:
>> I use LinearSVC for text classification. My problem is that I want to
>> generate a ROC curve for LinearSVC, but LinearSVC does not output
>> probabilities. Is there any other way to generate ROC