Re: [Scikit-learn-general] TF-Idf

2012-09-22 Thread Olivier Grisel
2012/9/22 Ark ark_an...@yahoo.com: Hello, I am trying to classify a large document set with LinearSVC. I get good accuracy. However, I was wondering how to optimize the interface to this classifier. E.g., if I have a predict interface that accepts the raw document, You can use the

[Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Olivier Grisel
and to Andreas who finished in the 6th position out of 50 final submitters. This contest was about text classification: http://www.kaggle.com/c/detecting-insults-in-social-commentary Any feedback on what scikit-learn models were used, which feature extraction / blending techniques were

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Andreas Mueller
Congratulations to Vivek also from me :) On 09/22/2012 12:17 PM, Olivier Grisel wrote: and to Andreas who finished in the 6th position out of 50 final submitters. Thanks Olivier. I'll write a short blog post, but my best model is pretty boring :-/ There was a pretty big gap between the first

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Andreas Mueller
On 09/22/2012 12:17 PM, Olivier Grisel wrote: and to Andreas who finished in the 6th position out of 50 final submitters. This contest was about text classification: http://www.kaggle.com/c/detecting-insults-in-social-commentary Any feedback on what scikit-learn models were used, which

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Olivier Grisel
2012/9/22 Andreas Mueller amuel...@ais.uni-bonn.de: On 09/22/2012 12:17 PM, Olivier Grisel wrote: and to Andreas who finished in the 6th position out of 50 final submitters. This contest was about text classification: http://www.kaggle.com/c/detecting-insults-in-social-commentary Any

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Vivek Sharma
Thanks Olivier, Andreas. And, again to the text classification module authors. sklearn rocks! I think I was quite lucky, but I'm not complaining! :) My feature set was almost the same as the char and word features that Andreas used. I found that SVC gave me better performance than LR. And, some

[Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-22 Thread Christian Jauvin
Hi, I have been doing multiple experiments using a RandomForestClassifier (trained with the parallel code option) recently, without encountering any particular problem. However, as soon as I began using a much bigger dataset (with the exact same code), I got this threading error: Exception in