Re: [Scikit-learn-general] Random Forest with a mix of categorical and lexical features

2014-11-17 Thread Manish Amde
+1 Just wanted to point out that the K-1 subset proof is only true for binary classification. Such heuristics do perform reasonably for the multiclass classification criterion though. On Monday, November 17, 2014, Alexander Hawk tomahawkb...@gmail.com wrote: Perhaps you have become aware of
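A minimal sketch of the heuristic under discussion, assuming the "K-1 subset" result refers to the standard category-ordering trick: for a binary target, sort the K categories by their fraction of positive labels and evaluate only the K-1 splits along that ordering. The function names and toy data below are hypothetical, chosen for illustration, and are not taken from the thread or from scikit-learn.

```python
from collections import defaultdict


def gini(pos, total):
    """Gini impurity of a node holding `pos` positives out of `total` samples."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def best_ordered_split(categories, labels):
    """Return (left_categories, weighted_gini) for a 0/1 target.

    Categories are sorted by positive rate, then only the K-1 splits
    along that ordering are scored.
    """
    pos = defaultdict(int)
    cnt = defaultdict(int)
    for c, y in zip(categories, labels):
        cnt[c] += 1
        pos[c] += y

    ordered = sorted(cnt, key=lambda c: pos[c] / cnt[c])
    n = len(labels)
    total_pos = sum(labels)

    best_left, best_score = None, float("inf")
    left_pos = left_cnt = 0
    for i in range(len(ordered) - 1):  # K-1 candidate splits
        c = ordered[i]
        left_pos += pos[c]
        left_cnt += cnt[c]
        right_pos = total_pos - left_pos
        right_cnt = n - left_cnt
        score = (left_cnt * gini(left_pos, left_cnt)
                 + right_cnt * gini(right_pos, right_cnt)) / n
        if score < best_score:
            best_left, best_score = set(ordered[:i + 1]), score
    return best_left, best_score


# Hypothetical toy data: four categories, binary labels.
cats = ["a", "a", "b", "b", "c", "c", "d", "d"]
ys = [0, 0, 1, 1, 0, 1, 1, 1]
left, score = best_ordered_split(cats, ys)
```

With more than two classes there is no single positive rate to sort by, which is why the ordering is only a heuristic in the multiclass case.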

Re: [Scikit-learn-general] Trees with unbalanced classes

2013-07-12 Thread Manish Amde
Hi Sergey, There is a sample_weight option (not very well documented) in the random forest classifier that might help. You might want to check out the SVC example to see the sample_weight format. http://scikit-learn.org/stable/auto_examples/svm/plot_weighted_samples.html You can provide

Re: [Scikit-learn-general] Weighted and Balanced Random Forests

2013-03-20 Thread Manish Amde
fix that) Hope this helps, Gilles On 8 February 2013 00:44, Manish Amde manish...@gmail.com wrote: Fellow sklearners, I am working on a classification problem with an unbalanced data set and have been successful using SVM classifiers with the class_weight option. I have also tried

Re: [Scikit-learn-general] Imbalance in scikit-learn

2013-02-27 Thread Manish Amde
Using the sample_weight parameter in the RandomForestClassifier along with the balance_weights method from the preprocessing module to generate the sample weights might work as well. You can check this link for a previous related discussion.
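As a rough illustration of the idea, the sketch below computes per-sample weights that give every class the same total weight. It is an assumption about what "balanced" weights look like, namely n_samples / (n_classes * class_count), and does not reproduce the balance_weights function named in the message; the helper name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(y):
    """Weight each sample by n_samples / (n_classes * count of its class)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return [n_samples / (n_classes * counts[label]) for label in y]


# Hypothetical unbalanced labels: 8 negatives, 2 positives.
y = [0] * 8 + [1] * 2
weights = balanced_sample_weights(y)

# The weights would then be passed to the forest at fit time, e.g.
#   RandomForestClassifier().fit(X, y, sample_weight=weights)
# where X is the feature matrix matching y (not defined in this sketch).
```

Each class ends up contributing the same total weight, so the minority class is no longer swamped when splits are scored.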

[Scikit-learn-general] Weighted and Balanced Random Forests

2013-02-07 Thread Manish Amde
Fellow sklearners, I am working on a classification problem with an unbalanced data set and have been successful using SVM classifiers with the class_weight option. I have also tried Random Forests and am getting a decent ROC performance but I am hoping to get a performance improvement by using

Re: [Scikit-learn-general] Weighted and Balanced Random Forests

2013-02-07 Thread Manish Amde
, Gilles On 8 February 2013 00:44, Manish Amde manish...@gmail.com wrote: Fellow sklearners, I am working on a classification problem with an unbalanced data set and have been successful using SVM classifiers with the class_weight option. I have also tried Random Forests and am getting a decent