[Scikit-learn-general] K-Fold-Cross-validation in Scikit-Learn

nmura...@masonlive.gmu.edu Tue, 28 Apr 2015 19:41:03 -0700

Hello,


I am very new to scikit-learn and am trying to run cross-validation on a data 
frame consisting of text features and a class label. The task is text 
classification. It is a 2-class problem where the distribution between 
positive and negative instances is extremely skewed (we want to keep it that 
way on purpose). Is there a specific cross-validation type in scikit-learn 
that splits the data into K folds so that each fold has the same proportion 
of positive and negative examples? That is, if I have:


100 Positive instances

1000 Negative instances,


would it be possible to run 10-fold cross-validation where each fold holds 
out 10 positive and 100 negative examples, randomly chosen from the set, as 
the validation set?
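
To make the split described above concrete, here is a minimal sketch in plain 
Python, with no scikit-learn calls. The helper name stratified_folds, the 
toy label list, and the seed are assumptions made for illustration only; they 
are not part of scikit-learn or of my actual data.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Return k lists of indices, each keeping roughly the class ratio of `labels`.

    Hypothetical helper written for illustration; not a scikit-learn function.
    """
    rng = random.Random(seed)

    # Group the row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's shuffled indices round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy labels matching the example: 100 positives (1) and 1000 negatives (0).
labels = [1] * 100 + [0] * 1000
folds = stratified_folds(labels, k=10)

fold_counts = []
for held_out in folds:
    held_out_set = set(held_out)
    # Everything outside the held-out fold is the training set for this round.
    train = [i for i in range(len(labels)) if i not in held_out_set]
    n_pos = sum(labels[i] for i in held_out)
    n_neg = len(held_out) - n_pos
    fold_counts.append((n_pos, n_neg, len(train)))
    # A classifier would be fit on `train` and scored on `held_out` here.
```

This kind of split is usually called stratified k-fold. In current 
scikit-learn releases the StratifiedKFold class in sklearn.model_selection 
does this, e.g. StratifiedKFold(n_splits=10, shuffle=True); older releases 
may expose it under a different module and signature.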


Some sample code, or a link to an example, would be helpful.


Thanks,

Nikhil


