
Re: [Scikit-learn-general] Discrepancy in SkLearn Stratified Cross Validation

2015-09-15 Thread Andy

train_test_split is not stratified. In master, you can pass "stratify=y" to make it stratified. Also: randomness.

On 09/15/2015 10:55 AM, Mamun Rashid wrote:
> I am seeing a discrepancy in classification performance between two cross-validation techniques using the same data. I was wondering
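A minimal sketch of the stratification point, assuming a scikit-learn release where `train_test_split` lives in `sklearn.model_selection` and accepts `stratify` (at the time of this thread the function was in `sklearn.cross_validation` and `stratify` was only in master). The toy data and its 90/10 class imbalance are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced toy data: 90 negatives, 10 positives.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# Plain split: the class ratio in the test set depends on the shuffle.
_, _, _, y_test_plain = train_test_split(X, y, test_size=0.2, random_state=0)

# Stratified split: the test set keeps the class ratio of y.
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print("plain test positives:     ", int(y_test_plain.sum()))
print("stratified test positives:", int(y_test_strat.sum()))
```

With few positive samples, the unstratified count can drift noticeably from seed to seed, which by itself can move a classification score.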

Re: [Scikit-learn-general] Discrepancy in SkLearn Stratified Cross Validation

2015-09-15 Thread Michael Eickenberg

By their nature, I wouldn't expect those splits to be the same. In addition, you are seeding the randomness differently in the two cases. Take a close look at the generated splits; their composition may already explain the discrepancies.

On Tue, Sep 15, 2015 at 4:55 PM, Mamun Rashid wrote:
> I
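One way to take that close look is to count the positive samples in each test split produced by the two methods named in the original post. This is only a sketch: it assumes a recent scikit-learn (`sklearn.model_selection`), and the toy labels, fold count, and seeds are hypothetical rather than taken from the poster's code.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

# Hypothetical imbalanced toy data: 90 negatives, 10 positives.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# StratifiedKFold: every test fold mirrors the class ratio of y.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
skf_pos = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]

# Repeated unstratified train_test_split, one seed per repetition.
tts_pos = []
for seed in range(5):
    _, y_test = train_test_split(y, test_size=0.2, random_state=seed)
    tts_pos.append(int(y_test.sum()))

print("positives per StratifiedKFold test fold:", skf_pos)
print("positives per train_test_split test set:", tts_pos)
```

If the second list varies while the first stays constant, the composition of the splits is a plausible source of the score gap before any difference in the classifier is considered.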

[Scikit-learn-general] Discrepancy in SkLearn Stratified Cross Validation

2015-09-15 Thread Mamun Rashid

I am seeing a discrepancy in classification performance between two cross-validation techniques using the same data. I was wondering if anyone can shed some light on this. Thanks in advance for your help.

Mamun

Method 1: cross_validation.train_test_split
Method 2: StratifiedKFold.

Two Exa