[Scikit-learn-general] modify gridsearch to scale cross-validation training/test dataset

Pagliari, Roberto Thu, 11 Sep 2014 07:49:07 -0700

Hello,
At a high level, grid search with cross-validation (CV) works something like this:

for every combination of parameters:
   for every partition of the training data:
     split training data into train_cv and test_cv
     train_classifier(train_cv).predict(test_cv)
     compute score
   average the scores
   if the average is the max so far, then update best params
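
For concreteness, the loop above can be sketched in Python roughly as follows. This is only an illustration, not the actual code in grid_search.py: the helper name manual_grid_search, the SVC classifier, and the synthetic data are all assumptions, and the import paths assume a recent scikit-learn release with the sklearn.model_selection module.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, ParameterGrid
from sklearn.svm import SVC


def manual_grid_search(X, y, param_grid, n_splits=3):
    """Hypothetical helper: plain grid search with k-fold CV, no scaling."""
    best_params, best_score = None, -np.inf
    # A fixed random_state makes every parameter combination see the same folds.
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for params in ParameterGrid(param_grid):           # every combination of parameters
        scores = []
        for train_idx, test_idx in folds.split(X):     # every partition of training data
            clf = SVC(**params).fit(X[train_idx], y[train_idx])
            scores.append(clf.score(X[test_idx], y[test_idx]))  # accuracy on test_cv
        mean_score = float(np.mean(scores))             # average score
        if mean_score > best_score:                     # max so far: update best params
            best_params, best_score = params, mean_score
    return best_params, best_score


# Synthetic data, used only so the sketch runs on its own.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)
best_params, best_score = manual_grid_search(X, y, {"C": [0.1, 1.0, 10.0]})
```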


I would like to do something like this:

for every combination of parameters:
   for every partition of training data:
     split training into train_cv and test_cv
     scaler = StandardScaler()
     scaler.fit(train_cv)
     train_cv = scaler.transform(train_cv)
     test_cv = scaler.transform(test_cv)
     train_classifier(train_cv).predict(test_cv)
     compute score
   average score
   if max so far, then update best params

Basically, I would like to scale the training data and the test data (using the
scaling parameters fitted on the training data only) every time a CV train/test
split is generated.
Can someone suggest the best way to modify grid_search.py to do this?
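
One possible way to get this behaviour without modifying grid_search.py, offered only as a sketch, is to wrap the scaler and the classifier in a Pipeline and pass that to GridSearchCV. The pipeline is cloned and refit on each CV training fold, so the scaler statistics come from train_cv only and are then applied to test_cv. The SVC classifier, the step names "scaler" and "clf", and the synthetic data are assumptions for illustration. The import path assumes a recent scikit-learn release (sklearn.model_selection); releases from around the time of this message exposed GridSearchCV in sklearn.grid_search instead.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, used only so the sketch runs on its own.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)

# The scaler is a pipeline step, so it is fit on each CV training fold only.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", SVC()),
])

# Parameters of a step are addressed as "<step name>__<parameter name>".
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
```

After fitting, search.best_estimator_ is the whole pipeline refit on all of the training data, so calling search.predict on new data applies the same scaling before classification.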

Thank you,
