Hello,
Grid search with CV looks something like this at a high level:
for every combination of parameters:
    for every partition of training data:
        split training into train_cv and test_cv
        train_classifier(train_cv).predict(test_cv)
        compute score
    average score
    if max so far, then update best params
I would like to do something like this:
for every combination of parameters:
    for every partition of training data:
        split training into train_cv and test_cv
        scaler = StandardScaler()
        scaler.fit(train_cv)
        train_cv = scaler.transform(train_cv)
        test_cv = scaler.transform(test_cv)
        train_classifier(train_cv).predict(test_cv)
        compute score
    average score
    if max so far, then update best params
Basically, I would like to scale the training data and the test data (using
parameters fit on the training data only) every time a CV train/test split
is generated.
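
To make the loop above concrete, here is a minimal sketch in Python. It is
not a patch to grid_search.py; it only spells out the intended behaviour.
It assumes a recent scikit-learn where KFold and ParameterGrid live in
sklearn.model_selection, and it uses SVC, a toy dataset from
make_classification, and a helper named grid_search_with_scaling, all of
which are placeholders chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, ParameterGrid
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def grid_search_with_scaling(X, y, param_grid, n_splits=5):
    """Hypothetical helper: grid search that refits the scaler on each CV training fold."""
    best_score, best_params = -np.inf, None
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=0)

    for params in ParameterGrid(param_grid):      # every combination of parameters
        scores = []
        for train_idx, test_idx in cv.split(X):   # every partition of training data
            # Fit the scaler on the training fold only, then apply it to both folds.
            scaler = StandardScaler().fit(X[train_idx])
            train_cv = scaler.transform(X[train_idx])
            test_cv = scaler.transform(X[test_idx])

            # SVC is an assumed stand-in for whatever classifier is being tuned.
            clf = SVC(**params).fit(train_cv, y[train_idx])
            scores.append(clf.score(test_cv, y[test_idx]))  # accuracy on the held-out fold

        mean_score = np.mean(scores)              # average score
        if mean_score > best_score:               # if max so far, update best params
            best_score, best_params = mean_score, params

    return best_params, best_score


# Toy data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
best_params, best_score = grid_search_with_scaling(X, y, {"C": [0.1, 1.0, 10.0]})
print(best_params, best_score)
```

It may also be possible to get the same per-fold scaling without touching
grid_search.py: wrapping StandardScaler and the classifier in a
sklearn.pipeline.Pipeline and passing that pipeline to the grid search
should refit the scaler on each training fold.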
Can someone suggest the best way to modify grid_search.py to do this?
Thank you,
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general