I'm not trying to scale the whole dataset once at the very beginning. I would
like the scaling to happen on each cross-validation split while running
GridSearchCV.

Thanks,


-----Original Message-----
From: Pagliari, Roberto [mailto:[email protected]] 
Sent: Thursday, September 11, 2014 10:52 AM
To: [email protected]
Subject: Re: [Scikit-learn-general] modify gridsearch to scale cross-validation 
training/test dataset

I'm not sure how to do it when using gridsearch. Can you provide an example?

Thank you,


-----Original Message-----
From: Gael Varoquaux [mailto:[email protected]]
Sent: Thursday, September 11, 2014 10:50 AM
To: [email protected]
Subject: Re: [Scikit-learn-general] modify gridsearch to scale cross-validation 
training/test dataset

Use a pipeline.
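
A minimal sketch of what that could look like, offered as an illustration rather than the one right way to do it. The iris dataset, the SVC classifier, the step names "scaler" and "clf", and the parameter grid are all assumptions picked for the example; none of them come from this thread. The import path assumes a recent scikit-learn, where GridSearchCV lives in sklearn.model_selection (releases from the time of this thread exposed it from sklearn.grid_search instead).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example data only; substitute your own X and y.
X, y = load_iris(return_X_y=True)

# Chain the scaler and the classifier into a single estimator.
# The step names ("scaler", "clf") are arbitrary labels chosen here.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", SVC()),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
param_grid = {
    "clf__C": [0.1, 1, 10],
    "clf__gamma": [0.01, 0.1, 1],
}

# For every parameter combination and every CV split, the whole pipeline
# is cloned and fitted on the training part of the split only, so the
# scaler's mean and variance never see the held-out part.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

Because the scaler is a step of the estimator being searched, there should be no need to touch grid_search.py: the fit/transform on train_cv and the transform on test_cv happen inside each split automatically.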

G

On Thu, Sep 11, 2014 at 02:47:48PM +0000, Pagliari, Roberto wrote:
> Hello,

> Gridsearch with CV is something like this at a high level:

> for every combination of parameters:
>    for every partition of training data
>      split training into train_cv and test_cv
>      train_classifier(train_cv).predict(test_cv)
>      compute score
>    average score
>    if max so far, then update best params



> I would like to do something like this:

> for every combination of parameters:
>    for every partition of training data
>      split training into train_cv and test_cv
>      scaler = StandardScaler()
>      scaler.fit(train_cv)
>      train_cv = scaler.transform(train_cv)
>      test_cv = scaler.transform(test_cv)
>      train_classifier(train_cv).predict(test_cv)
>      compute score
>    average score
>    if max so far, then update best params



> Basically, I would like to scale the training and test data (using
> parameters fitted on the training data only) every time a CV train/test
> split is generated.

> Can someone suggest the best way to modify grid_search.py to do this?



> Thank you,




> ------------------------------------------------------------------------------
> Want excitement?
> Manually upgrade your production database.
> When you want reliability, choose Perforce
> Perforce version control. Predictably reliable.
> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk

> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general


-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    Laboratoire de Neuro-Imagerie Assistee par Ordinateur
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

