Re: [Scikit-learn-general] modify gridsearch to scale cross-validation training/test dataset

Josh Vredevoogd Thu, 11 Sep 2014 09:53:30 -0700

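You're missing "estimators =" at the start of the first line, I guess.
Also, the keys in the parameter grid use the step name you gave in the
pipeline ('linear_svm'), not the class name, so params should be: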
params = dict(linear_svm__C=[0.1, 10, 100])
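
Putting both fixes together, a minimal sketch of the whole setup might look
like the following. It assumes a recent scikit-learn, where the scaler class
is StandardScaler and GridSearchCV lives in sklearn.model_selection (the
Scaler name in the quoted code comes from older releases). The toy dataset,
cv=5 and max_iter=10000 are my own assumptions, added only so the snippet
runs end to end.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Toy data (an assumption, not part of the original question).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Step names are arbitrary, but the parameter grid must refer to them.
estimators = [('scaler', StandardScaler()),
              ('linear_svm', LinearSVC(max_iter=10000))]
clf = Pipeline(estimators)

# Grid keys follow the pattern <step name>__<parameter name>.
params = dict(linear_svm__C=[0.1, 10, 100])

# The whole pipeline is refit on every CV training fold, so the scaler
# only ever sees the training part of each split.
gs = GridSearchCV(clf, param_grid=params, cv=5)
gs.fit(X, y)

print(gs.best_params_)
```

After fitting, gs.best_estimator_ is the full pipeline (scaler plus
classifier) refit on all of X, so gs.predict() scales new data with the
same fitted scaler before classifying it.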


On Thu, Sep 11, 2014 at 9:44 AM, Pagliari, Roberto <[email protected]>
wrote:

> Hi,
> Yes, I think you are right.
>
> Is the code below how it should be done (scaling+linearsvc)?
>
> [('scaler', Scaler()), ('linear_svm', LinearSVC())]
> clf = Pipeline(estimators)
> params = dict(LinearSVC__C=[0.1, 10, 100])
> gs = GridSearchCV(clf, param_grid=params)
>
> Thank you,
>
>
> -----Original Message-----
> From: Laurent Direr [mailto:[email protected]]
> Sent: Thursday, September 11, 2014 11:15 AM
> To: [email protected]
> Subject: Re: [Scikit-learn-general] modify gridsearch to scale
> cross-validation training/test dataset
>
> Hello,
>
> I think a pipeline does precisely what you are asking for:
> http://scikit-learn.org/stable/modules/pipeline.html
>
> If you include the scaler as a step in the pipeline it should behave the
> way you described in your first email.
>
> Laurent
>
> On 09/11/2014 04:59 PM, Pagliari, Roberto wrote:
> > I'm not trying to scale the dataset at the very beginning. I would like
> > to scale while doing gridsearchCV.
> >
> > Thanks,
> >
> >
> > -----Original Message-----
> > From: Pagliari, Roberto [mailto:[email protected]]
> > Sent: Thursday, September 11, 2014 10:52 AM
> > To: [email protected]
> > Subject: Re: [Scikit-learn-general] modify gridsearch to scale
> > cross-validation training/test dataset
> >
> > I'm not sure how to do it when using gridsearch. Can you provide an
> > example?
> >
> > Thank you,
> >
> >
> > -----Original Message-----
> > From: Gael Varoquaux [mailto:[email protected]]
> > Sent: Thursday, September 11, 2014 10:50 AM
> > To: [email protected]
> > Subject: Re: [Scikit-learn-general] modify gridsearch to scale
> > cross-validation training/test dataset
> >
> > Use a pipeline.
> >
> > G
> >
> > On Thu, Sep 11, 2014 at 02:47:48PM +0000, Pagliari, Roberto wrote:
> >> Hello,
> >> Gridsearch with CV is something like this at a high level:
> >
> >
> >> for every combination of parameters:
> >>     for every partition of training data
> >>       split training into train_cv and test_cv
> >>       train_classifier(train_cv).predict(test_cv)
> >>       compute score
> >>     average score
> >>     if max so far, then update best params
> >
> >
> >> I would like to do something like this:
> >
> >
> >> for every combination of parameters:
> >>     for every partition of training data
> >>       split training into train_cv and test_cv
> >>       scaler = StandardScaler()
> >>       scaler.fit(train_cv)
> >>       train_cv = scaler.transform(train_cv)
> >>       test_cv = scaler.transform(test_cv)
> >>       train_classifier(train_cv).predict(test_cv)
> >>       compute score
> >>     average score
> >>     if max so far, then update best params
> >
> >
> >> Basically, I would like to scale the training data and the test data
> >> (using parameters fitted on the training data only) every time a CV
> >> train/test split is generated.
> >> Can someone suggest the best way to modify grid_search.py to do this?
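
For reference, the loop described above can be written out by hand without
touching grid_search.py. The sketch below is only an illustration of what a
Pipeline passed to GridSearchCV does on each split; the toy data, the
scaled_cv_score helper name, the candidate C values and the choice of
StratifiedKFold are all assumptions of mine, not part of the original post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Toy data (an assumption, only so the example runs).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)


def scaled_cv_score(X, y, C, n_splits=5):
    """Mean accuracy over folds; the scaler is fit on each training fold only."""
    cv = StratifiedKFold(n_splits=n_splits)
    scores = []
    for train_idx, test_idx in cv.split(X, y):
        # Fit the scaling parameters on the training fold...
        scaler = StandardScaler().fit(X[train_idx])
        X_train = scaler.transform(X[train_idx])
        # ...and reuse them unchanged on the held-out fold.
        X_test = scaler.transform(X[test_idx])
        model = LinearSVC(C=C, max_iter=10000).fit(X_train, y[train_idx])
        scores.append(model.score(X_test, y[test_idx]))
    return float(np.mean(scores))


# Outer loop over parameter values: keep the one with the best mean score.
candidates = [0.1, 10, 100]
mean_scores = {C: scaled_cv_score(X, y, C) for C in candidates}
best_C = max(mean_scores, key=mean_scores.get)
print(best_C, mean_scores[best_C])
```

The hand-written version is mainly useful for checking your understanding;
in practice the pipeline approach is shorter and harder to get wrong.
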
> >
> >
> >> Thank you,
> >
> >
> >
> >> ------------------------------------------------------------------------------
> >> Want excitement?
> >> Manually upgrade your production database.
> >> When you want reliability, choose Perforce
> >> Perforce version control. Predictably reliable.
> >> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
> >> _______________________________________________
> >> Scikit-learn-general mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >
>

