Re: [Scikit-learn-general] modify gridsearch to scale cross-validation training/test dataset

Pagliari, Roberto Thu, 11 Sep 2014 11:01:23 -0700

I'm getting errors when using these parameters:

linear_svm__penalty
linear_svm__loss
linear_svm__dual


Don't they have the same names?

I tried the linear_svc prefix as well, but it doesn't work either.
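
One way to check which names param_grid accepts is to ask the pipeline itself. 
The following is a minimal sketch, assuming a recent scikit-learn release and 
the step names 'scaler' and 'linear_svm' from the pipeline quoted below; 
StandardScaler stands in for Scaler here. If the three names above do show up 
in the output, the errors may instead come from a combination of penalty, loss 
and dual that LinearSVC does not support, which is a possibility rather than 
something confirmed in this thread.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Step names follow the pipeline quoted later in this thread (an assumption
# about the poster's actual code).
clf = Pipeline([('scaler', StandardScaler()), ('linear_svm', LinearSVC())])

# get_params() lists every tunable name; nested parameters take the form
# '<step name>__<parameter>'.
svm_params = sorted(
    name for name in clf.get_params() if name.startswith('linear_svm__')
)
print(svm_params)
```

Any key printed by this snippet can be used as-is in param_grid.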

Thank you


-----Original Message-----
From: Laurent Direr [mailto:[email protected]] 
Sent: Thursday, September 11, 2014 12:57 PM
To: [email protected]
Subject: Re: [Scikit-learn-general] modify gridsearch to scale cross-validation 
training/test dataset

Hi,

If you test this code you will see it raises an error ;).

The parameter names in param_grid must be consistent with the step names 
given to the Pipeline object.
GridSearchCV searches over the Pipeline's own parameters, which are prefixed 
with the step name, so it cannot resolve 'LinearSVC__C': no step is called 
'LinearSVC'.

If you replace it with 'linear_svm__C' it works just fine.
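
For reference, a corrected version can be sketched as below. It assumes a 
recent scikit-learn, where GridSearchCV is imported from 
sklearn.model_selection (releases from the time of this thread exposed it 
from sklearn.grid_search) and StandardScaler is used in place of Scaler. The 
synthetic dataset, cv=5 and max_iter=10000 are illustrative choices, not part 
of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data, only so that the sketch runs end to end.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

estimators = [
    ('scaler', StandardScaler()),
    ('linear_svm', LinearSVC(max_iter=10000)),
]
clf = Pipeline(estimators)

# Keys are '<step name>__<parameter>': the prefix is the step name
# 'linear_svm', not the class name 'LinearSVC'.
params = {'linear_svm__C': [0.1, 10, 100]}

# For each CV split the whole pipeline is refit, so the scaler is fitted on
# the training fold only and then applied to the held-out fold.
gs = GridSearchCV(clf, param_grid=params, cv=5)
gs.fit(X, y)

print(gs.best_params_, gs.best_score_)
```

Because the scaler is a pipeline step, no change to grid_search.py is needed 
to get per-fold scaling.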


On 09/11/2014 06:44 PM, Pagliari, Roberto wrote:
> Hi,
> Yes, I think you are right.
>
> Is the code below how it should be done (scaling+linearsvc)?
>
> estimators = [('scaler', Scaler()), ('linear_svm', LinearSVC())]
> clf = Pipeline(estimators)
> params = dict(LinearSVC__C=[0.1, 10, 100])
> gs = GridSearchCV(clf, param_grid=params)
>
> Thank you,
>
>
> -----Original Message-----
> From: Laurent Direr [mailto:[email protected]]
> Sent: Thursday, September 11, 2014 11:15 AM
> To: [email protected]
> Subject: Re: [Scikit-learn-general] modify gridsearch to scale 
> cross-validation training/test dataset
>
> Hello,
>
> I think a pipeline does precisely what you are asking for:
> http://scikit-learn.org/stable/modules/pipeline.html
>
> If you include the scaler as a step in the pipeline it should behave the way 
> you described in your first email.
>
> Laurent
>
> On 09/11/2014 04:59 PM, Pagliari, Roberto wrote:
>> I'm not trying to scale the dataset at the very beginning. I would like to 
>> scale while doing gridsearchCV.
>>
>> Thanks,
>>
>>
>> -----Original Message-----
>> From: Pagliari, Roberto [mailto:[email protected]]
>> Sent: Thursday, September 11, 2014 10:52 AM
>> To: [email protected]
>> Subject: Re: [Scikit-learn-general] modify gridsearch to scale 
>> cross-validation training/test dataset
>>
>> I'm not sure how to do it when using gridsearch. Can you provide an example?
>>
>> Thank you,
>>
>>
>> -----Original Message-----
>> From: Gael Varoquaux [mailto:[email protected]]
>> Sent: Thursday, September 11, 2014 10:50 AM
>> To: [email protected]
>> Subject: Re: [Scikit-learn-general] modify gridsearch to scale 
>> cross-validation training/test dataset
>>
>> Use a pipeline.
>>
>> G
>>
>> On Thu, Sep 11, 2014 at 02:47:48PM +0000, Pagliari, Roberto wrote:
>>> Hello,
>>> Gridsearch with CV is something like this at a high level:
>>
>>> for every combination of parameters:
>>>      for every partition of training data
>>>        split training into train_cv and test_cv
>>>        train_classifier(train_cv).predict(test_cv)
>>>        compute score
>>>      average score
>>>      if max so far, then update best params
>>
>>> I would like to do something like this:
>>
>>> for every combination of parameters:
>>>      for every partition of training data
>>>        split training into train_cv and test_cv
>>>        scaler = StandardScaler()
>>>        scaler.fit(train_cv)
>>>        train_cv = scaler.transform(train_cv)
>>>        test_cv = scaler.transform(test_cv)
>>>        train_classifier(train_cv).predict(test_cv)
>>>        compute score
>>>      average score
>>>      if max so far, then update best params
>>
>>> basically, I would like to scale training data and test data (using 
>>> training data params) every time a CV train/test is generated.
>>> Can someone suggest the best way to modify grid_search.py to do this?
>>
>>> Thank you,
>>
>>
>>> --------------------------------------------------------------------
>>> -
>>> -
>>> --------
>>> Want excitement?
>>> Manually upgrade your production database.
>>> When you want reliability, choose Perforce Perforce version control.
>>> Predictably reliable.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.
>>> clktrk
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general


