Hi all,

I am having a different issue when trying to use sample_weights with
RandomizedSearchCV:

weights = np.array(calculate_weighting(y_train))
search = RandomizedSearchCV(estimator, param_dist, n_iter=n_iter,
                            scoring="accuracy", n_jobs=-1, iid=True, cv=5,
                            refit=True, verbose=1, random_state=seed,
                            fit_params={"sample_weight": weights})

search.fit(X_train, y_train)

where weights has one entry per instance in X_train.
I get the following error:

ValueError: need more than 1 value to unpack

I am using scikit-learn 0.16.1, therefore a more recent version than
0.15b. Was there some sort of change in the behavior of fit_params
from 0.15b to 0.16.1?
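
For context, the per-fold indexing described in the quoted replies below
amounts to something like the following sketch. It is only an illustration
under stated assumptions: index_fit_params is a hypothetical helper, not
scikit-learn's actual code, and the folds are plain contiguous splits built
with NumPy rather than a real KFold object.

```python
import numpy as np


def index_fit_params(fit_params, n_samples, train_idx):
    """Hypothetical helper: slice sample-length fit params to one training fold."""
    sliced = {}
    for name, value in fit_params.items():
        # Only values with one entry per sample are indexed by the fold;
        # anything else is passed through untouched.
        has_len = hasattr(value, "__len__") and not isinstance(value, str)
        if has_len and len(value) == n_samples:
            sliced[name] = np.asarray(value)[train_idx]
        else:
            sliced[name] = value
    return sliced


n_samples = 1118  # sample count reported in the original question
indices = np.arange(n_samples)
folds = np.array_split(indices, 10)  # 10 contiguous, unshuffled folds

test_idx = folds[0]
train_idx = np.setdiff1d(indices, test_idx)

fit_params = {"sample_weight": np.ones(n_samples), "verbose": False}
fold_params = index_fit_params(fit_params, n_samples, train_idx)

print(len(train_idx), fold_params["sample_weight"].shape)
```

With 1118 samples and 10 folds the first training fold has 1006 samples, which
matches the (1006,) shape in the traceback quoted below: the estimator saw
1006 rows but an unsliced weight vector of length 1118.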

What is the current recommended way to pass the sample_weights vector
to a *SearchCV object, if any?
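
One caveat from the quoted replies below is that fit_params only reaches the
estimator's fit call, so the score used to rank candidates still counts every
sample equally. A small sketch of how much that can matter, assuming a toy
imbalanced problem; weighted_accuracy is a hypothetical stand-in written for
this example and not a scikit-learn function:

```python
import numpy as np


def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of correct predictions, optionally weighted per sample."""
    correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
    # With sample_weight=None this reduces to the plain mean.
    return float(np.average(correct, weights=sample_weight))


y_true = np.array([0, 0, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0])             # always predicts the majority class
weights = np.array([1.0, 1.0, 1.0, 1.0, 4.0])  # up-weight the rare class

unweighted = weighted_accuracy(y_true, y_pred)
weighted = weighted_accuracy(y_true, y_pred, sample_weight=weights)

print(unweighted, weighted)
```

The majority-class predictor looks good on the unweighted score and much worse
once the rare class is up-weighted, so a search scored without weights can
prefer a different model than the weighted fit was aiming for.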

Thanks!
José


On Tue, Jul 8, 2014 at 9:33 AM, Hamed Zamani <hamedzam...@acm.org> wrote:
> Dear Joel,
>
> Yes. After updating the version of Scikit-learn to 0.15b2 the problem was
> solved.
>
> Thanks,
> Hamed
>
>
>
> On Tue, Jul 8, 2014 at 2:51 PM, Joel Nothman <joel.noth...@gmail.com> wrote:
>>
>> This shouldn't be the case, though it's not altogether well-documented.
>> According to
>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L1225,
>> if the fit_params value has the same length as the samples, it should be
>> similarly indexed.
>>
>> So this would be a bug ... if it is found at master. I'm guessing, Hamed,
>> that you are using scikit-learn version 0.14? Please check this works with
>> the latest 0.15b.
>>
>> However, fit_params will not account for the weights in the scoring
>> function. Noel has solved this; pending some more tests, this should
>> hopefully be merged, including support for RandomizedSearchCV(...,
>> sample_weight=weights_array) soon. (The work seems to have stalled a little.
>> If someone wants to see this feature included quickly, perhaps Noel would be
>> willing for someone else to finish this PR for him.)
>>
>> - Joel
>>
>>
>> On 8 July 2014 07:49, Kyle Kastner <kastnerk...@gmail.com> wrote:
>>>
>>> It looks like fit_params are passed wholesale to the classifier being fit
>>> - this means the sample weights will be a different size than the fold of
>>> (X, y) fed to the classifier (since the weights aren't getting KFolded...).
>>> Unfortunately I do not see a way to accommodate this currently -
>>> sample_weights may be a special case where we would need to introspect the
>>> fit_params and modify them before passing to the underlying classifier...
>>> can you file a bug report on github?
>>>
>>>
>>> On Tue, Jul 8, 2014 at 1:27 PM, Hamed Zamani <hamedzam...@acm.org> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am using the scikit-learn library and I want to weight all training
>>>> samples (to account for unbalanced data). According to the tutorial and
>>>> what I found on the web, I should use this method:
>>>>
>>>> search = RandomizedSearchCV(estimator, param_distributions,
>>>>     n_iter=args.iterations, scoring=mae_scorer, n_jobs=1, refit=True,
>>>>     cv=KFold(X_train.shape[0], 10, shuffle=True, random_state=args.seed),
>>>>     verbose=1, random_state=args.seed,
>>>>     fit_params={'sample_weight': weights_array})
>>>>
>>>> search.fit(X_train, y_train)
>>>>
>>>> where "weights_array" is an array containing the weight of each training
>>>> sample. After running the code, it stopped with the following exception:
>>>>
>>>> ValueError: operands could not be broadcast together with shapes (1118,)
>>>> (1006,) (1118,)
>>>>
>>>> It is worth noting that the sizes of "X_train", "y_train", and
>>>> "weights_array" are all equal to 1118.
>>>>
>>>> When I changed the number of folds from 10 to 2, the exception
>>>> changed to this one:
>>>>
>>>> ValueError: operands could not be broadcast together with shapes (1118,)
>>>> (559,) (1118,)
>>>>
>>>> Do you know what the problem is? I guess the problem is with the "KFold"
>>>> method. Any ideas are appreciated.
>>>>
>>>> Kind Regards,
>>>> Hamed
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Open source business process management suite built on Java and Eclipse
>>>> Turn processes into business applications with Bonita BPM Community
>>>> Edition
>>>> Quickly connect people, data, and systems into organized workflows
>>>> Winner of BOSSIE, CODIE, OW2 and Gartner awards
>>>> http://p.sf.net/sfu/Bonitasoft
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> Scikit-learn-general@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>
>>>
>>
>

