This shouldn't be the case, though it isn't well documented.
According to
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L1225,
if a fit_params value has the same length as the samples, it should be
indexed by the training fold in the same way as X and y.
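
To illustrate the behaviour I mean, here is a rough sketch in plain NumPy. It
is not the code at the link above: slice_fit_params is a hypothetical helper
name, and the training indices are a stand-in for one split of a 10-fold
KFold over 1118 samples.

```python
import numpy as np


def slice_fit_params(fit_params, n_samples, train_idx):
    """Hypothetical helper: index per-sample fit params by the training fold."""
    sliced = {}
    for name, value in fit_params.items():
        # Only values with exactly one entry per sample are indexed;
        # anything else (scalars, shorter sequences) is passed through as-is.
        if hasattr(value, "__len__") and len(value) == n_samples:
            sliced[name] = np.asarray(value)[train_idx]
        else:
            sliced[name] = value
    return sliced


n_samples = 1118
weights_array = np.ones(n_samples)

# Stand-in for one training split of 10-fold CV: 1118 - 112 = 1006 samples.
train_idx = np.arange(n_samples - 112)

fold_params = slice_fit_params(
    {"sample_weight": weights_array}, n_samples, train_idx
)
print(fold_params["sample_weight"].shape)
```

With this kind of indexing the weights passed to fit have the same length as
the training fold, so the shape mismatch in the traceback should not occur.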

So this would be a bug ... if it occurs on master. I'm guessing, Hamed,
that you are using scikit-learn version 0.14? Please check whether this
works with the latest 0.15b.

However, fit_params will not account for the weights in the scoring
function. Noel has solved this in
<https://github.com/scikit-learn/scikit-learn/pull/1574>; pending some more
tests, it should hopefully be merged soon, including support for
RandomizedSearchCV(..., sample_weight=weights_array). (The work seems
to have stalled a little. If someone wants to see this feature included
quickly, perhaps Noel would be willing to let someone else finish this PR
for him.)
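
To make the scoring point concrete, here is a small sketch of how a weighted
and an unweighted mean absolute error can differ on the same predictions.
The function name weighted_mae and the toy numbers are mine, purely for
illustration; this is not the implementation in the PR.

```python
import numpy as np


def weighted_mae(y_true, y_pred, sample_weight=None):
    """Hypothetical helper: mean absolute error, optionally sample-weighted."""
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    # np.average falls back to the plain mean when weights is None.
    return float(np.average(errors, weights=sample_weight))


# Toy test fold: the only error is on the sample with the largest weight.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
weights = [1.0, 1.0, 2.0]

unweighted = weighted_mae(y_true, y_pred)
weighted = weighted_mae(y_true, y_pred, sample_weight=weights)
print(unweighted, weighted)
```

A scorer that ignores the weights reports the first number, even when the
estimator itself was fitted with sample_weight.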

- Joel


On 8 July 2014 07:49, Kyle Kastner <kastnerk...@gmail.com> wrote:

> It looks like fit_params are passed wholesale to the classifier being fit
> - this means the sample weights will be a different size than the fold of
> (X, y) fed to the classifier (since the weights aren't getting KFolded...).
> Unfortunately I do not see a way to accommodate this currently -
> sample_weights may be a special case where we would need to introspect the
> fit_params and modify them before passing to the underlying classifier...
> can you file a bug report on github?
>
>
> On Tue, Jul 8, 2014 at 1:27 PM, Hamed Zamani <hamedzam...@acm.org> wrote:
>
>> Dear all,
>>
>> I am using the scikit-learn library and I want to weight all training
>> samples (because the data is unbalanced). According to the tutorial and
>> what I found on the web, I should use this method:
>>
>> search = RandomizedSearchCV(
>>     estimator, param_distributions, n_iter=args.iterations,
>>     scoring=mae_scorer, n_jobs=1, refit=True,
>>     cv=KFold(X_train.shape[0], 10, shuffle=True, random_state=args.seed),
>>     verbose=1, random_state=args.seed,
>>     fit_params={'sample_weight': weights_array})
>>
>> search.fit(X_trains, y_train)
>>
>> where "weights_array" is an array containing the weight of each training
>> sample. After running the code, it stopped with the following exception:
>>
>> ValueError: operands could not be broadcast together with shapes (1118,)
>> (1006,) (1118,)
>>
>> It is worth noting that "X_trains", "y_train", and "weights_array" all
>> have length 1118.
>>
>> When I changed the number of folds from 10 to 2, the exception changed
>> to this one:
>>
>> ValueError: operands could not be broadcast together with shapes (1118,)
>> (559,) (1118,)
>>
>> Do you know what the problem is? I guess the problem is with the "KFold"
>> method. Any ideas are appreciated.
>>
>> Kind Regards,
>> Hamed
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Open source business process management suite built on Java and Eclipse
>> Turn processes into business applications with Bonita BPM Community
>> Edition
>> Quickly connect people, data, and systems into organized workflows
>> Winner of BOSSIE, CODIE, OW2 and Gartner awards
>> http://p.sf.net/sfu/Bonitasoft
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>