Hi all, I am having a different issue when trying to use sample_weight with RandomizedSearchCV:

    weights = np.array(calculate_weighting(y_train))
    search = RandomizedSearchCV(estimator, param_dist, n_iter=n_iter,
                                scoring="accuracy", n_jobs=-1, iid=True,
                                cv=5, refit=True, verbose=1,
                                random_state=seed,
                                fit_params={"sample_weight": weights})
    search.fit(X_train, y_train)

where weights has one entry per instance in X_train. I get the following error:

    ValueError: need more than 1 value to unpack

I am using scikit-learn 0.16.1, therefore a more recent version than 0.15b. Was there some sort of change in the behavior of fit_params from 0.15b to 0.16.1?

What is the current recommended way to pass the sample_weights vector to a *SearchCV object, if any?

Thanks!
José

On Tue, Jul 8, 2014 at 9:33 AM, Hamed Zamani <hamedzam...@acm.org> wrote:
> Dear Joel,
>
> Yes. After updating the version of Scikit-learn to 0.15b2 the problem was
> solved.
>
> Thanks,
> Hamed
>
> On Tue, Jul 8, 2014 at 2:51 PM, Joel Nothman <joel.noth...@gmail.com> wrote:
>>
>> This shouldn't be the case, though it's not altogether well-documented.
>> According to
>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L1225,
>> if the fit_params value has the same length as the samples, it should be
>> similarly indexed.
>>
>> So this would be a bug ... if it is found at master. I'm guessing, Hamed,
>> that you are using scikit-learn version 0.14? Please check this works with
>> the latest 0.15b.
>>
>> However, fit_params will not account for the weights in the scoring
>> function. Noel has solved this; pending some more tests, this should
>> hopefully be merged, including support for RandomizedSearchCV(...,
>> sample_weight=weights_array) soon. (The work seems to have stalled a little.
>> If someone wants to see this feature included quickly, perhaps Noel would be
>> willing for someone else to finish this PR for him.)
>>
>> - Joel
>>
>> On 8 July 2014 07:49, Kyle Kastner <kastnerk...@gmail.com> wrote:
>>>
>>> It looks like fit_params are passed wholesale to the classifier being fit
>>> - this means the sample weights will be a different size than the fold of
>>> (X, y) fed to the classifier (since the weights aren't getting KFolded...).
>>> Unfortunately I do not see a way to accommodate this currently -
>>> sample_weights may be a special case where we would need to introspect the
>>> fit_params and modify them before passing to the underlying classifier...
>>> can you file a bug report on GitHub?
>>>
>>> On Tue, Jul 8, 2014 at 1:27 PM, Hamed Zamani <hamedzam...@acm.org> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am using the Scikit-Learn library and I want to weight all training
>>>> samples (to account for unbalanced data). According to the tutorial and
>>>> what I found on the web, I should use this method:
>>>>
>>>>     search = RandomizedSearchCV(estimator, param_distributions,
>>>>         n_iter=args.iterations, scoring=mae_scorer, n_jobs=1, refit=True,
>>>>         cv=KFold(X_train.shape[0], 10, shuffle=True, random_state=args.seed),
>>>>         verbose=1, random_state=args.seed,
>>>>         fit_params={'sample_weight': weights_array})
>>>>
>>>>     search.fit(X_trains, y_train)
>>>>
>>>> where "weights_array" is an array containing the weight of each training
>>>> sample. After running the code, I was stopped with the following exception:
>>>>
>>>>     ValueError: operands could not be broadcast together with shapes (1118,)
>>>>     (1006,) (1118,)
>>>>
>>>> It is worth noting that the sizes of "X_trains", "y_train", and
>>>> "weights_array" are all equal to 1118.
>>>>
>>>> When I changed the number of folds from 10 to 2, the exception
>>>> changed to this one:
>>>>
>>>>     ValueError: operands could not be broadcast together with shapes (1118,)
>>>>     (559,) (1118,)
>>>>
>>>> Do you know what the problem is? I guess the problem is with the "KFold"
>>>> method. Any idea is appreciated.
>>>>
>>>> Kind Regards,
>>>> Hamed

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
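
For readers following the thread, here is a minimal, library-free sketch of the per-fold indexing Joel describes: any fit parameter with one entry per sample is sliced with the same training indices as X and y before it reaches the estimator. The helpers kfold_indices and index_fit_params are hypothetical names made up for this illustration; this is not scikit-learn's actual code, and the folds are assumed to be contiguous and unshuffled.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) index lists for contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_folds] * n_folds
    # Spread the remainder over the first folds so sizes differ by at most one.
    for i in range(n_samples % n_folds):
        fold_sizes[i] += 1
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def index_fit_params(fit_params, train_idx, n_samples):
    """Slice per-sample fit parameters down to the training fold.

    Hypothetical helper: values whose length equals n_samples are indexed
    with train_idx; everything else is passed through unchanged.
    """
    sliced = {}
    for name, value in fit_params.items():
        is_per_sample = (
            hasattr(value, "__len__")
            and not isinstance(value, str)
            and len(value) == n_samples
        )
        sliced[name] = [value[i] for i in train_idx] if is_per_sample else value
    return sliced


n_samples = 1118                      # size reported in Hamed's message
weights = [1.0] * n_samples           # placeholder weights, one per sample
fit_params = {"sample_weight": weights}

train_sizes = []
for train_idx, test_idx in kfold_indices(n_samples, 10):
    fold_params = index_fit_params(fit_params, train_idx, n_samples)
    # The sliced weights now match the training fold, so no broadcast error.
    assert len(fold_params["sample_weight"]) == len(train_idx)
    train_sizes.append(len(train_idx))

print(train_sizes)
```

With 10 folds over 1118 samples, each training fold holds 1006 or 1007 samples, and with 2 folds it holds 559. Those are the sizes that clash with the full 1118-entry weight vector in the errors quoted above when the weights are passed through unsliced.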