Re: [Scikit-learn-general] Fine tuning parameters of Multi label classification

Andreas Mueller Mon, 04 Jan 2016 10:52:15 -0800

You didn't use a OneVsRestClassifier. SGDClassifier itself can only do multi-class, not multi-label.

It needs to be GridSearchCV(OneVsRestClassifier(SGDClassifier()), ...)
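A minimal sketch of that pattern follows. It rests on several assumptions that are not from the thread: the synthetic data from make_multilabel_classification stands in for Finaldata and y, the grid is shortened, the loss is fixed to 'modified_huber', and GridSearchCV is imported from sklearn.model_selection (its location in later releases) rather than the sklearn.grid_search module that appears in the traceback. Parameters of the wrapped SGDClassifier are addressed through the "estimator__" prefix.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in (assumption) for Finaldata / y: 300 samples, 6 labels each.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=6, random_state=0
)

# A bare SGDClassifier rejects a 2-D label matrix with a ValueError.
try:
    SGDClassifier(random_state=0).fit(X, Y)
    bare_sgd_failed = False
except ValueError:
    bare_sgd_failed = True

# OneVsRestClassifier fits one binary SGD model per label column.
# The "estimator__" prefix routes each grid entry to the wrapped SGDClassifier.
param_grid = {
    "estimator__alpha": [0.001, 0.01, 0.1],
    "estimator__penalty": ["l1", "l2", "elasticnet"],
}

search = GridSearchCV(
    OneVsRestClassifier(SGDClassifier(loss="modified_huber", random_state=0)),
    param_grid,
    cv=3,
    scoring="precision_weighted",
)
search.fit(X, Y)

print("bare SGDClassifier failed on 2-D y:", bare_sgd_failed)
print("best parameters:", search.best_params_)
```

Without the prefix, GridSearchCV would try to set alpha and penalty on the OneVsRestClassifier itself, which has no such parameters.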


On 01/04/2016 02:15 AM, Startup Hire wrote:

Providing the full stack trace here: [code in previous email]

# Tuning hyper-parameters for precision
()
---------------------------------------------------------------------------
ValueError                                 Traceback (most recent call last)
<ipython-input-85-7fedbaf85b7d> in <module>()
     18                        scoring='%s_weighted' % score)
     19
---> 20     clf.fit(Finaldata, y)
     21
     22     print("Best parameters set found on development set:")

D:\Anaconda\lib\site-packages\sklearn\grid_search.pyc in fit(self, X, y)
    730
    731         """
--> 732         return self._fit(X, y, ParameterGrid(self.param_grid))
    733
    734

D:\Anaconda\lib\site-packages\sklearn\grid_search.pyc in _fit(self, X, y, parameter_iterable)
    503                                     self.fit_params, return_parameters=True,
    504                                     error_score=self.error_score)
--> 505                 for parameters in parameter_iterable
    506                 for train, test in cv)
    507

D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.pyc in __call__(self, iterable)
    657             self._iterating = True
    658             for function, args, kwargs in iterable:
--> 659                 self.dispatch(function, args, kwargs)
    660
    661             if pre_dispatch == "all" or n_jobs == 1:

D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.pyc in dispatch(self, func, args, kwargs)
    404         """
    405         if self._pool is None:
--> 406             job = ImmediateApply(func, args, kwargs)
    407             index = len(self._jobs)
    408             if not _verbosity_filter(index, self.verbose):

D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.pyc in __init__(self, func, args, kwargs)
    138         # Don't delay the application, to avoid keeping the input
    139         # arguments in memory
--> 140         self.results = func(*args, **kwargs)
    141
    142     def get(self):

D:\Anaconda\lib\site-packages\sklearn\cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, error_score)
   1457             estimator.fit(X_train, **fit_params)
   1458         else:
-> 1459             estimator.fit(X_train, y_train, **fit_params)
   1460
   1461     except Exception as e:

D:\Anaconda\lib\site-packages\sklearn\linear_model\stochastic_gradient.pyc in fit(self, X, y, coef_init, intercept_init, class_weight, sample_weight)
    562                          loss=self.loss, learning_rate=self.learning_rate,
    563                          coef_init=coef_init, intercept_init=intercept_init,
--> 564                          sample_weight=sample_weight)
    565
    566

D:\Anaconda\lib\site-packages\sklearn\linear_model\stochastic_gradient.pyc in _fit(self, X, y, alpha, C, loss, learning_rate, coef_init, intercept_init, sample_weight)
    401             self.classes_ = None
    402
--> 403         X, y = check_X_y(X, y, 'csr', dtype=np.float64, order="C")
    404         n_samples, n_features = X.shape
    405

D:\Anaconda\lib\site-packages\sklearn\utils\validation.pyc in check_X_y(X, y, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric)
    447                         dtype=None)
    448     else:
--> 449         y = column_or_1d(y, warn=True)
    450         _assert_all_finite(y)
    451     if y_numeric and y.dtype.kind == 'O':

D:\Anaconda\lib\site-packages\sklearn\utils\validation.pyc in column_or_1d(y, warn)
    483         return np.ravel(y)
    484
--> 485     raise ValueError("bad input shape {0}".format(shape))
    486
    487

ValueError: bad input shape (914551, 6)

On Mon, Dec 28, 2015 at 4:08 PM, Startup Hire wrote:


    Hi all,

    Hope you are doing well.

    I am working on fine-tuning the parameters below for an SGDClassifier
    that I am using inside a OneVsRestClassifier.

    I am using GridSearchCV to do the tuning.

    I have the following questions:

     1. How do I use GridSearchCV to optimize a OneVsRestClassifier?
     2. Is there any reason why the code below does not work? The error is
        "bad input shape", even though classifier.fit works fine on its own.






    from sklearn.grid_search import GridSearchCV


    # Set the parameters by cross-validation

    tuned_parameters = [{'alpha': [0.001, 0.01, 0.1, 0.5],
                         'penalty': ['l1', 'l2', 'elasticnet'],
                         'loss': ['log', 'modified_huber']}]


    scores = ['precision', 'recall']

    for score in scores:
        print("# Tuning hyper-parameters for %s" % score)
        print()

        clf = GridSearchCV(SGDClassifier(random_state=0, learning_rate='optimal',
                                         class_weight='auto', n_iter=100),
                           tuned_parameters, cv=5,
                           scoring='%s_weighted' % score)
        clf.fit(Finaldata, y)
        print("Best parameters set found on development set:")
        print()
        print(clf.best_params_)
        print()


    Regards,
    Sanant




------------------------------------------------------------------------------


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
