Hi Yury,

Can you provide the shapes of val_x and val_y via val_x.shape and val_y.shape? Scikit-learn expects "X" to have the shape (n_samples, n_features), and "y" should have the shape (n_samples,). For example, if your training dataset consists of only 1 column, this can easily lead to problems. E.g., instead of
>>> import numpy as np
>>> X = np.array([1,2,3,4])
>>> X.shape
(4,)

you can transform the array as follows:

>>> X.reshape(-1, 1)
array([[1],
       [2],
       [3],
       [4]])

Best,
Sebastian

> On May 11, 2015, at 9:30 AM, Yury Zhauniarovich <y.zhalnerov...@gmail.com> wrote:
>
> Dear all,
>
> I am quite new to sklearn and I do not know precisely but it seems that I
> found a potential issue in CalibratedClassifierCV. I run result calibration
> on SVC and get the following error:
>
> Traceback (most recent call last):
>   File "svc_test_with_calibration.py", line 99, in <module>
>     cal_clf = CalibratedClassifierCV(clf, method='sigmoid', cv='prefit')
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/calibration.py", line 137, in fit
>     calibrated_classifier.fit(X, y)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/calibration.py", line 309, in fit
>     calibrator.fit(this_df, Y[:, k], sample_weight)
> IndexError: index 9 is out of bounds for axis 1 with size 9
>
> Here is the code that I use:
>
> #parameters
> params = {
>     'kernel': 'rbf',
>     'C': 1.0,
>     'shrinking': False,
>     'degree': 3,
>     'probability' : True,
>     'gamma' : 0.0,
>     'coef0' : 0.0,
>     'cache_size' : 300,
>     'class_weight' : None,
>     'max_iter' : -1,
>     'random_state' : 123,
>     'penalty' : 'l2',
>     'dual' : False,
> }
>
> print "SVC..."
> pretty_print(params)
>
> print "Training uncalibrated..."
> clf = SVC(**params)
> clf.fit(train_x, train_y)
> uncal_clf_probs = clf.predict_proba(test_x)
>
> print "Calibrating..."
> cal_clf = CalibratedClassifierCV(clf, method='sigmoid', cv='prefit')
> cal_clf.fit(val_x, val_y)
> cal_clf_probs = cal_clf.predict_proba(test_x)
>
> ll_uncal = log_loss(test_y, uncal_clf_probs)
> ll_cal = log_loss(test_y, cal_clf_probs)
>
> The error happens in line: cal_clf.fit(val_x, val_y) However, if I run
> similar code on ExtraTreesClassifier everything works as expected. Could
> someone tell me if it is a bug s.t. I can report this issue on github? Or am
> I doing something wrong?
>
> Best Regards,
> Yury Zhauniarovich
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
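
P.S. The shape check mentioned at the top of this message could be sketched roughly as below. This is only an illustration under assumptions: the helper names as_2d_features and check_xy are hypothetical (they are not part of scikit-learn), and the small arrays are made-up stand-ins for val_x and val_y. It only verifies the (n_samples, n_features) and (n_samples,) layout and may not be the cause of the error above.

```python
import numpy as np


def as_2d_features(X):
    """Return X as a 2-D array of shape (n_samples, n_features).

    Hypothetical helper: a 1-D input is treated as one feature column.
    """
    X = np.asarray(X)
    if X.ndim == 1:
        # (n_samples,) -> (n_samples, 1)
        X = X.reshape(-1, 1)
    if X.ndim != 2:
        raise ValueError("expected a 1-D or 2-D array, got %d dimensions" % X.ndim)
    return X


def check_xy(X, y):
    """Check that X is (n_samples, n_features) and y is (n_samples,)."""
    X = as_2d_features(X)
    y = np.asarray(y)
    if y.ndim != 1:
        raise ValueError("y should have shape (n_samples,), got %s" % (y.shape,))
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            "X has %d samples but y has %d" % (X.shape[0], y.shape[0])
        )
    return X, y


# Made-up stand-ins for val_x and val_y (assumption, not the real data).
val_x = np.array([1, 2, 3, 4])
val_y = np.array([0, 1, 0, 1])

val_x, val_y = check_xy(val_x, val_y)
print("X: %s, y: %s" % (val_x.shape, val_y.shape))
```

Running the real val_x and val_y through a check like this before calling cal_clf.fit(val_x, val_y) would at least rule out a shape mismatch.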