Hi Yury,
Can you provide the shapes of val_x and val_y via val_x.shape and val_y.shape?
Scikit-learn expects "X" to have the shape (n_samples, n_features), and "y"
should have the shape (n_samples,).
For example, if your training dataset consists of only 1 column and is stored
as a 1-D array, it can easily lead to problems. E.g., if you have
>>> X = np.array([1,2,3,4])
>>> X.shape
(4,)
you can transform the array into a single column as follows:
>>> X.reshape(-1, 1)
array([[1],
       [2],
       [3],
       [4]])
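
In case it is useful, here is a minimal sketch of that check wrapped in a small function. The helper name ensure_2d and the labels in y are made up for illustration; they are not part of scikit-learn or NumPy, and your own val_x / val_y may of course need different handling:

```python
import numpy as np


def ensure_2d(X):
    """Return X as a 2-D array of shape (n_samples, n_features).

    Hypothetical helper for illustration only: a 1-D input is treated
    as a single feature column; 2-D input is returned unchanged.
    """
    X = np.asarray(X)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    return X


X = ensure_2d(np.array([1, 2, 3, 4]))
y = np.array([0, 1, 0, 1])  # made-up labels, one per sample

print(X.shape)  # (4, 1)
print(y.shape)  # (4,)

# X and y must agree on the number of samples.
assert X.shape[0] == y.shape[0]
```

Printing the shapes before calling fit is usually the quickest way to rule this kind of problem out.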
Best,
Sebastian
> On May 11, 2015, at 9:30 AM, Yury Zhauniarovich <[email protected]>
> wrote:
>
> Dear all,
>
> I am quite new to sklearn, so I am not certain, but it seems that I have
> found a potential issue in CalibratedClassifierCV. I run calibration
> on SVC and get the following error:
> Traceback (most recent call last):
>   File "svc_test_with_calibration.py", line 99, in <module>
>     cal_clf = CalibratedClassifierCV(clf, method='sigmoid', cv='prefit')
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/calibration.py", line 137, in fit
>     calibrated_classifier.fit(X, y)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/calibration.py", line 309, in fit
>     calibrator.fit(this_df, Y[:, k], sample_weight)
> IndexError: index 9 is out of bounds for axis 1 with size 9
>
> Here is the code that I use:
> #parameters
> params = {
>     'kernel': 'rbf',
>     'C': 1.0,
>     'shrinking': False,
>     'degree': 3,
>     'probability': True,
>     'gamma': 0.0,
>     'coef0': 0.0,
>     'cache_size': 300,
>     'class_weight': None,
>     'max_iter': -1,
>     'random_state': 123,
>     'penalty': 'l2',
>     'dual': False,
> }
>
> print "SVC..."
> pretty_print(params)
>
> print "Training uncalibrated..."
> clf = SVC(**params)
> clf.fit(train_x, train_y)
> uncal_clf_probs = clf.predict_proba(test_x)
>
> print "Calibrating..."
> cal_clf = CalibratedClassifierCV(clf, method='sigmoid', cv='prefit')
> cal_clf.fit(val_x, val_y)
> cal_clf_probs = cal_clf.predict_proba(test_x)
>
> ll_uncal = log_loss(test_y, uncal_clf_probs)
> ll_cal = log_loss(test_y, cal_clf_probs)
>
> The error happens in the line cal_clf.fit(val_x, val_y). However, if I run
> similar code on ExtraTreesClassifier, everything works as expected. Could
> someone tell me whether this is a bug, so that I can report it on GitHub? Or
> am I doing something wrong?
>
> Best Regards,
> Yury Zhauniarovich
> ------------------------------------------------------------------------------
> One dashboard for servers and applications across Physical-Virtual-Cloud
> Widest out-of-the-box monitoring support with 50+ applications
> Performance metrics, stats and reports that give you Actionable Insights
> Deep dive visibility with transaction tracing using APM Insight.
> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general