Hi Rose,
That's a good question. Is your y in this case an array of string labels? I
believe this is a bug that occurs because label encoding happens both in
SGDClassifier and in the compute_class_weight function. I've posted a
work-around on your Stack Overflow question, so you can go ahead and give that
a try.
Can you open an issue for this on GitHub?
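For illustration only, here is a minimal sketch of one way to sidestep the
double encoding. It is an assumption, not necessarily the work-around posted on
Stack Overflow: map the string labels to integers yourself and build an
explicit weight dict instead of relying on class_weight='auto'. The helper
names encode_labels and inverse_frequency_weights are hypothetical, and the
n_samples / (n_classes * count) scaling is just one common inverse-frequency
choice, so it may not match what 'auto' computes in your version.

```python
from collections import Counter


def encode_labels(y):
    """Map arbitrary (e.g. string) labels to integers 0..n_classes-1."""
    classes = sorted(set(y))
    index = {label: i for i, label in enumerate(classes)}
    return classes, [index[label] for label in y]


def inverse_frequency_weights(y):
    """Weight each class by n_samples / (n_classes * count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: float(n_samples) / (n_classes * count)
            for label, count in counts.items()}


y = ["spam", "ham", "ham", "ham"]           # toy labels, made up for the example
classes, y_int = encode_labels(y)           # classes[i] is the original label for i
weights = inverse_frequency_weights(y_int)  # rarer classes get larger weights

# Hypothetical usage, not run here:
#   SGDClassifier(loss='log', penalty='l2', class_weight=weights).fit(X, y_int)
```

Keeping the classes list around lets you translate integer predictions back to
the original string labels afterwards.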
Hope that helps,
Danny
From: [email protected]
Date: Thu, 24 Jul 2014 18:12:36 -0700
To: [email protected]
Subject: [Scikit-learn-general] SGDClassifier with class_weight=auto fails
on linux, but not on osx
When I train a scikit-learn SGDClassifier with these options:
SGDClassifier(loss='log', class_weight=None, penalty='l2'), training completes
with no error.
When I train this classifier with class_weight='auto' on my mac, training still
completes with no error.
Yet when I train this classifier with class_weight='auto' on a linux ec2
instance, I get this error:
    return self.model.fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 484, in fit
    sample_weight=sample_weight)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 388, in _fit
    classes, sample_weight, coef_init, intercept_init)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 335, in _partial_fit
    y_ind)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/class_weight.py", line 43, in compute_class_weight
    raise ValueError("classes should have valid labels that are in y")
What could cause it? I'm running the latest version of scikit-learn. (On
macosx, it worked on both 0.14 and 0.15. On linux, it failed on both 0.15.0b1
and 0.15)
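Since the same scikit-learn release behaves differently on the two machines, it
may help to compare the rest of the environment as well. The sketch below is
only a suggestion: the helper name environment_report and the default package
list are assumptions, and it simply prints whatever versions it finds so the
output from OS X and Linux can be diffed.

```python
import importlib
import platform


def environment_report(packages=("numpy", "scipy", "sklearn")):
    """Collect the Python, platform and package versions into a dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = "not installed"
    return report


# Run this on both machines and compare the two listings.
for key, value in sorted(environment_report().items()):
    print("%s: %s" % (key, value))
```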
Here's the Stack Overflow question I posted:
http://stackoverflow.com/questions/24808821/sgdclassifier-with-class-weight-auto-fails-on-linux-but-not-on-osx
------------------------------------------------------------------------------
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general