When training with partial_fit in scikit-learn, I get the following
warning, yet the program keeps running. How is that possible, and what are
the repercussions, given that the trained model behaves correctly and gives
correct output? Is this something to worry about?

    /usr/lib/python2.7/dist-packages/sklearn/naive_bayes.py:207: RuntimeWarning: divide by zero encountered in log
      self.class_log_prior_ = (np.log(self.class_count_)

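For what it is worth, the behaviour in the message above can be reproduced with NumPy alone. The following is a minimal sketch, assuming the warning comes from a class count of zero (for example a class listed in classes= that has no samples in the current batch); the array values are made up for illustration. By default NumPy reports log(0) as a RuntimeWarning and returns -inf, which is why execution continues.

```python
import warnings
import numpy as np

# Hypothetical per-class sample counts; the middle class has not been seen yet.
class_count = np.array([3.0, 0.0, 5.0])

# By default NumPy reports log(0) as a RuntimeWarning and returns -inf.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    log_count = np.log(class_count)

print([w.category.__name__ for w in caught])
print(log_count)

# Asking NumPy to raise instead turns the same operation into an exception.
raised = False
with np.errstate(divide="raise"):
    try:
        np.log(class_count)
    except FloatingPointError:
        raised = True
print(raised)
```

If that assumption holds, the unseen class simply gets a log prior of -inf (probability zero) until samples for it arrive in a later batch.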
I am using the following modified training function because I have to
maintain a constant list of labels/classes: partial_fit does not allow
adding new classes/labels on subsequent calls. The class prior is the same
in each batch of training data:

class MySklearnClassifier(SklearnClassifier):
    def train(self, labeled_featuresets, classes=None, partial=True):
        """
        Train (fit) the scikit-learn estimator.

        :param labeled_featuresets: A list of ``(featureset, label)``
            where each ``featureset`` is a dict mapping strings to
            either numbers, booleans or strings.
        """
        X, y = list(compat.izip(*labeled_featuresets))
        X = self._vectorizer.fit_transform(X)
        y = self._encoder.fit_transform(y)
        if partial:
            classes = self._encoder.fit_transform(list(set(classes)))
            self._clf.partial_fit(X, y, classes=list(set(classes)))
        else:
            self._clf.fit(X, y)
        return self

Also, the second call to partial_fit raises the following error. There are
2000 classes and 3592 training samples, and the call is
model = self.train(featureset, classes=labels, partial=partial):

    self._clf.partial_fit(X, y, classes=list(set(classes)))
  File "/usr/lib/python2.7/dist-packages/sklearn/naive_bayes.py", line 277, in partial_fit
    self._count(X, Y)
  File "/usr/lib/python2.7/dist-packages/sklearn/naive_bayes.py", line 443, in _count
    self.feature_count_ += safe_sparse_dot(Y.T, X)
ValueError: operands could not be broadcast together with shapes (2000,11430) (2000,10728) (2000,11430)

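The two shapes in the error differ only in the number of columns (11430 versus 10728), which suggests that the number of features changed between calls. That would be consistent with train() calling fit_transform on the vectorizer for every batch, so each batch gets its own vocabulary. Below is a minimal sketch of that effect and of one possible way around it, not a definitive fix. It uses plain DictVectorizer, LabelEncoder and MultinomialNB rather than the NLTK wrapper, the toy batches and labels are invented, and it assumes the full feature vocabulary and label list can be collected before training starts.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy batches of (featureset, label) pairs.
batch1 = [({"a": 1, "b": 2}, "spam"), ({"b": 1, "c": 1}, "ham")]
batch2 = [({"a": 3, "d": 1}, "spam"), ({"c": 2, "e": 1}, "eggs")]
all_labels = ["spam", "ham", "eggs"]

# Refitting the vectorizer per batch gives a different column count each time.
w1 = DictVectorizer().fit_transform([f for f, _ in batch1]).shape[1]
w2 = DictVectorizer().fit_transform([f for f, _ in batch2]).shape[1]
print(w1, w2)

# Assumption: the vocabulary and the label set are known up front,
# so the vectorizer and the encoder are each fitted exactly once.
vectorizer = DictVectorizer().fit([f for f, _ in batch1 + batch2])
encoder = LabelEncoder().fit(all_labels)
classes = encoder.transform(encoder.classes_)

clf = MultinomialNB()
for batch in (batch1, batch2):
    feats, labels = zip(*batch)
    X = vectorizer.transform(feats)   # transform only, never refit
    y = encoder.transform(labels)     # transform only, never refit
    clf.partial_fit(X, y, classes=classes)

print(clf.feature_count_.shape)
```

With transform instead of fit_transform, features that were not in the fitted vocabulary are silently dropped, so every batch has the same width. If the vocabulary cannot be known in advance, a hashing-based vectorizer with a fixed output size is another option worth looking at.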
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general