Hi Dan, this kind of error can come from overflow. Are all of your test systems the same architecture?

On Tue., 17 Dec. 2019, 12:03 pm Dan Stromberg, <dstromb...@grokstream.com> wrote:
> Hi folks.
>
> I'm new to Scikit-learn.
>
> I have a very large Python project that seems to have a heisenbug which is
> manifesting in scikit-learn code.
>
> Short of constructing an SSCCE, are there any magical techniques I should
> try for pinning down the precise cause? Like valgrind or something?
>
> An SSCCE will most likely be pretty painful: the project has copious
> shared, mutable state, and I've already tried a largish test program that
> calls into the same code path with the error manifesting 0 times in 100.
>
> It's quite possible the root cause will turn out to be some other part of
> the software stack.
>
> The traceback from pytest looks like:
> sequential/test_training.py:101:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../rt/classifier/coach.py:146: in train
>     **self.classifier_section
> ../domain/classifier/factories/classifier_academy.py:115: in create_classifier
>     **kwargs)
> ../domain/classifier/factories/imp/xgb_factory.py:164: in create
>     clf_random.fit(X_train, y_train)
> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:722: in fit
>     self._run_search(evaluate_candidates)
> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:1515: in _run_search
>     random_state=self.random_state))
> ../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:711: in evaluate_candidates
>     cv.split(X, y, groups)))
> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:996: in __call__
>     self.retrieve()
> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:899: in retrieve
>     self._output.extend(job.get(timeout=self.timeout))
> ../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py:517: in wrap_future_result
>     return future.result(timeout=timeout)
> /usr/lib/python3.6/concurrent/futures/_base.py:425: in result
>     return self.__get_result()
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> self = <Future at 0x7f15571ec7f0 state=finished raised ValueError>
>
>     def __get_result(self):
>         if self._exception:
> >           raise self._exception
> E           ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
>
> /usr/lib/python3.6/concurrent/futures/_base.py:384: ValueError
>
>
> The above exception is raised about 12 to 14 times in 100 in full-blown
> automated testing.
>
> Thanks for the cool software.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>