Source Code:

clean_train_reviews = []
for review in train["review"]:
    clean_train_reviews.append(review_to_wordlist(review, remove_stopwords=True))
trainDataVecs = getAvgFeatureVecs(clean_train_reviews, model, num_features)

print "Creating average feature vecs for test reviews"
clean_test_reviews = []
for review in test["review"]:
    clean_test_reviews.append(review_to_wordlist(review, remove_stopwords=True))
testDataVecs = getAvgFeatureVecs(clean_test_reviews, model, num_features)

forest = RandomForestClassifier(n_estimators = 100)
forest = forest.fit(trainDataVecs, train["sentiment"])
result = forest.predict(testDataVecs)

output = pd.DataFrame(data={"id":test["id"], "sentiment":result})
output.to_csv("Word2Vec_AverageVectors.csv", index=False, quoting=3)

Error Message:

Traceback (most recent call last):
  File "/test_IMDB_W2V_RF.py", line 224, in <module>
    result = forest.predict(testDataVecs)
  File "/.local/lib/python2.7/site-packages/sklearn/ensemble/forest.py", line 534, in predict
    proba = self.predict_proba(X)
  File "/.local/lib/python2.7/site-packages/sklearn/ensemble/forest.py", line 573, in predict_proba
    X = self._validate_X_predict(X)
  File "/.local/lib/python2.7/site-packages/sklearn/ensemble/forest.py", line 355, in _validate_X_predict
    return self.estimators_[0]._validate_X_predict(X, check_input=True)
  File "/.local/lib/python2.7/site-packages/sklearn/tree/tree.py", line 365, in _validate_X_predict
    X = check_array(X, dtype=DTYPE, accept_sparse="csr")
  File "/.local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 407, in check_array
    _assert_all_finite(array)
  File "/.local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 58, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Process finished with exit code 1

Description:

Can anyone help with this error message? The training step (forest.fit) completes, and the failure occurs in forest.predict(testDataVecs).

-- 
https://mail.python.org/mailman/listinfo/python-list
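
The ValueError is raised while scikit-learn validates testDataVecs, so at least one test feature vector holds a NaN, an infinity, or a value outside the float32 range. One possible cause, assuming getAvgFeatureVecs divides a summed vector by the number of words it found in the model's vocabulary, is a test review with no in-vocabulary words, which would give 0/0. The body of getAvgFeatureVecs is not shown in the post, so this is only a guess. The sketch below shows how such rows could be located and how an empty review could be averaged safely; find_nonfinite_rows and average_vectors are hypothetical helper names, not part of the posted code or of any library.

```python
import numpy as np


def find_nonfinite_rows(feature_vecs):
    """Return the indices of rows that contain NaN or +/-infinity."""
    arr = np.asarray(feature_vecs, dtype="float32")
    # A row is "bad" if any of its entries is not finite.
    bad_mask = ~np.isfinite(arr).all(axis=1)
    return np.where(bad_mask)[0]


def average_vectors(word_vectors, num_features):
    """Average a list of word vectors; return zeros if the list is empty.

    Hypothetical stand-in for the per-review averaging step; the real
    getAvgFeatureVecs is not shown in the post and may differ.
    """
    if len(word_vectors) == 0:
        # Assumption: an all-zero vector is an acceptable placeholder
        # for a review with no in-vocabulary words.
        return np.zeros(num_features, dtype="float32")
    return np.mean(np.asarray(word_vectors, dtype="float32"), axis=0)


# Made-up data: the second row mimics a review whose average came out as NaN.
demo = np.array([[0.1, 0.2], [np.nan, np.nan], [0.3, 0.4]], dtype="float32")
bad_rows = find_nonfinite_rows(demo)
```

Calling find_nonfinite_rows(testDataVecs) just before forest.predict would list the offending test reviews, which could then be inspected in clean_test_reviews to see whether they are in fact empty.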