Hi all, greetings!

I am getting a JoblibMemoryError while running scikit-learn RandomForestClassifier code. Here is my algorithm in short:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.cross_validation import train_test_split
    import pandas as pd
    import numpy as np

    clf = RandomForestClassifier(n_estimators=5000, n_jobs=1000)
    clf.fit(p_input_features_train, p_input_labels_train)

The dataframe p_input_features contains 134 columns (features) and 5 million rows (observations). The exact *error message* is given below:

    Executing Random Forest Classifier
    Traceback (most recent call last):
      File "/home/user/rf_fold.py", line 43, in <module>
        clf.fit(p_features_train,p_labels_train)
      File "/var/opt/lib/python2.7/site-packages/sklearn/ensemble/forest.py", line 290, in fit
        for i, t in enumerate(trees))
      File "/var/opt/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 810, in __call__
        self.retrieve()
      File "/var/opt/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 757, in retrieve
        raise exception
    sklearn.externals.joblib.my_exceptions.JoblibMemoryError: JoblibMemoryError
    ___________________________________________________________________________
    Multiprocessing exception:
    ...........................................................................
    /var/opt/lib/python2.7/site-packages/sklearn/ensemble/forest.py in fit(self=RandomForestClassifier(bootstrap=True, class_wei...te=None, verbose=0, warm_start=False), X=array([[ 0. , 0. , 0. , .... 0. , 0. ]], dtype=float32), y=array([[ 0.], [ 0.], [ 0.], ..., [ 0.], [ 0.], [ 0.]]), sample_weight=None)
        285         trees = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
        286                          backend="threading")(
        287             delayed(_parallel_build_trees)(
        288                 t, self, X, y, sample_weight, i, len(trees),
        289                 verbose=self.verbose, class_weight=self.class_weight)
    --> 290             for i, t in enumerate(trees))
            i = 4999
        291
        292         # Collect newly grown trees
        293         self.estimators_.extend(trees)
        294
    ...........................................................................

Could you please help me identify a possible resolution to this?

Thanks,
Debu
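
For scale, the dimensions quoted in the post allow a rough estimate of how much memory a single dense copy of the feature matrix takes. This is only a back-of-the-envelope sketch: it assumes all 5 million rows are passed to fit (the training split would be smaller), takes float32 from the dtype shown in the traceback, and uses a hypothetical helper name, dense_matrix_bytes, that is not part of scikit-learn.

```python
# Back-of-the-envelope size of the dense feature matrix from the post.
# Assumption: all 5 million rows are used; a train split would be smaller.
N_ROWS = 5 * 10**6      # observations, as stated in the post
N_COLS = 134            # features, as stated in the post
BYTES_PER_VALUE = 4     # float32, per the dtype shown in the traceback


def dense_matrix_bytes(n_rows, n_cols, bytes_per_value):
    """Return the bytes needed for one dense n_rows x n_cols array.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    return n_rows * n_cols * bytes_per_value


size_bytes = dense_matrix_bytes(N_ROWS, N_COLS, BYTES_PER_VALUE)
size_gib = size_bytes / float(2**30)
print("one copy of X: %d bytes (about %.2f GiB)" % (size_bytes, size_gib))
```

This covers one copy of X only, about 2.5 GiB. It does not count the labels, the 5000 fitted trees, or any per-worker working memory, so the real footprint of the fit is larger by an amount this sketch does not estimate.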
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn