Dear all,

using sklearn 0.13 (fresh Ubuntu 12.04 installation), I'm getting the error
below, which I believe is a memory error. What strikes me is that I'm using
a machine with 512 GB of RAM, so that shouldn't be happening.

Is there maybe a system setting that limits the amount of RAM on a user
basis?
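
One way to look for such a limit from inside Python is the standard-library
resource module. This is only a sketch, assuming a Linux system; the helper
name describe_limit is made up for illustration, and it reports only the
limits of the current process, not every place a limit could be configured.

```python
import resource


def format_limit(value):
    # RLIM_INFINITY means no limit is set for this resource.
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def describe_limit(name):
    # Hypothetical helper: return (name, soft limit, hard limit) as strings.
    soft, hard = resource.getrlimit(getattr(resource, name))
    return name, format_limit(soft), format_limit(hard)


# Address-space and data-segment limits, both in bytes.
limits = [describe_limit(n) for n in ("RLIMIT_AS", "RLIMIT_DATA")]
for name, soft, hard in limits:
    print("%s: soft=%s hard=%s" % (name, soft, hard))
```

If both come back as "unlimited", a per-process memory cap of this kind is
probably not what is being hit.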

With n_features=14000, this is the memory usage:

In [5]: %memit cross_val_score(clf, X, y=y, score_func=score_func, cv=cv, n_jobs=2, verbose=0, fit_params=None)
maximum of 1: 6997.214844 MB per loop

Increasing the number of features to n_features=150000 raises an error.
Here is a minimal example:

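import numpy as np
from sklearn import svm
from sklearn.cross_validation import KFold, cross_val_score  # sklearn 0.13 layout
from sklearn.metrics import accuracy_score

n_samples = 1000  # per class
n_features = 150000
X = np.random.randn(n_samples * 2, n_features)
y = np.repeat([0, 1], n_samples)

clf = svm.LinearSVC(C=1)
score_func = accuracy_score
cv = KFold(y.size, n_folds=3)
scores = cross_val_score(clf, X, y=y, score_func=score_func, cv=cv,
                         n_jobs=2, verbose=0, fit_params=None)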
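
For reference, the size of X in this example can be worked out directly.
A small sketch, assuming the array holds float64 values (8 bytes each),
which is what np.random.randn returns by default; the variable names
x_bytes, x_gib and exceeds_2gib are just for illustration.

```python
n_samples = 1000      # per class, two classes
n_features = 150000
bytes_per_value = 8   # assumption: float64

# Total size of the 2000 x 150000 array in bytes and in GiB.
x_bytes = 2 * n_samples * n_features * bytes_per_value
x_gib = x_bytes / float(2 ** 30)

# Compare against 2**31 bytes (2 GiB).
exceeds_2gib = x_bytes > 2 ** 31

print(x_bytes, round(x_gib, 2), exceeds_2gib)
```

So X alone is about 2.24 GiB: tiny compared to 512 GB, but above 2**31
bytes. Whether that threshold has anything to do with the error when
n_jobs=2 passes data to worker processes is a guess, not something I have
confirmed.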

Error
-------
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
    put(task)
SystemError: NULL result without error in PyObject_Call

Appreciate your help,
 Matthias
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
