On 25 March 2014 05:55, Lars Buitinck <[email protected]> wrote:
> 2014-03-25 9:15 GMT+01:00 Arnaud Joly <[email protected]>:
>> 2. If the model of each SGD classifier is not sparse,  you might blow your 
>> memory.
>
> Actually the model is kept in dense form during training. But
> SGDClassifier's memory use has been drastically cut back in master
> recently.
>

This is really strange. I have moved my implementation to an EC2 instance
with 60GB of memory and I still get a MemoryError! Moreover, only 3.5% of
the RAM was in use when the MemoryError was raised.

Label binarizing now succeeds with the sparse-label_binarizer branch,
but the model fails while passing data to partial_fit, where I
have used 'toarray()' to convert CSR to dense form. I am going insane!

========================================================
# Try partial_fit on a separate classifier for each label column
for i in range(Y.shape[1]):
    estimatorlist.append(clone(classifier))
    for index in range(0, X_train.shape[0], batch_size):
        print("Batch # = %d...\n" % (index // batch_size))
        X_mini, Y_mini = get_minibatch(X_train, Y, batch_size, index)
        estimatorlist[i].partial_fit(X_mini, Y_mini[:,i].toarray(), n_classes)
==========================================================

/home/ec2-user/.local/lib/python2.6/site-packages/sklearn/linear_model/stochastic_gradient.py:327: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Traceback (most recent call last):
  File "mini-batch.py", line 65, in <module>
    estimatorlist[i].partial_fit(X_mini, Y_mini[:,i].toarray(), n_classes)
  File "/home/ec2-user/.local/lib/python2.6/site-packages/sklearn/linear_model/stochastic_gradient.py", line 468, in partial_fit
    coef_init=None, intercept_init=None)
  File "/home/ec2-user/.local/lib/python2.6/site-packages/sklearn/linear_model/stochastic_gradient.py", line 346, in _partial_fit
    coef_init, intercept_init)
  File "/home/ec2-user/.local/lib/python2.6/site-packages/sklearn/linear_model/stochastic_gradient.py", line 175, in _allocate_parameter_mem
    dtype=np.float64, order="C")
MemoryError
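
A rough back-of-the-envelope check of the allocation that fails in
_allocate_parameter_mem, as a sketch only. It assumes the dense coefficient
array has shape (n_classes, n_features) when there are more than two classes
and a single row otherwise, with 8-byte float64 entries as the traceback
suggests. The helper names and the sizes below are hypothetical and are not
taken from my actual data.

```python
# Rough estimate of the dense coefficient array allocated per classifier.
# Assumption: shape is (n_classes, n_features) for more than two classes,
# one row of n_features otherwise, with 8-byte float64 entries.

BYTES_PER_FLOAT64 = 8


def dense_coef_bytes(n_classes, n_features):
    """Approximate size in bytes of a dense float64 coefficient array."""
    rows = n_classes if n_classes > 2 else 1
    return rows * n_features * BYTES_PER_FLOAT64


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / float(1024 ** 3)


# Hypothetical sizes, for illustration only.
binary = dense_coef_bytes(n_classes=2, n_features=1000000)
multi = dense_coef_bytes(n_classes=40000, n_features=1000000)

print("binary problem:     %.4f GiB" % to_gib(binary))
print("multiclass problem: %.1f GiB" % to_gib(multi))
```

If the third positional argument to partial_fit (its `classes` parameter)
holds many labels rather than just the two values of a binarized column, the
requested array could be far larger than the machine's RAM. A request of that
size can fail up front, which might explain a MemoryError while only a small
fraction of memory is in use.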

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
