Hi Ji,

What you are trying to do is called 'online fitting': the model is
updated with each new chunk of data instead of being refit from scratch.
Only a small number of models can do this. In scikit-learn, it is
implemented as a 'partial_fit' method. As far as supervised learning
goes, only SGD does online learning, I believe.
http://scikit-learn.org/stable/modules/sgd.html
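
A minimal sketch of chunked fitting with 'partial_fit' follows. It is an
illustration under stated assumptions, not a tested recipe for the
Kaggle data: the CSV is a small synthetic stand-in built in memory, and
the column names ("label", "pixel0", ...) and the chunk size are made up
for the example. Note that SGDClassifier is a linear model, so this
swaps out KNeighborsClassifier rather than making it incremental.

```python
import io

import numpy as np
import pandas as pd
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for the real training file (assumption: one "label"
# column followed by numeric feature columns).
rng = np.random.RandomState(0)
n_samples, n_features = 600, 20
y = rng.randint(0, 10, size=n_samples)
X = rng.rand(n_samples, n_features)
X[np.arange(n_samples), y] += 3.0  # give each class a distinguishing feature

frame = pd.DataFrame(X, columns=["pixel%d" % i for i in range(n_features)])
frame.insert(0, "label", y)
csv_text = frame.to_csv(index=False)

clf = SGDClassifier(random_state=0)
# partial_fit must be told every class up front, because a single chunk
# may not contain all of them.
all_classes = np.arange(10)

# Read the CSV in chunks so the whole file never sits in memory at once.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=100):
    labels = chunk["label"].values
    features = chunk.drop(columns="label").values
    # Updates the existing model; it does not start over on each chunk.
    clf.partial_fit(features, labels, classes=all_classes)

accuracy = clf.score(X, y)  # training accuracy, only a sanity check
```

With a real file, the StringIO object would be replaced by the file
path. SGD is sensitive to feature scaling, so it is probably worth
rescaling the pixel values (for instance to [0, 1]) inside the loop.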

HTH,

Gael

On Sat, Jul 28, 2012 at 08:18:01PM -0700, Ji H. Park wrote:
> I'm using IPython notebook as the programming environment, and pandas and
> sklearn packages to analyze data from Digit Recognizer
> Tutorial <http://www.kaggle.com/c/digit-recognizer/data>.

> The data is available on the webpage (link above), and the attached is my
> ipython notebook.

> KNeighborsClassifier is used for the prediction.

> Problem:

> "MemoryError" occurs when loading large dataset using read_csv function. To
> bypass this problem temporarily, I have to restart the kernel, after which
> the read_csv function loads the file successfully, but the same error
> occurs when I run the same cell again.

> Anyway, when the read_csv function loads the file successfully, after
> making changes to the dataframe, I can pass the features and labels to the
> KNeighborsClassifier's fit function. At this point, a similar memory error
> occurs.

> I tried the following:
> Iterate through the CSV file in chunks, and fit the data accordingly, but
> the problem is that the predictive model is overwritten every time it
> fits a chunk of data...

> What can I do to make this work?

> Thanks!


-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    Laboratoire de Neuro-Imagerie Assistee par Ordinateur
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general