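Hi Ji,

What you are trying to do is called 'online fitting' (or incremental learning): the model is updated one chunk at a time. Calling 'fit' on each chunk refits the model from scratch, which is why your model gets overwritten. Only a small number of models can do online fitting. In scikit-learn, those models implement it through a 'partial_fit' method. As far as supervised learning goes, only SGD does online learning, I believe:

http://scikit-learn.org/stable/modules/sgd.html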
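A minimal sketch of what chunked fitting with 'partial_fit' might look like, under some assumptions: the CSV has a "label" column followed by pixel columns, and the full set of class labels is known up front. The synthetic data, column names, and chunk size below are illustrative stand-ins, not taken from your notebook. Note that this swaps KNeighborsClassifier for SGDClassifier, since the latter is the one with 'partial_fit'.

```python
import io

import numpy as np
import pandas as pd
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for the real CSV (assumed layout: "label", then pixels).
rng = np.random.RandomState(0)
n_samples, n_pixels = 300, 16
frame = pd.DataFrame(
    rng.randint(0, 256, size=(n_samples, n_pixels)),
    columns=[f"pixel{i}" for i in range(n_pixels)],
)
frame.insert(0, "label", rng.randint(0, 3, size=n_samples))
csv_source = io.StringIO(frame.to_csv(index=False))

# partial_fit needs the complete list of classes on its first call,
# because a single chunk may not contain every label.
all_classes = np.arange(3)
clf = SGDClassifier(random_state=0)

n_seen = 0
# chunksize makes read_csv return an iterator of DataFrames, so only
# one chunk is held in memory at a time.
for chunk in pd.read_csv(csv_source, chunksize=100):
    y = chunk["label"].to_numpy()
    X = chunk.drop(columns="label").to_numpy() / 255.0  # scale to [0, 1]
    clf.partial_fit(X, y, classes=all_classes)  # updates, does not reset
    n_seen += len(chunk)
```

With a real file you would pass its path to read_csv instead of the in-memory buffer. SGD is sensitive to feature scaling, hence the division by 255 for pixel values.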
HTH,

Gael

On Sat, Jul 28, 2012 at 08:18:01PM -0700, Ji H. Park wrote:
> I'm using IPython notebook as the programming environment, and pandas and
> sklearn packages to analyze data from Digit Recognizer
> Tutorial<http://www.kaggle.com/c/digit-recognizer/data>.
> The data is available on the webpage (link above), and the attached is my
> ipython notebook.
> KNeighborsClassifier is used for the prediction.

> Problem:
> "MemoryError" occurs when loading large dataset using read_csv function. To
> bypass this problem temporarily, I have to restart the kernel, after which
> the read_csv function successfully loads the file, but the same error
> occurs when I run the same cell again.

> Anyway, when the read_csv function loads the file successfully, after
> making changes to the dataframe, I can pass the features and labels to the
> KNeighborsClassifier's fit function. At this point, similar memory error
> occurs.

> I tried the following:
> Iterate through the CSV file in chunks, and fit the data accordingly, but
> the problem is that the predictive model is overwritten every time it
> fits a chunk of data...

> What can I do to make this work?

> Thanks!

-- 
Gael Varoquaux
Researcher, INRIA Parietal
Laboratoire de Neuro-Imagerie Assistee par Ordinateur
NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
