2012/1/23 Mathieu Blondel <[email protected]>:
> We need a dump utility to incrementally append data to a mem-mapped
> array or csr matrix. This way, people would be able to do their
> feature extraction in an iterator and create the array / matrix
> incrementally.

I agree, although this would only become really useful once I am done
with the hashing text vectorizer; otherwise the vocabulary dict will
explode in memory. Alternatively, we could write a vocabulary dict
implementation backed by a Redis server.
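
To illustrate why hashing avoids the memory problem, here is a minimal
sketch of the hashing trick. It is not the vectorizer under development:
the function names, the choice of crc32 as the hash, and the 2 ** 20
feature count are all assumptions made for illustration. Each token is
mapped directly to a column index, so no vocabulary dict is ever stored
and rows can be produced one at a time from an iterator.

```python
import zlib
from collections import defaultdict

# Assumed fixed output dimensionality (hypothetical value).
N_FEATURES = 2 ** 20


def hash_features(tokens, n_features=N_FEATURES):
    """Map tokens to {column_index: count} without storing a vocabulary."""
    row = defaultdict(int)
    for token in tokens:
        # crc32 is only a stand-in hash function for this sketch.
        index = zlib.crc32(token.encode("utf-8")) % n_features
        row[index] += 1
    return dict(row)


def iter_rows(documents, n_features=N_FEATURES):
    """Yield one sparse row per document, suitable for incremental dumping."""
    for doc in documents:
        yield hash_features(doc.lower().split(), n_features)


docs = ["the quick brown fox", "the lazy dog"]
rows = list(iter_rows(docs))
```

Memory use depends only on the size of a single row, not on the number
of distinct tokens seen so far. The price is that distinct tokens can
collide on the same column.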

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

