2013/6/20 Gilles Louppe <[email protected]>:
> This looks like the dataset from the Amazon challenge currently
> running on Kaggle. When one-hot-encoded, you end up with roughly
> 15000 binary features, which means that the dense representation
> requires at least 32000*15000*4 bytes to hold in memory (or even twice
> as much, depending on your architecture). I'll let you do the math.

Actually, it is twice as much even on a 32-bit platform: the default
float size is always 64 bits (8 bytes), so 32000*15000*8 bytes.
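
To spell out the math, here is a quick sketch in plain Python. The
helper name dense_nbytes is made up for illustration, and the shape is
just the approximate 32000 x 15000 figure quoted above, not a measured
value:

```python
# Back-of-the-envelope size of a dense 32000 x 15000 matrix.
# Both dimensions are the approximate figures quoted in the thread.
n_samples = 32000
n_features = 15000


def dense_nbytes(n_rows, n_cols, itemsize):
    """Bytes needed for a dense n_rows x n_cols array of itemsize-byte elements.

    Hypothetical helper, written only for this illustration.
    """
    return n_rows * n_cols * itemsize


bytes_float32 = dense_nbytes(n_samples, n_features, 4)  # 4-byte floats
bytes_float64 = dense_nbytes(n_samples, n_features, 8)  # 8-byte floats (the default)

print("float32: %.2f GB" % (bytes_float32 / 1e9))
print("float64: %.2f GB" % (bytes_float64 / 1e9))
```

So the dense array needs a bit under 2 GB with 4-byte floats and a bit
under 4 GB with 8-byte floats, before counting any copies made along
the way.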

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
