So Maheshakya's `toarray` might work with
`X.astype(np.float32).toarray('F')`...
(But by "might work" I mean won't throw a ValueError...)
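
For what it's worth, here is a rough sketch of that conversion together with the memory arithmetic from the quoted thread below. The small random matrix is only a stand-in for the real one-hot-encoded data, and `dense_nbytes` is a hypothetical helper I made up for illustration, not anything in scikit-learn:

```python
import numpy as np
import scipy.sparse as sp

# Small stand-in for the one-hot-encoded matrix (assumption: the real one
# is roughly 32000 x 15000 and far too large to use in a quick example).
rng = np.random.RandomState(0)
X = sp.random(200, 50, density=0.05, format="csr", random_state=rng)

# Cast while the data is still sparse, then densify straight into
# Fortran (column-major) order, so no float64 dense copy is ever built.
X_dense = X.astype(np.float32).toarray(order="F")


def dense_nbytes(n_rows, n_cols, dtype):
    """Bytes needed for a dense n_rows x n_cols array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize


# The estimate from the thread: 32000 * 15000 * 4 bytes for float32,
# and twice that for float64.
print(dense_nbytes(32000, 15000, np.float32))  # 1920000000
print(dense_nbytes(32000, 15000, np.float64))  # 3840000000
```

So even in float32 the dense array is about 1.9 GB, which is why not throwing a ValueError is not the same as fitting in memory.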


On Thu, Jun 20, 2013 at 11:56 PM, Olivier Grisel
<olivier.gri...@ensta.org> wrote:

> 2013/6/20 Lars Buitinck <l.j.buiti...@uva.nl>:
> > 2013/6/20 Gilles Louppe <g.lou...@gmail.com>:
> >> This looks like the dataset from the Amazon challenge currently
> >> running on Kaggle. When one-hot-encoded, you end up with roughly
> >> 15000 binary features, which means that the dense representation
> >> requires at least 32000*15000*4 bytes to hold in memory (or even twice
> >> as much, depending on your architecture). I'll let you do the math.
> >
> > Actually twice as much, even on a 32-bit platform (float size is
> > always 64 bits).
>
> The decision tree code always uses 32-bit floats:
>
>
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L38
>
> but you have to cast your data to `dtype=np.float32` in Fortran layout
> ahead of time to avoid the memory copy.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>