And yes, Gilles, it is the Amazon challenge :D
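
For reference, here is a rough estimate of what a dense copy of the encoded matrix would cost. It assumes only the shape quoted below, (32769, 16600), and ignores any per-array overhead; the helper name `dense_size_gib` is made up for this sketch.

```python
# Shape of X after one-hot encoding, as reported in the quoted message.
n_samples, n_features = 32769, 16600


def dense_size_gib(n_rows, n_cols, bytes_per_value):
    """Size in GiB of a dense n_rows x n_cols array of fixed-width values."""
    return n_rows * n_cols * bytes_per_value / 2**30


size_float64 = dense_size_gib(n_samples, n_features, 8)  # 8 bytes per float64
size_float32 = dense_size_gib(n_samples, n_features, 4)  # 4 bytes per float32

print("float64: %.2f GiB" % size_float64)
print("float32: %.2f GiB" % size_float32)
```

That works out to roughly 4 GiB in float64 and 2 GiB in float32, before any copies made during fitting.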

On Thu, Jun 20, 2013 at 8:21 PM, Maheshakya Wijewardena <
[email protected]> wrote:

> The shape of X after encoding is (32769, 16600). That seems too big to
> convert into a dense matrix. Can a random forest handle this many
> features?
>
>
> On Thu, Jun 20, 2013 at 7:31 PM, Olivier Grisel
> <[email protected]> wrote:
>
>> 2013/6/20 Lars Buitinck <[email protected]>:
>> > 2013/6/20 Olivier Grisel <[email protected]>:
>> >>> Actually twice as much, even on a 32-bit platform (float size is
>> >>> always 64 bits).
>> >>
>> >> The decision tree code always uses 32-bit floats:
>> >>
>> >>
>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L38
>> >>
>> >> but you have to cast your data to `dtype=np.float32` in Fortran layout
>> >> ahead of time to avoid the memory copy.
>> >
>> > OneHot produces np.float, though, which is float64.
>>
>> Alright, but you could convert it to np.float32 before calling toarray.
>> In any case, I think this level of sparsity makes the data unsuitable
>> for random forests.
>>
>> --
>> Olivier
>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>
>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by Windows:
>>
>> Build for Windows Store.
>>
>> http://p.sf.net/sfu/windows-dev2dev
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>
>
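
The conversion suggested in the quoted thread (cast to float32 while the matrix is still sparse, then densify in Fortran layout) might look like the sketch below. The small random matrix is a hypothetical stand-in for the real encoded data, and whether the tree code then skips its own copy depends on the scikit-learn version, so treat that part as an assumption.

```python
import numpy as np
import scipy.sparse as sp

# Small stand-in for the one-hot encoded matrix (hypothetical data; the real
# X has shape (32769, 16600)).
rng = np.random.RandomState(0)
X_sparse = sp.csr_matrix((rng.rand(100, 50) < 0.05).astype(np.float64))

# Cast while still sparse so a float64 dense array is never materialised,
# then densify directly into Fortran (column-major) order.
X_dense = X_sparse.astype(np.float32).toarray(order="F")

print(X_dense.dtype, X_dense.flags["F_CONTIGUOUS"], X_dense.nbytes)
```

On the full-size matrix this still allocates about 2 GiB, so it only helps if that much memory is available.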
