You can try an ordinal encoding instead: just map each categorical value
to an integer, so that you end up with 8 numerical features. If you use
enough trees and grow them deep, it may work.
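
A minimal sketch of that encoding, assuming the categorical columns sit in
a 2-D array. The helper name `ordinal_encode` and the toy data are made up
for illustration. It calls `np.unique(..., return_inverse=True)` on each
column and writes the codes into a float32, Fortran-ordered array, which is
the layout the quoted messages below say the tree code expects.

```python
import numpy as np


def ordinal_encode(X):
    """Map each column's distinct values to integer codes 0..k-1.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    X = np.asarray(X)
    # float32 + Fortran order, per the thread, to avoid a copy in the tree code
    encoded = np.empty(X.shape, dtype=np.float32, order="F")
    for j in range(X.shape[1]):
        # return_inverse yields, for every row, the index of its value
        # in the sorted array of unique values of this column
        _, codes = np.unique(X[:, j], return_inverse=True)
        encoded[:, j] = codes
    return encoded


# Toy data: 3 samples, 3 categorical columns holding arbitrary ID values
X_raw = np.array([[101, 7, 3000],
                  [205, 7, 3001],
                  [101, 9, 3000]])
X_ord = ordinal_encode(X_raw)  # one numeric column per original feature
```

The result keeps one column per original feature, so it stays small enough
to be dense and can be passed to `RandomForestClassifier` directly. The
integer codes impose an arbitrary order on the categories, which is
presumably why many deep trees are needed to separate them.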


2013/6/20 Maheshakya Wijewardena <pmaheshak...@gmail.com>

> And yes Gilles, it is the Amazon challenge :D
>
>
> On Thu, Jun 20, 2013 at 8:21 PM, Maheshakya Wijewardena <
> pmaheshak...@gmail.com> wrote:
>
>> The shape of X after encoding is (32769, 16600). That seems too big to
>> be converted into a dense matrix. Can a random forest handle this many
>> features?
>>
>>
>> On Thu, Jun 20, 2013 at 7:31 PM, Olivier Grisel <olivier.gri...@ensta.org> wrote:
>>
>>> 2013/6/20 Lars Buitinck <l.j.buiti...@uva.nl>:
>>> > 2013/6/20 Olivier Grisel <olivier.gri...@ensta.org>:
>>> >>> Actually twice as much, even on a 32-bit platform (float size is
>>> >>> always 64 bits).
>>> >>
>>> >> The decision tree code always uses 32-bit floats:
>>> >>
>>> >> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L38
>>> >>
>>> >> but you have to cast your data to `dtype=np.float32` in Fortran layout
>>> >> ahead of time to avoid the memory copy.
>>> >
>>> > OneHot produces np.float, though, which is float64.
>>>
>>> Alright, but you could convert it to np.float32 before calling toarray.
>>> In any case, I think this level of sparsity is unsuitable for random
>>> forests.
>>>
>>> --
>>> Olivier
>>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> This SF.net email is sponsored by Windows:
>>>
>>> Build for Windows Store.
>>>
>>> http://p.sf.net/sfu/windows-dev2dev
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
>>
>
>


-- 
Peter Prettenhofer