Hi,

This looks like the dataset from the Amazon challenge currently
running on Kaggle. When one-hot encoded, you end up with roughly
15000 binary features, which means that the dense representation
requires at least 32000*15000*4 bytes to hold in memory (or even twice
as much, depending on your architecture). I'll let you do the math.
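
For reference, the arithmetic above can be sketched in a few lines of
Python. This is only a back-of-the-envelope estimate: the 4-byte and
8-byte item sizes stand in for float32 and float64, and the sparse
figure assumes a CSR layout with 32-bit indices and exactly 8 non-zero
entries per row (one per categorical variable), which is an assumption
on my part and not something measured on the actual data.

```python
# Rough memory estimate for the one-hot encoded matrix discussed above.
n_samples = 32000    # approximate row count (the reported shape is (32769, 8))
n_features = 15000   # rough number of binary columns after one-hot encoding
n_categorical = 8    # categorical variables, so 8 non-zeros per encoded row


def dense_bytes(n_rows, n_cols, itemsize):
    """Bytes needed to store a dense n_rows x n_cols array."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize, index_size=4):
    """Approximate bytes for a CSR matrix: data + indices + indptr.

    Assumes 32-bit indices (index_size=4); real overhead may differ.
    """
    return nnz * itemsize + nnz * index_size + (n_rows + 1) * index_size


float32_bytes = dense_bytes(n_samples, n_features, 4)
float64_bytes = dense_bytes(n_samples, n_features, 8)
sparse_bytes = csr_bytes(n_samples, n_samples * n_categorical, 8)

print("dense float32: %.2f GB" % (float32_bytes / 1e9))   # 1.92 GB
print("dense float64: %.2f GB" % (float64_bytes / 1e9))   # 3.84 GB
print("sparse CSR:    %.2f MB" % (sparse_bytes / 1e6))    # 3.20 MB
```

Under these assumptions the dense copy is roughly three orders of
magnitude larger than the sparse one, which is why the conversion to a
dense array is the step that runs out of memory.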

Gilles

On 20 June 2013 15:24, Joel Nothman <jnoth...@student.usyd.edu.au> wrote:
> Hi Maheshakya,
>
> It's probably right: your feature space is too big and sparse to be
> reasonable for random forests. What sort of categorical data are you
> encoding? What is the shape of the matrix after applying one-hot encoding?
>
> If you need to use random forests, and not a method that natively handles
> sparse data better, you will almost certainly need to reduce your feature
> space one way or another.
>
> - Joel
>
>
>
> On Thu, Jun 20, 2013 at 11:19 PM, Maheshakya Wijewardena
> <pmaheshak...@gmail.com> wrote:
>>
>> The shape is (32769, 8). There are 8 categorical variables before applying
>> OneHotEncoding.
>>
>>
>> On Thu, Jun 20, 2013 at 5:43 PM, Peter Prettenhofer
>> <peter.prettenho...@gmail.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> seems like your sparse matrix is too large to be converted to a dense
>>> matrix. What shape does X have? How many categorical variables do you have
>>> (before applying the OneHotTransformer)?
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> This SF.net email is sponsored by Windows:
>>>
>>> Build for Windows Store.
>>>
>>> http://p.sf.net/sfu/windows-dev2dev
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
