Thanks a lot. That should help!
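
For the archive, here is a minimal sketch of what the advice quoted below seems to amount to, as far as I understand it. The data are synthetic stand-ins (x, y and the class labels are made up for illustration); strata, sampsize and replace are the randomForest arguments named in the thread. My understanding is that the sample is redrawn for every tree, so each tree should see a different 300 rows of Class 2.

```r
library(randomForest)

set.seed(1)

# Synthetic stand-in data (assumption): 300 rows of class 1, 1500 of class 2
n1 <- 300
n2 <- 1500
x <- data.frame(a = c(rnorm(n1, mean = 1), rnorm(n2, mean = 0)),
                b = c(rnorm(n1, mean = 1), rnorm(n2, mean = 0)))
y <- factor(rep(c("class1", "class2"), times = c(n1, n2)))

# Stratify on the class label and draw 300 rows per class for each tree.
# With replace = FALSE, drawing 300 of the 300 class-1 rows means all of
# them are used, while 300 of the 1500 class-2 rows are sampled per tree.
rf <- randomForest(x, y,
                   ntree    = 500,
                   strata   = y,
                   sampsize = c(300, 300),  # order follows levels(y)
                   replace  = FALSE)

print(rf$ntree)
```

Because all Class 1 rows end up in every tree's sample, they should never be out-of-bag, so the OOB error for that class is probably not meaningful here. I take that to be the caveat about giving up the bootstrap.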

Anirudh Kondaveeti
----------------------------


2009/3/20 Uwe Ligges <[email protected]>

>
>
> Uwe Ligges wrote:
>
>>
>>
>> Anirudh Kondaveeti wrote:
>>
>>> To be clearer:
>>>
>>> My data set contains two classes, Class 1 and Class 2.
>>> Class 1 has original data with 300 rows
>>> Class 2 is randomly generated data with 1500 rows.
>>>
>>> I want to sample a new data set with
>>> Class 1 - all the rows
>>> Class 2 - only 300 rows out of 1500 rows
>>>
>>> and then use it in random forest with 500 trees.
>>>
>>> Also, Class 2 should have a different 300 rows for each tree in the
>>> forest. Thanks!
>>>
>>
>>
>> Ah, in that case (stratified sampling) combine the arguments "strata" and
>> "sampsize". In principle, though, you cannot select ALL rows of one class:
>> that ignores one of the main ideas of random forests, which is to bootstrap
>> the observations, and randomForest will certainly bootstrap for you.
>>
>
> In fact, you can use replace = FALSE as well, but then, as I said,
> one of the main ideas of randomForest is ignored.
>
> Uwe Ligges
>
>> Uwe Ligges
>>
>>> Anirudh Kondaveeti
>>> ----------------------------
>>>
>>>
>>> On Fri, Mar 20, 2009 at 1:45 PM, Anirudh Kondaveeti <
>>> [email protected]> wrote:
>>>
>>>> sampsize uses the same sample for all the trees in the random forest.
>>>>
>>>> But I want to use a different sample for each of the 500 trees in the
>>>> random forest. Thanks!
>>>>
>>>>
>>>> Anirudh Kondaveeti
>>>> ----------------------------
>>>>
>>>>
>>>> 2009/3/20 Uwe Ligges <[email protected]>
>>>>
>>>>
>>>>> Anirudh Kondaveeti wrote:
>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> I am working with random forests in R.
>>>>>>
>>>>>> Is there a way to sample a fixed number of rows from a data set for
>>>>>> use with the different trees in a random forest?
>>>>>> To be clearer, my data set contains 1500 rows, and I am growing 500
>>>>>> trees in the random forest.
>>>>>> Is it possible to sample only 500 rows from the data set for each
>>>>>> tree? I mean each tree of the forest should use a different 500 rows
>>>>>> from the data set.
>>>>>>
>>>>>>
>>>>> See ?randomForest and the argument sampsize.
>>>>>
>>>>> Uwe Ligges
>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> Anirudh Kondaveeti
>>>>>> ----------------------------
>>>>>>
>>>>>>       [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> [email protected] mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>>
>>>
>>

