Thanks a lot. That should help!

Anirudh Kondaveeti
----------------------------

2009/3/20 Uwe Ligges <[email protected]>

> Uwe Ligges wrote:
>
>> Anirudh Kondaveeti wrote:
>>
>>> To be more clear,
>>>
>>> My data set contains two classes.. Class 1 and Class 2
>>> Class 1 has original data with 300 rows
>>> Class 2 is randomly generated data with 1500 rows.
>>>
>>> I want to sample a new data set with
>>> Class 1 - all the rows
>>> Class 2 - only 300 rows out of 1500 rows
>>>
>>> and then use it in random forest with 500 trees.
>>>
>>> Also the Class 2 should have different 300 rows for different trees in the
>>> forest. Thanks!
>>
>> Ah, in that case (stratified sampling) combine arguments "strata" and
>> "sampsize", in principle, but you cannot select ALL rows of one class: you
>> somehow ignore one of the main ideas of randomForests to bootstrap
>> observations - and randomForest will certainly bootstrap for you.
>
> In fact, you can also use replace = FALSE as well, but then, as I said,
> one of the main ideas of randomForest is ignored....
>
> Uwe Ligges
>
>> Uwe Ligges
>>
>>> Anirudh Kondaveeti
>>> ----------------------------
>>>
>>> On Fri, Mar 20, 2009 at 1:45 PM, Anirudh Kondaveeti <[email protected]> wrote:
>>>
>>>> sampsize uses the same sample for all the trees in the random Forest.
>>>>
>>>> But I want to use different sample for each tree of the 500 trees in the
>>>> random Forest. Thanks!
>>>>
>>>> Anirudh Kondaveeti
>>>> ----------------------------
>>>>
>>>> 2009/3/20 Uwe Ligges <[email protected]>
>>>>
>>>>> Anirudh Kondaveeti wrote:
>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> I am dealing with random forest using R.
>>>>>>
>>>>>> Is there a way to sample a fixed no.of rows from a dataset for use with
>>>>>> different trees in random Forest.
>>>>>> To be more clear, my data set contains 1500 rows, and I am growing 500 trees
>>>>>> in Random Forest
>>>>>> Is it possible to sample only 500 rows of data from the data set and use it
>>>>>> for different trees in the forest. I mean each tree of the forest should use
>>>>>> a different 500 rows from the data set.
>>>>>
>>>>> See ?randomForest and the argument sampsize.
>>>>>
>>>>> Uwe Ligges
>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> Anirudh Kondaveeti
>>>>>> ----------------------------
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> [email protected] mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
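
For reference, the suggestion in the thread above (combine "strata" and "sampsize", with replace = FALSE) might look roughly like the sketch below. This is only an illustration, not code from the thread: the data frame x, the column names a and b, and the class labels "class1"/"class2" are made-up stand-ins for the poster's 300 original rows and 1500 randomly generated rows. It assumes the randomForest package is installed.

```r
# Sketch only: assumes the randomForest package is installed.
library(randomForest)

set.seed(1)

# Hypothetical stand-in data (not from the thread):
# 300 "original" rows (class1) and 1500 randomly generated rows (class2).
n1 <- 300
n2 <- 1500
x <- data.frame(
  a = c(rnorm(n1, mean = 1), runif(n2, -3, 3)),
  b = c(rnorm(n1, mean = 1), runif(n2, -3, 3))
)
y <- factor(rep(c("class1", "class2"), times = c(n1, n2)))

fit <- randomForest(
  x, y,
  ntree    = 500,
  strata   = y,            # draw the per-tree sample within each class
  sampsize = c(300, 300),  # rows per class, in the order of levels(y)
  replace  = FALSE         # without replacement: 300 of 300 = every class1 row
)

# oob.times counts how often each row was left out of a tree's sample.
# Rows that are in every tree's sample are never out-of-bag.
never_oob <- fit$oob.times == 0
table(class = y, never_oob = never_oob)
```

The sample is redrawn for every tree, so each tree sees a different 300 of the 1500 class2 rows. Because every class1 row is in every tree's sample, those rows are never out-of-bag and get no OOB prediction, which is the cost Uwe points to when he says one of the main ideas of randomForest is ignored.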

