Uwe Ligges wrote:


Anirudh Kondaveeti wrote:
To be more clear,

My data set contains two classes: Class 1 and Class 2.
Class 1 has original data with 300 rows
Class 2 is randomly generated data with 1500 rows.

I want to sample a new data set with
Class 1 - all the rows
Class 2 - only 300 rows out of 1500 rows

and then use it in random forest with 500 trees.

Also, Class 2 should contribute a different 300 rows to each tree in the
forest. Thanks!


Ah, in that case (stratified sampling), combine the arguments "strata" and "sampsize". That works in principle, but you cannot select ALL rows of one class that way: doing so would ignore one of the main ideas of random forests, namely bootstrapping the observations, and randomForest will certainly bootstrap for you.

In fact, you can also use replace = FALSE, but then, as I said, one of the main ideas of randomForest is ignored.
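
A minimal sketch of how "strata", "sampsize" and replace = FALSE might be combined for the two-class case described above. The data frame x, the response y and the class labels are made up for illustration and are not the poster's data; keep.inbag = TRUE is added only so that the rows used by each tree can be inspected afterwards.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(42)
# Hypothetical data: 300 "original" rows (class1) and 1500 generated rows (class2)
x <- data.frame(a = c(rnorm(300, mean = 1), rnorm(1500)),
                b = c(rnorm(300, mean = 1), rnorm(1500)))
y <- factor(rep(c("class1", "class2"), times = c(300, 1500)))

fit <- randomForest(x, y, ntree = 500,
                    strata = y,              # stratify the per-tree sample by class
                    sampsize = c(300, 300),  # rows per stratum, in the order of levels(y)
                    replace = FALSE,         # sample without replacement (no bootstrap)
                    keep.inbag = TRUE)       # record which rows each tree was grown on

# fit$inbag has one row per observation and one column per tree
in_tree <- fit$inbag > 0
n_class1 <- colSums(in_tree[y == "class1", ])  # class1 rows used by each tree
n_class2 <- colSums(in_tree[y == "class2", ])  # class2 rows used by each tree
range(n_class1)
range(n_class2)
# Do the first two trees use the same class2 rows?
identical(which(in_tree[, 1] & y == "class2"),
          which(in_tree[, 2] & y == "class2"))
```

Because sampling is without replacement and the class1 stratum has exactly 300 rows, every tree should see all of class1 and a fresh draw of 300 rows from class2. As noted above, this gives up the bootstrap for class1.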

Uwe Ligges








Anirudh Kondaveeti
----------------------------


On Fri, Mar 20, 2009 at 1:45 PM, Anirudh Kondaveeti <
[email protected]> wrote:

sampsize uses the same sample for all the trees in the random Forest.

But I want to use a different sample for each of the 500 trees in the
random forest. Thanks!


Anirudh Kondaveeti
----------------------------


2009/3/20 Uwe Ligges <[email protected]>


Anirudh Kondaveeti wrote:

Hi!

I am dealing with random forest using R.

Is there a way to sample a fixed number of rows from a data set for use with
the different trees in a random forest?
To be more clear, my data set contains 1500 rows, and I am growing 500 trees
in the random forest.
Is it possible to sample only 500 rows from the data set for each tree in
the forest? I mean that each tree of the forest should use a different 500
rows from the data set.


See ?randomForest and the argument sampsize.
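
A minimal sketch of what that might look like, assuming a hypothetical toy data set of 1500 rows (the names x and y are made up for illustration). keep.inbag = TRUE is included only so that one can check which rows each tree was grown on.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
n <- 1500
# Hypothetical toy data standing in for the real 1500-row data set
x <- data.frame(a = rnorm(n), b = rnorm(n))
y <- factor(ifelse(x$a + rnorm(n) > 0, "pos", "neg"))

# sampsize = 500: each of the 500 trees is grown on a sample of 500 rows
fit <- randomForest(x, y, ntree = 500, sampsize = 500, keep.inbag = TRUE)

# fit$inbag has one row per observation and one column per tree;
# entries greater than 0 mark the rows drawn for that tree
rows_tree1 <- which(fit$inbag[, 1] > 0)
rows_tree2 <- which(fit$inbag[, 2] > 0)
identical(rows_tree1, rows_tree2)  # compares the rows used by trees 1 and 2
```

The sample is drawn separately for every tree, so comparing columns of fit$inbag is a quick way to confirm that different trees are grown on different rows.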

Uwe Ligges




Thanks in advance!

Anirudh Kondaveeti
----------------------------


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




