Andy,
Thanks for your email.
I understand that, by default, sampsize uses the response (the behavior
variable that we are classifying) as the strata variable.
Given that, I could set sampsize=c(no=89, yes=11). I implemented that but I got
99% classification error rate on the yes value. When I
If I understand your situation correctly, you may be able to make use of
the strata and sampsize arguments in randomForest() to get bootstrap
samples that resemble the original data distribution. They allow you to
specify stratified samples using the strata variable.
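A minimal sketch of such a call, assuming made-up data: the data frame d, the predictors x1 and x2, and the 89/11 split are illustrative only and are not the poster's actual data. Because the predictors here are random noise, the resulting error rates mean nothing; the point is only how strata and sampsize are passed.

```r
library(randomForest)

set.seed(1)

# Hypothetical toy data: about 89% "no" and 11% "yes" in the response.
n <- 1000
d <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  behavior = factor(
    sample(c("no", "yes"), n, replace = TRUE, prob = c(0.89, 0.11)),
    levels = c("no", "yes")
  )
)

# Each tree is grown on a stratified bootstrap sample:
# 89 rows drawn from the "no" stratum and 11 from the "yes" stratum.
# sampsize entries follow the order of the factor levels of strata.
fit <- randomForest(
  x = d[, c("x1", "x2")],
  y = d$behavior,
  strata = d$behavior,
  sampsize = c(no = 89, yes = 11),
  ntree = 200
)

# Out-of-bag confusion matrix with per-class error in the last column.
print(fit$confusion)
```

Changing the sampsize vector (for example, drawing equal numbers from each stratum) changes how often the rare class is seen by each tree, which is usually what drives the per-class error.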
Best,
Andy
From: Raghu
Folks,
I have a query around weighting in Random Forest (RF). I know that several
earlier emails in this group have raised this issue, but I did not find an
answer to my query.
I am working on a dataset (dataset1) that consists of 4 million records that
can be reduced to a dataset (dataset2) of