Dear Naiara and Andy,
My strategy in cases with unbalanced data is:
tmp <- as.vector(table(factors));
num_clases <- length(tmp);
min_size <- tmp[order(tmp,decreasing=FALSE)[1]];
vector_for_sampsize <- rep(min_size,num_clases);
Then:
randomForest(..., y=factors, sampsize=vector_for_sampsize)
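As a minimal self-contained sketch of the steps above: the data are made up, x, x1 and x2 are hypothetical names, and min() is used as a shorter equivalent of the order() expression. The randomForest call is skipped if the package is not installed.

```r
# Made-up unbalanced response: 100 "A", 30 "B", 10 "C" (illustration only)
set.seed(1)
factors <- factor(rep(c("A", "B", "C"), times = c(100, 30, 10)))

# Hypothetical predictors, pure noise, just so the call has something to fit
x <- data.frame(x1 = rnorm(length(factors)), x2 = rnorm(length(factors)))

# Size of the smallest class, repeated once per class
class_counts <- table(factors)
min_size <- min(class_counts)
vector_for_sampsize <- rep(min_size, nlevels(factors))

# With a sampsize vector of one entry per class, each tree draws
# min_size cases from every class (stratified on y by default)
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(x = x, y = factors,
                                   sampsize = vector_for_sampsize)
  print(rf)
}
```

Note that this assumes every level of factors actually occurs in the data; an empty level would make min_size zero.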
I hope
...
On 10 mar, 17:00, Liaw, Andy [EMAIL PROTECTED] wrote:
Are you sure there are 100 sites in your data? Here's an example:
R> library(randomForest)
randomForest 4.5-23
Type rfNews() to see new features/changes/bug fixes.
R> f <- factor(sample(1:4, nrow(iris), replace=TRUE))
R> rf1
On Behalf Of Naiara Pinto
Sent: Sunday, March 09, 2008 5:19 PM
To: r-help@r-project.org
Subject: [R] sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number ranging between 1-100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling,