Re: [R] sampsize in Random Forests

2008-04-15 Thread Federico
Dear Naiara and Andy, My strategy in cases with unbalanced data is: tmp <- as.vector(table(factors)); num_clases <- length(tmp); min_size <- tmp[order(tmp,decreasing=FALSE)[1]]; vector_for_sampsize <- rep(min_size,num_clases); Then: randomForest(..., y=factors, sampsize=vector_for_sampsize) I hope
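
The balancing idea in the message above can be sketched roughly as follows. This is a minimal illustration, not Federico's exact code: the toy labels in `factors`, the predictors `x1` and `x2`, and the seed are all assumptions made up for the example, and the fit only runs if the randomForest package happens to be installed.

```r
# Toy unbalanced class labels (hypothetical data, not from the thread).
factors <- factor(c(rep("A", 50), rep("B", 30), rep("C", 15), rep("D", 5)))

# Count observations per class and take the size of the rarest class.
class_counts <- table(factors)
min_size <- min(class_counts)

# One entry per class, each equal to the smallest class size, so every
# tree draws the same number of cases from each class.
vector_for_sampsize <- rep(min_size, nlevels(factors))
print(vector_for_sampsize)

# Fit a forest only if the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  # Made-up predictors, purely so the call has something to fit.
  x <- data.frame(x1 = rnorm(length(factors)),
                  x2 = rnorm(length(factors)))
  rf <- randomForest::randomForest(x = x, y = factors,
                                   sampsize = vector_for_sampsize)
  print(rf)
}
```

When `sampsize` is a vector and no `strata` is given, randomForest stratifies on the response for classification, which is why passing `y=factors` is enough here.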

Re: [R] sampsize in Random Forests

2008-04-15 Thread Federico
... On 10 mar, 17:00, Liaw, Andy [EMAIL PROTECTED] wrote: Are you sure there are 100 sites in your data? Here's an example: R> library(randomForest) randomForest 4.5-23 Type rfNews() to see new features/changes/bug fixes. R> f <- factor(sample(1:4, nrow(iris), replace=TRUE)) R> rf1

Re: [R] sampsize in Random Forests

2008-03-10 Thread Liaw, Andy
] On Behalf Of Naiara Pinto Sent: Sunday, March 09, 2008 5:19 PM To: r-help@r-project.org Subject: [R] sampsize in Random Forests Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number

[R] sampsize in Random Forests

2008-03-09 Thread Naiara Pinto
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForests using stratified sampling,
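
For readers landing on this thread, stratifying on a site variable rather than on the class might look something like the sketch below. It assumes the `strata` and `sampsize` arguments of `randomForest()` are used together; the data are invented (10 sites instead of 100, 20 points per site, random classes and predictors), and the choice of 5 draws per site is arbitrary.

```r
set.seed(42)

# Hypothetical data: 10 study sites with 20 points each (the original
# question has sites coded 1-100; fewer are used here to keep it small).
studySites <- factor(rep(1:10, each = 20))
n <- length(studySites)

# Made-up class labels A-D and two made-up predictors.
classes <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

# Draw the same number of points from every site for each tree.
# sampsize needs one entry per stratum and must not exceed the stratum size.
per_site <- 5
site_sampsize <- rep(per_site, nlevels(studySites))

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_sites <- randomForest::randomForest(x = x, y = classes,
                                         strata = studySites,
                                         sampsize = site_sampsize)
  print(rf_sites)
}
```

If some site codes never occur in the data, they should not appear as factor levels, since the length of `sampsize` is matched against the levels of `strata`.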