On 7/7/2005 3:47 PM, Weiwei Shi wrote:
> it works.
> thanks,
>
> but: (just curious)
> why i tried previously and i got
>
> > is.vector(sample.size)
> [1] TRUE
>
> i also tried as.vector(sample.size) and assigned it to sampsz,it still
> does not work.

Sorry, I used "vector" incorrectly. Lists are vectors. What sum needs
is a numeric or complex vector, and lists are vectors of objects, not
vectors of numbers. You should use is.numeric(sample.size) to test
whether you can sum sample.size.

Duncan Murdoch

>
> On 7/7/05, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
>> On 7/7/2005 3:38 PM, Weiwei Shi wrote:
>> > Hi there:
>> > I have a question on random foresst:
>> >
>> > recently i helped a friend with her random forest and i came with this
>> > problem:
>> > her dataset has 6 classes and since the sample size is pretty small:
>> > 264 and the class distr is like this (Diag is the response variable)
>> > sample.size <- lapply(1:6, function(i) sum(Diag==i))
>> > > sample.size
>> > [[1]]
>> > [1] 36
>> >
>> > [[2]]
>> > [1] 12
>> >
>> > [[3]]
>> > [1] 120
>> >
>> > [[4]]
>> > [1] 36
>> >
>> > [[5]]
>> > [1] 30
>> >
>> > [[6]]
>> > [1] 30
>> >
>> > I assigned this sample.size to sampsz for a stratiefied sampling
>> > purpose and i got the following error:
>> > Error in sum(..., na.rm = na.rm) : invalid 'mode' of argument
>> >
>> > if I use sampsz=c(36, 12, 120, 36, 30, 30), then it is fine. Could you
>> > tell me why?
>>
>> The sum() function knows what to do on a vector, but not on a list. You
>> can turn your sample.size variable into a vector using
>>
>> unlist(sample.size)
>>
>> Duncan Murdoch
>>
>> > btw, as to classification problem for this with uneven class number
>> > situation, do u have some suggestions to improve its accuracy? I
>> > tried to use c() way to make the sampsz works but the result is
>> > similar.
>> >
>> > Thanks,
>> >
>> > weiwei
>> >
>> > On 6/30/05, Liaw, Andy <[EMAIL PROTECTED]> wrote:
>> >> The limitation comes from the way categorical splits are represented
>> >> in the code: For a categorical variable with k categories, the split
>> >> is represented by k binary digits: 0=right, 1=left. So it takes k
>> >> bits to store each split on k categories.
>> >> To save storage, this is `packed' into a 4-byte integer (32-bit),
>> >> thus the limit of 32 categories.
>> >>
>> >> The current Fortran code (version 5.x) by Breiman and Cutler gets around
>> >> this limitation by storing the split in an integer array. While this
>> >> lifts the 32-category limit, it takes much more memory to store the
>> >> splits. I'm still trying to figure out a more memory efficient way of
>> >> storing the splits without imposing the 32-category limit. If anyone
>> >> has suggestions, I'm all ears.
>> >>
>> >> Best,
>> >> Andy
>> >>
>> >> > From: [EMAIL PROTECTED]
>> >> >
>> >> > Hello,
>> >> >
>> >> > I'm using the random forest package. One of my factors in the
>> >> > data set contains 41 levels (I can't code this as a numeric
>> >> > value - in terms of linear models this would be a random
>> >> > factor). The randomForest call comes back with an error
>> >> > telling me that the limit is 32 categories.
>> >> >
>> >> > Is there any reason for this particular limit? Maybe it's
>> >> > possible to recompile the module with a different cutoff?
>> >> >
>> >> > thanks a lot for your help,
>> >> > kind regards,
>> >> >
>> >> >
>> >> > Arne
>> >> >
>> >> > ______________________________________________
>> >> > [email protected] mailing list
>> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> > PLEASE do read the posting guide!
>> >> > http://www.R-project.org/posting-guide.html
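
For anyone finding this thread later, here is a minimal sketch of the list-versus-numeric-vector point discussed above. It is only an illustration: Diag is rebuilt here as a toy vector from the class counts quoted in the thread (an assumption, since the real data were never posted), and sampsz2 is a name made up for the example.

```r
# Toy stand-in for the response variable; the counts are the ones quoted
# in the thread, the data themselves are an assumption.
Diag <- rep(1:6, times = c(36, 12, 120, 36, 30, 30))

# lapply() always returns a list, one element per class.
sample.size <- lapply(1:6, function(i) sum(Diag == i))

is.vector(sample.size)            # TRUE: a list counts as a vector
is.numeric(sample.size)           # FALSE: which is why sum() rejects it

# as.vector() leaves a list as a list, so it does not help here.
is.list(as.vector(sample.size))   # TRUE

# Two ways to get a plain numeric vector of class counts.
sampsz  <- unlist(sample.size)
sampsz2 <- sapply(1:6, function(i) sum(Diag == i))

is.numeric(sampsz)                # TRUE
sum(sampsz)                       # 264
```

Using sapply() (or table(Diag)) in the first place avoids creating the list at all, so no unlist() step is needed afterwards.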

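As a rough illustration of the 32-category limit Andy describes in the quoted message, the sketch below packs a left/right assignment for k categories into a single 32-bit integer, one bit per category (0 = right, 1 = left). This is not the randomForest package's own code; pack_split and unpack_split are hypothetical helper names written in plain R just to show the idea.

```r
# Hypothetical helpers, not part of randomForest: one bit per category,
# 1 = category goes left, 0 = category goes right.
pack_split <- function(goes_left) {
  # A single 32-bit integer has room for at most 32 categories.
  stopifnot(is.logical(goes_left), length(goes_left) <= 32)
  bits <- logical(32)
  bits[seq_along(goes_left)] <- goes_left
  packBits(bits, type = "integer")
}

unpack_split <- function(code, k) {
  # Read back the first k bits, least significant bit first.
  as.integer(intToBits(code))[seq_len(k)] == 1L
}

# Five categories: 1, 3 and 4 go left, 2 and 5 go right.
goes_left <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
code <- pack_split(goes_left)

code                    # 13, i.e. bits 1 + 4 + 8
unpack_split(code, 5)   # TRUE FALSE TRUE TRUE FALSE
```

A factor with 41 levels would need 41 bits per split, which does not fit in one such integer; storing the split in an integer array instead removes the limit at the cost of more memory, as the quoted message says.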