> x <- matrix(c(1:4))
> quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 1.75 2.50 3.25 4.00

> x <- matrix(c(1:6))
> quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 2.25 3.50 4.75 6.00

> x <- matrix(c(1:8))
> quantile(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
1.00 2.75 4.50 6.25 8.00

With your implicit definition of quantiles (splitting the data set into classes of equal size), each of the four classes should contain n/4 observations (1, 1.5, and 2 observations for the three examples above), so that the quantiles should be

> x <- matrix(c(1:4))
> equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 1.50 2.50 3.50 +Inf

> x <- matrix(c(1:6))
> equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 2.00 3.50 5.00 +Inf

> x <- matrix(c(1:8))
> equalSizeClasses(x,c(0,.25,.5,.75,1))
  0%  25%  50%  75% 100%
-Inf 2.50 4.50 6.50 +Inf

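The breakpoints listed above can be reproduced with a short sketch. Note that equalSizeBreaks() below is a hypothetical helper written for illustration; it is not part of base R and not necessarily how equalSizeClasses() was computed. It assumes a version of R whose quantile() accepts the type argument: type = 2 averages the two neighbouring order statistics when n*p is an integer and takes the next order statistic otherwise. The outer breakpoints are then replaced by -Inf and +Inf.

```r
# Hypothetical helper (not part of base R): breakpoints intended to split
# x into classes of equal size, with open-ended outer classes.
equalSizeBreaks <- function(x, probs) {
  # type = 2: inverse empirical CDF with averaging at discontinuities
  q <- quantile(as.vector(x), probs, type = 2)
  q[probs <= 0] <- -Inf   # lowest class is open to the left
  q[probs >= 1] <- Inf    # highest class is open to the right
  q
}

probs <- c(0, .25, .5, .75, 1)
for (n in c(4, 6, 8)) {
  x <- matrix(1:n)
  b <- equalSizeBreaks(x, probs)
  print(b)
  # Class counts are equal only when n is a multiple of the number of
  # classes and the data contain no ties (here: n = 4 and n = 8).
  print(table(cut(as.vector(x), breaks = b)))
}
```

For n = 6 a breakpoint coincides with an observation (2 and 5), so cut() has to assign that observation to one side and the counts come out as 2, 1, 2, 1. With tied data, such as the ages in the original question, no set of breakpoints can guarantee equal counts.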
Knut
At 09:30 2004-02-06 -0600, Giovanni Petris wrote:
I am trying to `cut' a continuous variable into contiguous classes containing approximately an equal number of observations. I thought quantile() was the appropriate function to use in order to find the breakpoints, but I end up with classes of different sizes - see example below. Does anybody have an explanation for that? And what is the `recommended' way of computing what I am looking for?
Example:
> ca$age
[1] 28 42 46 45 34 44 48 45 38 45 49 45 41 46 49 46 44 48 52 48 45 50 53 57 46
[26] 52 54 57 47 52 55 59 50 54 57 60 51 55 46 63 51 59 48 35 53 59 57 37 55 32
[51] 60 43 59 37 30 47 60 38 34 48 32 38 36 49 33 42 38 58 35 43 39 59 39 43 42
[76] 60 40 44
> table(cut(ca$age,breaks=c(-Inf,quantile(ca$age, seq(0,1,length=11)[-1]))))
 (-Inf,35]  (35,38.4]  (38.4,43]    (43,45]  (45,46.5]  (46.5,49]    (49,52]    (52,55]
         9          7         10          8          5         10          7          7
   (55,59]    (59,63]
        10          5
Thanks in advance, Giovanni
--
Giovanni Petris  [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Knut M. Wittkowski, PhD,DSc
------------------------------------------
The Rockefeller University, GCRC
Experimental Design and Biostatistics
1230 York Ave #121B, Box 322, NY,NY 10021
+1(212)327-7175, +1(212)327-8450 (Fax)
[EMAIL PROTECTED]
http://www.rucares.org/clinicalresearch/dept/biometry/
