[R] get the percentage rank of a value based on an empirical data vector
Hi,

I have a vector with values:

    x <- rnorm(1000, 5, 2)

and one single value:

    y <- 6.2

Now I would like to know the percent rank of y based on the 'population' vector x. Is there a convenient function that calculates the percent rank of y for the given vector x?

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] get the percentage rank of a value based on an empirical data vector
On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:

> Is there a convenient function that calculates the percent rank of y for the given vector x?

Two options:

1) Sort x and use findInterval, divide the index by length(x), and multiply by 100. (It can all be done as a one-liner.)

2) I generally reach for the `ecdf` function-making machine when I see sample quantile problems, and see if I can cast the problem in terms for which it applies.

For my random draw I get:

    > findInterval(6.2, sort(x))
    [1] 704
    > xecdf <- ecdf(x)
    > xecdf(6.2)
    [1] 0.704

-- 
David Winsemius, MD
West Hartford, CT
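For reference, a minimal sketch of the two options written out as one-liners, assuming the x and y from the original question. The seed and the names pct_findInterval and pct_ecdf are illustrative and not from the thread, so the numbers will differ from the 704 / 0.704 quoted above.

```r
set.seed(42)                 # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000, 5, 2)
y <- 6.2

# Option 1: findInterval() on the sorted sample returns how many values are <= y;
# dividing by the sample size and scaling gives a percentage.
pct_findInterval <- 100 * findInterval(y, sort(x)) / length(x)

# Option 2: ecdf(x) returns a function; evaluating it at y gives the
# proportion of the sample that is <= y.
pct_ecdf <- 100 * ecdf(x)(y)

# Both routes should agree
stopifnot(isTRUE(all.equal(pct_findInterval, pct_ecdf)))
```

The ecdf route is probably the more convenient one when several values need to be ranked against the same x, since the returned function can be reused and accepts a vector of query points.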
Re: [R] get the percentage rank of a value based on an empirical data vector
If performance is an issue, I think mean(x < y) will be as quick as it can be done in R alone. (You could do it in C in a single pass if needed, which might be a good first exercise in using compiled code.)

Michael

On Jan 11, 2012, at 8:58 AM, David Winsemius dwinsem...@comcast.net wrote:

> xecdf <- ecdf(x)
> xecdf(6.2)
> [1] 0.704
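A short sketch of the comparison-and-mean approach, assuming the same x and y as in the original question (the seed and the names p_strict, p_weak and x_ties are illustrative only). It also shows where a strict and a non-strict comparison would differ, which only matters if the data can contain values exactly equal to y.

```r
set.seed(1)                # arbitrary seed for reproducibility
x <- rnorm(1000, 5, 2)
y <- 6.2

p_strict <- mean(x < y)    # proportion of x strictly below y
p_weak   <- mean(x <= y)   # proportion of x at or below y, the ecdf convention

# For continuous draws no element of x is expected to equal y exactly,
# so both versions should match each other and ecdf(x)(y).
stopifnot(p_strict == p_weak, isTRUE(all.equal(p_weak, ecdf(x)(y))))

# With tied values the two definitions give different answers
x_ties <- c(1, 2, 2, 3)
mean(x_ties < 2)    # 0.25
mean(x_ties <= 2)   # 0.75
```

Multiply either proportion by 100 to get a percent rank.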
Re: [R] get the percentage rank of a value based on an empirical data vector
On 11.01.2012, at 14:58, David Winsemius wrote:

> findInterval(6.2, sort(x))
> [1] 704
> xecdf <- ecdf(x)
> xecdf(6.2)
> [1] 0.704

Thanks, that helped a lot!