[R] get the percentage rank of a value based on an empirical data vector

2012-01-11 Thread Martin Batholdy
Hi,

I have a vector with values:

x <- rnorm(1000, 5, 2)


and one single value:
y <- 6.2

now I would like to know the percent rank of y based on the 'population'-vector 
x.
Is there a convenient function that calculates the percent rank of a y for the 
given vector x?


thanks!
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] get the percentage rank of a value based on an empirical data vector

2012-01-11 Thread David Winsemius


On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:


> Hi,
>
> I have a vector with values:
>
> x <- rnorm(1000, 5, 2)
>
> and one single value:
> y <- 6.2
>
> now I would like to know the percent rank of y based on the
> 'population'-vector x.
> Is there a convenient function that calculates the percent rank of a
> y for the given vector x?


Two options:
1) Sort x and use findInterval, then divide the index by length(x) and
multiply by 100.

(It can all be done as a one-liner.)

2) I generally reach for `ecdf`, which returns a function, when I see
sample quantile problems, and check whether I can cast the problem in
terms to which it applies.


For my random draw I get:
> findInterval(6.2, sort(x))
[1] 704
> xecdf <- ecdf(x)
> xecdf(6.2)
[1] 0.704

--
David Winsemius, MD
West Hartford, CT
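
A minimal sketch of the two options above, written out as one-liners that return a percentage. The seed is an arbitrary assumption added only for reproducibility, and the names pct_fi and pct_ecdf are illustrative; neither appears in the original posts, and a different random draw will give a different rank than the 704 shown above.

```r
set.seed(42)                 # arbitrary seed (assumption), only for reproducibility
x <- rnorm(1000, 5, 2)       # the 'population' vector from the question
y <- 6.2                     # the single value to rank

# Option 1: findInterval() on the sorted vector counts how many values
# are <= y; dividing by length(x) and scaling gives a percentage.
pct_fi <- 100 * findInterval(y, sort(x)) / length(x)

# Option 2: ecdf() returns a function; evaluating it at y gives the
# proportion of values <= y.
pct_ecdf <- 100 * ecdf(x)(y)

c(findInterval = pct_fi, ecdf = pct_ecdf)
```

Both expressions count the values at or below y, so they should agree for the same x.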



Re: [R] get the percentage rank of a value based on an empirical data vector

2012-01-11 Thread R. Michael Weylandt michael.weyla...@gmail.com
If performance is an issue, I think mean(x < y) will be as quick as it can be
done in R alone. (You could do it in C in a single pass if needed, which might
be a good first exercise in using compiled code.)

Michael
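
A short sketch of the mean() approach, assuming the same simulated data as in the question (the seed and the names pct_strict, pct_weak and x_ties are illustrative, not from the thread). It also shows where the strict and non-strict comparisons differ: for continuous draws they almost always coincide, but with ties only `<=` matches the definition used by ecdf.

```r
set.seed(42)                       # arbitrary seed (assumption), only for reproducibility
x <- rnorm(1000, 5, 2)
y <- 6.2

pct_strict <- 100 * mean(x < y)    # share of values strictly below y
pct_weak   <- 100 * mean(x <= y)   # share of values at or below y, as ecdf(x)(y) computes

# With tied values the two definitions give different answers:
x_ties <- c(1, 2, 2, 3)
mean(x_ties < 2)     # 0.25
mean(x_ties <= 2)    # 0.75, same as ecdf(x_ties)(2)
```

Which comparison to use depends on whether values equal to y should count toward its rank.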



Re: [R] get the percentage rank of a value based on an empirical data vector

2012-01-11 Thread Martin Batholdy
> > findInterval(6.2, sort(x))
> [1] 704
> > xecdf <- ecdf(x)
> > xecdf(6.2)
> [1] 0.704


Thanks, that helped a lot!



