Duncan,

I'm not sure how I missed your message. Sorry. What you describe is what I
do when (n-1)p is an integer, so that R computes the sample quantile from a
single order statistic. My later posting has that exact binomial expression
in there as a special case. When (n-1)p is not an integer, R uses a weighted
average of 2 order statistics, and in that case I'm left with my standing
problem...
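
To make that concrete, here is a rough sketch of the exact calculation I'm
after for R's default type-7 quantile of an iid normal sample, using nested
calls to integrate() in place of my Halton draws. The function name
pquantile_exceed and its defaults are just illustrative, not the code from
my script:

## Sketch: P(Q > t), where Q is R's type-7 sample quantile at probability p
## computed from an iid N(mu, sigma) sample of size n.
pquantile_exceed <- function(t, p, n, mu = 0, sigma = 1) {
  h <- (n - 1) * p + 1
  j <- floor(h)
  g <- h - j                        # weight on the (j+1)-th order statistic
  if (g == 0)                       # single order statistic: binomial case
    return(pbinom(j - 1, n, pnorm(t, mu, sigma)))
  ## joint density of the consecutive order statistics X(j) <= X(j+1)
  lconst <- lgamma(n + 1) - lgamma(j) - lgamma(n - j)
  joint <- function(u, v)
    exp(lconst + (j - 1) * pnorm(u, mu, sigma, log.p = TRUE) +
        (n - j - 1) * pnorm(v, mu, sigma, lower.tail = FALSE, log.p = TRUE) +
        dnorm(u, mu, sigma, log = TRUE) + dnorm(v, mu, sigma, log = TRUE))
  ## for each u, integrate v over the region (1 - g)*u + g*v > t, v >= u
  inner <- function(u)
    sapply(u, function(ui) {
      lo <- max(ui, (t - (1 - g) * ui) / g)
      integrate(function(v) joint(ui, v), lo, Inf)$value
    })
  integrate(inner, -Inf, Inf)$value
}

The g == 0 branch is exactly the binomial special case mentioned above.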

On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 14/02/2011 9:58 AM, Bentley Coffey wrote:
>
>> I need to calculate the probability that a sample quantile will exceed a
>> threshold, given the size of the iid sample and the parameters describing
>> the distribution of each observation (normal, in my case). I can compute
>> the probability with brute-force simulation: simulate a sample of size N,
>> apply R's quantile() function to it, compare the result to the threshold,
>> replicate this MANY times, and count the number of times the sample
>> quantile exceeded the threshold (dividing by the total number of
>> replications yields the probability of interest). The problem is that the
>> number of replications required to get sufficient precision (3 digits,
>> say) is so HUGE that this takes FOREVER. I have to perform this task so
>> often in my script (searching over the sample size and repeating for
>> several different distribution parameters) that it takes too many hours
>> to run.
>>
>> I've searched for pre-existing code to do this in R and haven't found
>> anything. Perhaps I'm missing something. Is anyone aware of an R function
>> to compute this probability?
>>
>> I've tried writing my own code using the fact that R's quantile() function
>> is a linear combination of 2 order statistics. Basically, I wrote down the
>> mathematical form of the joint pdf of the 2 order statistics (a function
>> of the sample size and the distribution parameters) and then performed a
>> quasi-Monte Carlo integration (i.e. using Halton draws rather than R's
>> pseudo-random draws) over the region where the sample quantile exceeds
>> the threshold. In theory this should work, and it takes about 1000 times
>> fewer clock cycles to compute than the brute-force approach. My problem
>> is that there is a significant discrepancy between the results from brute
>> force and from this more efficient approach that I have coded up. I
>> believe the problem is numerical error, but it could be a programming
>> bug; regardless, I have been unable to locate the source of the problem
>> and have spent over 20 hours trying to identify it this weekend. Please,
>> somebody help!!!
>>
>> So, again, my question: is there code in R for quickly evaluating the CDF
>> of a sample quantile, given the sample size and the parameters governing
>> the distribution of each iid point in the sample?
>>
>
> I think the answer to your question is no, but I think it's the wrong
> question.  Suppose Xm:n is the mth sample quantile in a sample of size n,
> and you want to calculate P(Xm:n > x).  Let X be a draw from the underlying
> distribution, and suppose P(X > x) = p.  Then the event Xm:n > x
> is the same as the event that out of n independent draws of X, at least
> n-m+1 are bigger than x:  a binomial probability.  And R can calculate
> binomial probabilities using pbinom().
>
> Duncan Murdoch
>
>
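
For reference, Duncan's binomial calculation for the single-order-statistic
case is a one-liner; n, m, and x below are just illustrative values for a
standard normal sample:

n <- 20; m <- 15; x <- 0.5
## P(X(m) > x) = P(at most m - 1 of the n draws are <= x)
pbinom(m - 1, n, pnorm(x))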
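
And a brute-force check along the lines described in the original message
might look roughly like this (again only a sketch; brute_force and reps are
illustrative names):

## Monte Carlo estimate of P(type-7 sample quantile > t) for N(mu, sigma) data
brute_force <- function(t, p, n, mu = 0, sigma = 1, reps = 1e5) {
  q <- replicate(reps, quantile(rnorm(n, mu, sigma), probs = p, type = 7))
  mean(q > t)
}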
