Re: [R] cdf in R give probability of random variable

2016-10-22 Thread Martin Maechler
> peter dalgaard 
> on Fri, 21 Oct 2016 22:10:22 +0200 writes:

>> On 21 Oct 2016, at 20:23 , peter dalgaard
>>  wrote:
>> 
>> In both cases it works out nicer if you do
>> 
>> names(px) <- x
>> barplot(px)

> Um, unless of course you want the cdf as a step function,
> in which case check the help page for plot for possible
> values of the type= argument.

and if it is not homework, (or even then ;-)  just use

  plot(ecdf(x))


Martin Maechler, ETH Zurich

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf in R give probability of random variable

2016-10-21 Thread peter dalgaard

> On 21 Oct 2016, at 20:23 , peter dalgaard  wrote:
> 
> In both cases it works out nicer if you do
> 
> names(px) <- x
> barplot(px)
> 

Um, unless of course you want the cdf as a step function, in which case check 
the help page for plot for possible values of the type= argument.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf in R give probability of random variable

2016-10-21 Thread peter dalgaard
This looks like homework, but doing it in R might not be, so I'll give you the 
benefit of doubt...


> On 21 Oct 2016, at 18:30 , Ragia .  wrote:
> 
> 
> Dear Grooup
> 
> kindly
> 
> how can I plot  these  graphs in R..
> 
> 
> Suppose that X is a discrete random variable with P(X = 0) = .25, P(X = 1) =
> .125, P(X = 2) = .125, and P(X = 3) = .5. Graph the frequency function and
> the cumulative distribution function of X.
> 
> 
> 
> my solution was:
> 
> x=c(0,1,3)
> px=c(.25,.125,.5)
> plot( x,px ,type="h"  ) # to plot the frequency  , is it correct
> 

You forgot X=2 in there (check the sum). 

> how to plot the cdf?
> 

cumsum() is your friend.

In both cases it works out nicer if you do

names(px) <- x
barplot(px)

-pd

>   [[alternative HTML version deleted]]

Notice that plain text posting is strongly preferred in here. HTML messes up 
code badly sometimes.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cdf in R give probability of random variable

2016-10-21 Thread Ragia .

Dear Grooup

kindly

how can I plot  these  graphs in R..


Suppose that X is a discrete random variable with P(X = 0) = .25, P(X = 1) =
.125, P(X = 2) = .125, and P(X = 3) = .5. Graph the frequency function and
the cumulative distribution function of X.



my solution was:

x=c(0,1,3)
px=c(.25,.125,.5)
plot( x,px ,type="h"  ) # to plot the frequency  , is it correct

 how to plot the cdf?



thanks in advance


Ragia



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CDF of Sample Quantile

2011-03-08 Thread peter dalgaard

On Mar 7, 2011, at 19:12 , Bentley Coffey wrote:

 Just to tie up this thread, I wanted to report my solution:
 
 When (n-1)p is an integer, there is a closed form solution:
 pbinom(j-1,n,...)
 
 When it is not an integer, its fairly easy to approximate the solution by
 interpolating between the closed-form solutions: fitting log(1 - probability
 from closed form solution) on an orthogonal polynomial in n.  This is a
 _very_ fast and fairly accurate solution.
 
 Thanks to all who offered their help...


If you have too much time on your hand, Wikipedia has the joint density of two 
order statistics, from which you could probably proceed to find the marginal 
density of a linear combination of two neighboring order stats. Just take a 
large piece of paper and a couple of spare days  Numerical integration 
might do the job, with some care.

-p


 
 On Thu, Feb 17, 2011 at 11:11 PM, Bentley Coffey
 bentleygcof...@gmail.comwrote:
 
 
 Duncan,
 
 I'm not sure how I missed your message. Sorry. What you describe is what I
 do when (n-1)p is an integer so that R computes the sample quantile using a
 single order statistic. My later posting has that exact binomial expression
 in there as a special case. When (n-1)p is not an integer then R uses a
 weighted average of 2 order statistics, in which case I'm left with my
 standing problem...
 
 
 On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch 
 murdoch.dun...@gmail.comwrote:
 
 On 14/02/2011 9:58 AM, Bentley Coffey wrote:
 
 I need to calculate the probability that a sample quantile will exceed a
 threshold given the size of the iid sample and the parameters describing
 the
 distribution of each observation (normal, in my case). I can compute the
 probability with brute force simulation: simulate a size N sample, apply
 R's
 quantile() function on it, compare it to the threshold, replicate this
 MANY
 times, and count the number of times the sample quantile exceeded the
 threshold (dividing by the total number of replications yields the
 probability of interest). The problem is that the number of replications
 required to get sufficient precision (3 digits say) is so HUGE that this
 takes FOREVER. I have to perform this task so much in my script
 (searching
 over the sample size and repeated for several different distribution
 parameters) that it takes too many hours to run.
 
 I've searched for pre-existing code to do this in R and haven't found
 anything. Perhaps I'm missing something. Is anyone aware of an R function
 to
 compute this probability?
 
 I've tried writing my own code using the fact that R's quantile()
 function
 is a linear combination of 2 order statistics. Basically, I wrote down
 the
 mathematical form of the joint pdf for the 2 order statistics (a function
 of
 the sample size and the distribution parameters) then performed a
 pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
 random draws) over the region where the sample quantile exceeds the
 threshold. In theory, this should work and it takes about 1000 times
 fewer
 clock cycles to compute than the Brute Force approach. My problem is that
 there is a significant discrepancy between the results using Brute Force
 and
 using this more efficient approach that I have coded up. I believe that
 the
 problem is numerical error but it could be some programming bug;
 regardless,
 I have been unable to locate the source of this problem and have spent
 over
 20 hours trying to identify it this weekend. Please, somebody help!!!
 
 So, again, my question: is there code in R for quickly evaluating the CDF
 of
 a Sample Quantile given the sample size and the parameters governing the
 distribution of each iid point in the sample?
 
 
 I think the answer to your question is no, but I think it's the wrong
 question.  Suppose Xm:n is the mth sample quantile in a sample of size n,
 and you want to calculate P(Xm:n  x).  Let X be a draw from the underlying
 distribution, and suppose P(X  x) = p.  Then the event Xm:n  x
 is the same as the event that out of n independent draws of X, at least
 n-m+1 are bigger than x:  a binomial probability.  And R can calculate
 binomial probabilities using pbinom().
 
 Duncan Murdoch
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CDF of Sample Quantile

2011-03-07 Thread Bentley Coffey
Just to tie up this thread, I wanted to report my solution:

When (n-1)p is an integer, there is a closed form solution:
pbinom(j-1,n,...)

When it is not an integer, its fairly easy to approximate the solution by
interpolating between the closed-form solutions: fitting log(1 - probability
from closed form solution) on an orthogonal polynomial in n.  This is a
_very_ fast and fairly accurate solution.

Thanks to all who offered their help...

On Thu, Feb 17, 2011 at 11:11 PM, Bentley Coffey
bentleygcof...@gmail.comwrote:


 Duncan,

 I'm not sure how I missed your message. Sorry. What you describe is what I
 do when (n-1)p is an integer so that R computes the sample quantile using a
 single order statistic. My later posting has that exact binomial expression
 in there as a special case. When (n-1)p is not an integer then R uses a
 weighted average of 2 order statistics, in which case I'm left with my
 standing problem...


 On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch 
 murdoch.dun...@gmail.comwrote:

 On 14/02/2011 9:58 AM, Bentley Coffey wrote:

 I need to calculate the probability that a sample quantile will exceed a
 threshold given the size of the iid sample and the parameters describing
 the
 distribution of each observation (normal, in my case). I can compute the
 probability with brute force simulation: simulate a size N sample, apply
 R's
 quantile() function on it, compare it to the threshold, replicate this
 MANY
 times, and count the number of times the sample quantile exceeded the
 threshold (dividing by the total number of replications yields the
 probability of interest). The problem is that the number of replications
 required to get sufficient precision (3 digits say) is so HUGE that this
 takes FOREVER. I have to perform this task so much in my script
 (searching
 over the sample size and repeated for several different distribution
 parameters) that it takes too many hours to run.

 I've searched for pre-existing code to do this in R and haven't found
 anything. Perhaps I'm missing something. Is anyone aware of an R function
 to
 compute this probability?

 I've tried writing my own code using the fact that R's quantile()
 function
 is a linear combination of 2 order statistics. Basically, I wrote down
 the
 mathematical form of the joint pdf for the 2 order statistics (a function
 of
 the sample size and the distribution parameters) then performed a
 pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
 random draws) over the region where the sample quantile exceeds the
 threshold. In theory, this should work and it takes about 1000 times
 fewer
 clock cycles to compute than the Brute Force approach. My problem is that
 there is a significant discrepancy between the results using Brute Force
 and
 using this more efficient approach that I have coded up. I believe that
 the
 problem is numerical error but it could be some programming bug;
 regardless,
 I have been unable to locate the source of this problem and have spent
 over
 20 hours trying to identify it this weekend. Please, somebody help!!!

 So, again, my question: is there code in R for quickly evaluating the CDF
 of
 a Sample Quantile given the sample size and the parameters governing the
 distribution of each iid point in the sample?


 I think the answer to your question is no, but I think it's the wrong
 question.  Suppose Xm:n is the mth sample quantile in a sample of size n,
 and you want to calculate P(Xm:n  x).  Let X be a draw from the underlying
 distribution, and suppose P(X  x) = p.  Then the event Xm:n  x
 is the same as the event that out of n independent draws of X, at least
 n-m+1 are bigger than x:  a binomial probability.  And R can calculate
 binomial probabilities using pbinom().

 Duncan Murdoch




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CDF of Sample Quantile

2011-02-17 Thread Bentley Coffey
Duncan,

I'm not sure how I missed your message. Sorry. What you describe is what I
do when (n-1)p is an integer so that R computes the sample quantile using a
single order statistic. My later posting has that exact binomial expression
in there as a special case. When (n-1)p is not an integer then R uses a
weighted average of 2 order statistics, in which case I'm left with my
standing problem...

On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 14/02/2011 9:58 AM, Bentley Coffey wrote:

 I need to calculate the probability that a sample quantile will exceed a
 threshold given the size of the iid sample and the parameters describing
 the
 distribution of each observation (normal, in my case). I can compute the
 probability with brute force simulation: simulate a size N sample, apply
 R's
 quantile() function on it, compare it to the threshold, replicate this
 MANY
 times, and count the number of times the sample quantile exceeded the
 threshold (dividing by the total number of replications yields the
 probability of interest). The problem is that the number of replications
 required to get sufficient precision (3 digits say) is so HUGE that this
 takes FOREVER. I have to perform this task so much in my script (searching
 over the sample size and repeated for several different distribution
 parameters) that it takes too many hours to run.

 I've searched for pre-existing code to do this in R and haven't found
 anything. Perhaps I'm missing something. Is anyone aware of an R function
 to
 compute this probability?

 I've tried writing my own code using the fact that R's quantile() function
 is a linear combination of 2 order statistics. Basically, I wrote down the
 mathematical form of the joint pdf for the 2 order statistics (a function
 of
 the sample size and the distribution parameters) then performed a
 pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
 random draws) over the region where the sample quantile exceeds the
 threshold. In theory, this should work and it takes about 1000 times fewer
 clock cycles to compute than the Brute Force approach. My problem is that
 there is a significant discrepancy between the results using Brute Force
 and
 using this more efficient approach that I have coded up. I believe that
 the
 problem is numerical error but it could be some programming bug;
 regardless,
 I have been unable to locate the source of this problem and have spent
 over
 20 hours trying to identify it this weekend. Please, somebody help!!!

 So, again, my question: is there code in R for quickly evaluating the CDF
 of
 a Sample Quantile given the sample size and the parameters governing the
 distribution of each iid point in the sample?


 I think the answer to your question is no, but I think it's the wrong
 question.  Suppose Xm:n is the mth sample quantile in a sample of size n,
 and you want to calculate P(Xm:n  x).  Let X be a draw from the underlying
 distribution, and suppose P(X  x) = p.  Then the event Xm:n  x
 is the same as the event that out of n independent draws of X, at least
 n-m+1 are bigger than x:  a binomial probability.  And R can calculate
 binomial probabilities using pbinom().

 Duncan Murdoch



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CDF of Sample Quantile

2011-02-14 Thread Bentley Coffey
I need to calculate the probability that a sample quantile will exceed a
threshold given the size of the iid sample and the parameters describing the
distribution of each observation (normal, in my case). I can compute the
probability with brute force simulation: simulate a size N sample, apply R's
quantile() function on it, compare it to the threshold, replicate this MANY
times, and count the number of times the sample quantile exceeded the
threshold (dividing by the total number of replications yields the
probability of interest). The problem is that the number of replications
required to get sufficient precision (3 digits say) is so HUGE that this
takes FOREVER. I have to perform this task so much in my script (searching
over the sample size and repeated for several different distribution
parameters) that it takes too many hours to run.

I've searched for pre-existing code to do this in R and haven't found
anything. Perhaps I'm missing something. Is anyone aware of an R function to
compute this probability?

I've tried writing my own code using the fact that R's quantile() function
is a linear combination of 2 order statistics. Basically, I wrote down the
mathematical form of the joint pdf for the 2 order statistics (a function of
the sample size and the distribution parameters) then performed a
pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
random draws) over the region where the sample quantile exceeds the
threshold. In theory, this should work and it takes about 1000 times fewer
clock cycles to compute than the Brute Force approach. My problem is that
there is a significant discrepancy between the results using Brute Force and
using this more efficient approach that I have coded up. I believe that the
problem is numerical error but it could be some programming bug; regardless,
I have been unable to locate the source of this problem and have spent over
20 hours trying to identify it this weekend. Please, somebody help!!!

So, again, my question: is there code in R for quickly evaluating the CDF of
a Sample Quantile given the sample size and the parameters governing the
distribution of each iid point in the sample?

Grateful for any help,

Bentley

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CDF of Sample Quantile

2011-02-14 Thread Jonathan P Daily
If I understand this, you have a value x, or a vector of values x, and you 
want to know the CDF that this value is drawn from a normal distribution?

I assume you are drawing from rnorm for your simulations, so look at the 
other functions listed when you ?rnorm.

HTH
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 02/14/2011 09:58:09 AM:

 [image removed] 
 
 [R] CDF of Sample Quantile
 
 Bentley Coffey 
 
 to:
 
 r-help
 
 02/14/2011 01:58 PM
 
 Sent by:
 
 r-help-boun...@r-project.org
 
 I need to calculate the probability that a sample quantile will exceed a
 threshold given the size of the iid sample and the parameters describing 
the
 distribution of each observation (normal, in my case). I can compute the
 probability with brute force simulation: simulate a size N sample, apply 
R's
 quantile() function on it, compare it to the threshold, replicate this 
MANY
 times, and count the number of times the sample quantile exceeded the
 threshold (dividing by the total number of replications yields the
 probability of interest). The problem is that the number of replications
 required to get sufficient precision (3 digits say) is so HUGE that this
 takes FOREVER. I have to perform this task so much in my script 
(searching
 over the sample size and repeated for several different distribution
 parameters) that it takes too many hours to run.
 
 I've searched for pre-existing code to do this in R and haven't found
 anything. Perhaps I'm missing something. Is anyone aware of an R 
function to
 compute this probability?
 
 I've tried writing my own code using the fact that R's quantile() 
function
 is a linear combination of 2 order statistics. Basically, I wrote down 
the
 mathematical form of the joint pdf for the 2 order statistics (a 
function of
 the sample size and the distribution parameters) then performed a
 pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
 random draws) over the region where the sample quantile exceeds the
 threshold. In theory, this should work and it takes about 1000 times 
fewer
 clock cycles to compute than the Brute Force approach. My problem is 
that
 there is a significant discrepancy between the results using Brute Force 
and
 using this more efficient approach that I have coded up. I believe that 
the
 problem is numerical error but it could be some programming bug; 
regardless,
 I have been unable to locate the source of this problem and have spent 
over
 20 hours trying to identify it this weekend. Please, somebody help!!!
 
 So, again, my question: is there code in R for quickly evaluating the 
CDF of
 a Sample Quantile given the sample size and the parameters governing the
 distribution of each iid point in the sample?
 
 Grateful for any help,
 
 Bentley
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CDF of Sample Quantile

2011-02-14 Thread Duncan Murdoch

On 14/02/2011 9:58 AM, Bentley Coffey wrote:

I need to calculate the probability that a sample quantile will exceed a
threshold given the size of the iid sample and the parameters describing the
distribution of each observation (normal, in my case). I can compute the
probability with brute force simulation: simulate a size N sample, apply R's
quantile() function on it, compare it to the threshold, replicate this MANY
times, and count the number of times the sample quantile exceeded the
threshold (dividing by the total number of replications yields the
probability of interest). The problem is that the number of replications
required to get sufficient precision (3 digits say) is so HUGE that this
takes FOREVER. I have to perform this task so much in my script (searching
over the sample size and repeated for several different distribution
parameters) that it takes too many hours to run.

I've searched for pre-existing code to do this in R and haven't found
anything. Perhaps I'm missing something. Is anyone aware of an R function to
compute this probability?

I've tried writing my own code using the fact that R's quantile() function
is a linear combination of 2 order statistics. Basically, I wrote down the
mathematical form of the joint pdf for the 2 order statistics (a function of
the sample size and the distribution parameters) then performed a
pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
random draws) over the region where the sample quantile exceeds the
threshold. In theory, this should work and it takes about 1000 times fewer
clock cycles to compute than the Brute Force approach. My problem is that
there is a significant discrepancy between the results using Brute Force and
using this more efficient approach that I have coded up. I believe that the
problem is numerical error but it could be some programming bug; regardless,
I have been unable to locate the source of this problem and have spent over
20 hours trying to identify it this weekend. Please, somebody help!!!

So, again, my question: is there code in R for quickly evaluating the CDF of
a Sample Quantile given the sample size and the parameters governing the
distribution of each iid point in the sample?


I think the answer to your question is no, but I think it's the wrong 
question.  Suppose Xm:n is the mth sample quantile in a sample of size 
n, and you want to calculate P(Xm:n  x).  Let X be a draw from the 
underlying distribution, and suppose P(X  x) = p.  Then the event Xm:n  x
is the same as the event that out of n independent draws of X, at least 
n-m+1 are bigger than x:  a binomial probability.  And R can calculate 
binomial probabilities using pbinom().


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CDF of Sample Quantile

2011-02-14 Thread Bentley Coffey
Yeah, I think that you don't understand me. You suggest:

1 - pnorm(Threshold,mean,sd) = Probability that rnorm(1,mean,sd)  Threshold


I want to know:

Probability that quantile(rnorm(n,mean,sd),prob)  Threshold

I use rnorm() to simulate a sample of size n and then I compute the
statistic from that sample using quantile(). Like all statistics, that
quantile stat (which is a weighted average of 2 order statistics) is a
function of the realized data and hence has a sampling distribution. I want
to compute the cdf of that sampling distribution. Even own the David and
Nagaraja _Order Statistics_ text in my library does not have a closed-form
cdf for that statistic...


On Mon, Feb 14, 2011 at 2:20 PM, Jonathan P Daily jda...@usgs.gov wrote:

 If I understand this, you have a value x, or a vector of values x, and you
 want to know the CDF that this value is drawn from a normal distribution?

 I assume you are drawing from rnorm for your simulations, so look at the
 other functions listed when you ?rnorm.

 HTH
 --
 Jonathan P. Daily
 Technician - USGS Leetown Science Center
 11649 Leetown Road
 Kearneysville WV, 25430
 (304) 724-4480
 Is the room still a room when its empty? Does the room,
  the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly

 r-help-boun...@r-project.org wrote on 02/14/2011 09:58:09 AM:

  [image removed]
 
  [R] CDF of Sample Quantile
 
  Bentley Coffey
 
  to:
 
  r-help
 
  02/14/2011 01:58 PM
 
  Sent by:
 
  r-help-boun...@r-project.org
 
  I need to calculate the probability that a sample quantile will exceed a
  threshold given the size of the iid sample and the parameters describing
 the
  distribution of each observation (normal, in my case). I can compute the
  probability with brute force simulation: simulate a size N sample, apply
 R's
  quantile() function on it, compare it to the threshold, replicate this
 MANY
  times, and count the number of times the sample quantile exceeded the
  threshold (dividing by the total number of replications yields the
  probability of interest). The problem is that the number of replications
  required to get sufficient precision (3 digits say) is so HUGE that this
  takes FOREVER. I have to perform this task so much in my script
 (searching
  over the sample size and repeated for several different distribution
  parameters) that it takes too many hours to run.
 
  I've searched for pre-existing code to do this in R and haven't found
  anything. Perhaps I'm missing something. Is anyone aware of an R
 function to
  compute this probability?
 
  I've tried writing my own code using the fact that R's quantile()
 function
  is a linear combination of 2 order statistics. Basically, I wrote down
 the
  mathematical form of the joint pdf for the 2 order statistics (a
 function of
  the sample size and the distribution parameters) then performed a
  pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's
  random draws) over the region where the sample quantile exceeds the
  threshold. In theory, this should work and it takes about 1000 times
 fewer
  clock cycles to compute than the Brute Force approach. My problem is
 that
  there is a significant discrepancy between the results using Brute Force
 and
  using this more efficient approach that I have coded up. I believe that
 the
  problem is numerical error but it could be some programming bug;
 regardless,
  I have been unable to locate the source of this problem and have spent
 over
  20 hours trying to identify it this weekend. Please, somebody help!!!
 
  So, again, my question: is there code in R for quickly evaluating the
 CDF of
  a Sample Quantile given the sample size and the parameters governing the
  distribution of each iid point in the sample?
 
  Grateful for any help,
 
  Bentley
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CDF calculation from kernel density estimates for a 324X 15 matrix

2010-03-26 Thread Tarana Solaiman
Hi,

I have a 324X15 matrix (No of years vs. heavy precipitation days) and I want
to calculate the cdf at 5 different data points for each row. I tried by the
following codes but it's not working.

heavyprec - read.csv (file=heavyprecdays_CSV.csv,header=TRUE,sep=,)
a - heavyprec
pdf - density (a, bw=SJ, kern= gaussian)
f - approxfun (pdf$x, pdf$y)
cdf - integrate (f, -Inf, 25.25) # I also have to calculate cdf at 30,
33, 37 heavy precipitation days

I know I am misunderstanding the code somewhere. But not being able to
figure it out. Basically I want to calculate bandwidth for each of the 324
rows separately and then I need to calculate the kernel sensity estimate for
each of 324 rows separately and cdf for all those rows separately at 5
different data points. So I guess I have to use for loops too?

This is my 1st  program in R and I am sorry if this problem sounds very
easy.

Any suggestions and help would be highly appreciated!
Thanks very much in advance!






-- 
Tarana Solaiman
PhD Candidate,
Dept. of Civil  Environmental Engineering
The University of Western Ontario
London, ON, Canada
Email: tarana.solai...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf

2009-10-14 Thread David Winsemius


On Oct 13, 2009, at 6:53 PM, Duncan Murdoch wrote:


On 13/10/2009 6:43 PM, David Winsemius wrote:

On Oct 13, 2009, at 5:12 PM, maram salem wrote:

Dear all,
I have the cdf of the following power fuction distribution:
F(y)=(y/350)^a   ,0y350,
where  a  is some parameter with range a0.
I want to use it as the argument of the discretize function of  
the  actuar package.


So I think I need to define this function to R so that if I  
entered  a=1, I get the following

F(y)=(y/350)
and if I entered a=4.5, I get the following
F(y) =(y/350)^4.5
... and so on

I've tried
a-vector(mode=numeric,length=1)
powercdf-function(a,y)
(y/350)^a

But when I typed: powercdf(10,y)
instead of getting : (y/350)^10 (which is what I want)
I got : object y not found ??

I want y to remain as it is, a continous variable, not for  
example  seq(0,350).

Thank you in advance.
If you want symbolic algebra then use a system designed for such.  
If  you invoke a function in R you need to give it arguments for   
evaluation  ... to numerical values.
If you want a function that returns a function, that is also  
possible.

 cdffn - function(y, arg) return( function(y) {y^arg} )


But don't do it like that.  If you do, you'll see things like this:

 power - 10
 cdf10 - cdffn(arg=power)  # don't need y as an argument.
 power - 1
 cdf10(1:10)
[1]  1  2  3  4  5  6  7  8  9 10

See my other post for a correct implementation using force().


Thank you, Duncan. I had seen your post (after hitting send) but had  
not realized how far out of itself a function might look for  
arguments. You did mention the crucial aspect of force but I didn't  
really get it until this further clue. Sometimes I'm a bit dense.


--
David



Duncan Murdoch


 cdf10 - cdffn(y, 10)
 cdf10(1:10)
 [1]   11024   59049 1048576  
9765625 60466176   282475249  1073741824

 [9]  3486784401 100
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cdf

2009-10-13 Thread maram salem
Dear all,
I have the cdf of the following power fuction distribution:
F(y)=(y/350)^a   ,0y350,
where  a  is some parameter with range a0.
I want to use it as the argument of the discretize function of the actuar 
package.

So I think I need to define this function to R so that if I entered a=1, I get 
the following
F(y)=(y/350)
and if I entered a=4.5, I get the following
F(y) =(y/350)^4.5
... and so on 

I've tried 
a-vector(mode=numeric,length=1)
powercdf-function(a,y)
(y/350)^a
 
But when I typed: powercdf(10,y)
instead of getting : (y/350)^10 (which is what I want)
I got : object y not found ??
 
I want y to remain as it is, a continous variable, not for example seq(0,350).
Thank you in advance.
Maram


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf

2009-10-13 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of maram salem
 Sent: Tuesday, October 13, 2009 2:13 PM
 To: r-help
 Subject: [R] cdf
 
 Dear all,
 I have the cdf of the following power fuction distribution:
 F(y)=(y/350)^a   ,0y350,
 where  a  is some parameter with range a0.
 I want to use it as the argument of the discretize function of the actuar 
 package.
 
 So I think I need to define this function to R so that if I entered a=1, I 
 get the
 following
 F(y)=(y/350)
 and if I entered a=4.5, I get the following
 F(y) =(y/350)^4.5
 ... and so on
 
 I've tried
 a-vector(mode=numeric,length=1)
 powercdf-function(a,y)
 (y/350)^a
 
 But when I typed: powercdf(10,y)
 instead of getting : (y/350)^10 (which is what I want)
 I got : object y not found ??
 
 I want y to remain as it is, a continous variable, not for example seq(0,350).
 Thank you in advance.
 Maram
 

You say I want y to remain as it is,   where is y defined outside of your 
function?

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf

2009-10-13 Thread Duncan Murdoch

maram salem wrote:

Dear all,
I have the cdf of the following power fuction distribution:
F(y)=(y/350)^a   ,0y350,
where  a  is some parameter with range a0.
I want to use it as the argument of the discretize function of the actuar 
package.

So I think I need to define this function to R so that if I entered a=1, I get 
the following
F(y)=(y/350)
and if I entered a=4.5, I get the following
F(y) =(y/350)^4.5
... and so on 

I've tried 
a-vector(mode=numeric,length=1)

powercdf-function(a,y)
(y/350)^a
  


You want to return a function of y, so do it like this:

powercdf - function(a) {
 force(a)   # crucial, so that a gets evaluated now.
 return( function(y) (y/350)^a )
}

Duncan Murdoch
 
But when I typed: powercdf(10,y)

instead of getting : (y/350)^10 (which is what I want)
I got : object y not found ??
 
I want y to remain as it is, a continous variable, not for example seq(0,350).

Thank you in advance.
Maram


  
	[[alternative HTML version deleted]]


  



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf

2009-10-13 Thread David Winsemius


On Oct 13, 2009, at 5:12 PM, maram salem wrote:


Dear all,
I have the cdf of the following power fuction distribution:
F(y)=(y/350)^a   ,0y350,
where  a  is some parameter with range a0.
I want to use it as the argument of the discretize function of the  
actuar package.


So I think I need to define this function to R so that if I entered  
a=1, I get the following

F(y)=(y/350)
and if I entered a=4.5, I get the following
F(y) =(y/350)^4.5
... and so on

I've tried
a-vector(mode=numeric,length=1)
powercdf-function(a,y)
(y/350)^a

But when I typed: powercdf(10,y)
instead of getting : (y/350)^10 (which is what I want)
I got : object y not found ??

I want y to remain as it is, a continous variable, not for example  
seq(0,350).

Thank you in advance.


If you want symbolic algebra then use a system designed for such. If  
you invoke a function in R you need to give it arguments for  
evaluation  ... to numerical values.


If you want a function that returns a function, that is also possible.

 cdffn - function(y, arg) return( function(y) {y^arg} )
 cdf10 - cdffn(y, 10)
 cdf10(1:10)
 [1]   11024   59049 1048576 9765625 
60466176   282475249  1073741824

 [9]  3486784401 100


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdf

2009-10-13 Thread Duncan Murdoch

On 13/10/2009 6:43 PM, David Winsemius wrote:

On Oct 13, 2009, at 5:12 PM, maram salem wrote:


Dear all,
I have the cdf of the following power fuction distribution:
F(y)=(y/350)^a   ,0y350,
where  a  is some parameter with range a0.
I want to use it as the argument of the discretize function of the  
actuar package.


So I think I need to define this function to R so that if I entered  
a=1, I get the following

F(y)=(y/350)
and if I entered a=4.5, I get the following
F(y) =(y/350)^4.5
... and so on

I've tried
a-vector(mode=numeric,length=1)
powercdf-function(a,y)
(y/350)^a

But when I typed: powercdf(10,y)
instead of getting : (y/350)^10 (which is what I want)
I got : object y not found ??

I want y to remain as it is, a continous variable, not for example  
seq(0,350).

Thank you in advance.


If you want symbolic algebra then use a system designed for such. If  
you invoke a function in R you need to give it arguments for  
evaluation  ... to numerical values.


If you want a function that returns a function, that is also possible.

  cdffn - function(y, arg) return( function(y) {y^arg} )


But don't do it like that.  If you do, you'll see things like this:

 power - 10
 cdf10 - cdffn(arg=power)  # don't need y as an argument.
 power - 1
 cdf10(1:10)
 [1]  1  2  3  4  5  6  7  8  9 10

See my other post for a correct implementation using force().

Duncan Murdoch


  cdf10 - cdffn(y, 10)
  cdf10(1:10)
  [1]   11024   59049 1048576 9765625 
60466176   282475249  1073741824

  [9]  3486784401 100


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.