[R] using integrate in a function definition

2007-02-23 Thread theo borm
Dear list members,

I'm quite new to R, and though I tried to find the answer to my probably 
very basic question through the available resources (website, mailing 
list archives, docs, google), I've not found it.


If I try to use the integrate function from within my own functions, 
my functions seem to misbehave in some contexts. The following example 
is a bit silly, but illustrates what's happening:


The following behaves as expected:

  jjj-function(www) {2*integrate(dnorm,0,www)$value}
  kkk-function(www) {2*(pnorm(www)-0.5)}
  jjj(2)
[1] 0.9544997
  kkk(2)
[1] 0.9544997



However, if I do:
  plot(jjj,0,5)
Error in xy.coords(x, y, xlabel, ylabel, log) :
 'x' and 'y' lengths differ

whereas
  plot(kkk,0,5)
produces a nice plot.




  xxx-seq(0:5)
  yyy-jjj(xxx)
  zzz-kkk(xxx)

produces no errors, but:
  yyy
[1] 0.6826895
  zzz
[1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000




Why is this? Is this some R language feature that I've completely missed?




Ultimately my problem is that I have a mathematical function describing 
a distribution, and I want to use this in a Kolmogorov-Smirnov test 
where I need a cumulative distribution function. If I use the following 
(synthetic) dataset with ks.test with either the jjj or kkk 
function, I get:

  ppp-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853,
0.72023272,0.40245392,0.16002948,0.24203950,0.44029851,
0.34830446,1.66967152,1.71609574,1.17540830,0.22306379,
0.64382628,0.37382795,0.84361547,1.92138362,0.02844235,
0.11238473,0.21237557,1.01058073,2.33108243,1.36034473,
1.88951679,0.18230647,0.96571916,0.91239760,2.05574766,
0.33681130,2.76006257,0.83952491,1.24038831,1.46199671,
0.24694163,0.01565159,0.94816108,1.04642673,0.36556444,
2.20760578,1.59518590,0.83951030,0.07113652,0.97422886,
0.59835793,0.08383890,1.09180853,0.43508943,0.35368637,
0.75805978,0.12790161,1.56798563,1.68669770,0.56168021)
  ks.test(ppp,kkk)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.1013, p-value = 0.5895
alternative hypothesis: two.sided

[ which seems correct as the samples come from abs(rnorm()) ]

  ks.test(ppp,jjj)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.9875, p-value  2.2e-16
alternative hypothesis: two.sided

[ which is *incorrect* ]


Note: This example is artificial as I have used a function for which I 
know the integral; my real problem involves a distribution that I can 
only integrate numerically.

How would I go about to solve this problem?



With kind regards,

Theo.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using integrate in a function definition

2007-02-23 Thread Ravi Varadhan
Your function jjj is not vectorized.

Try this:

jjj - function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value)
plot(jjj, 0, 5)

It should work.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of theo borm
Sent: Friday, February 23, 2007 1:43 PM
To: r-help@stat.math.ethz.ch
Subject: [R] using integrate in a function definition

Dear list members,

I'm quite new to R, and though I tried to find the answer to my probably 
very basic question through the available resources (website, mailing 
list archives, docs, google), I've not found it.


If I try to use the integrate function from within my own functions, 
my functions seem to misbehave in some contexts. The following example 
is a bit silly, but illustrates what's happening:


The following behaves as expected:

  jjj-function(www) {2*integrate(dnorm,0,www)$value}
  kkk-function(www) {2*(pnorm(www)-0.5)}
  jjj(2)
[1] 0.9544997
  kkk(2)
[1] 0.9544997



However, if I do:
  plot(jjj,0,5)
Error in xy.coords(x, y, xlabel, ylabel, log) :
 'x' and 'y' lengths differ

whereas
  plot(kkk,0,5)
produces a nice plot.




  xxx-seq(0:5)
  yyy-jjj(xxx)
  zzz-kkk(xxx)

produces no errors, but:
  yyy
[1] 0.6826895
  zzz
[1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000




Why is this? Is this some R language feature that I've completely missed?




Ultimately my problem is that I have a mathematical function describing 
a distribution, and I want to use this in a Kolmogorov-Smirnov test 
where I need a cumulative distribution function. If I use the following 
(synthetic) dataset with ks.test with either the jjj or kkk 
function, I get:

  ppp-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853,
0.72023272,0.40245392,0.16002948,0.24203950,0.44029851,
0.34830446,1.66967152,1.71609574,1.17540830,0.22306379,
0.64382628,0.37382795,0.84361547,1.92138362,0.02844235,
0.11238473,0.21237557,1.01058073,2.33108243,1.36034473,
1.88951679,0.18230647,0.96571916,0.91239760,2.05574766,
0.33681130,2.76006257,0.83952491,1.24038831,1.46199671,
0.24694163,0.01565159,0.94816108,1.04642673,0.36556444,
2.20760578,1.59518590,0.83951030,0.07113652,0.97422886,
0.59835793,0.08383890,1.09180853,0.43508943,0.35368637,
0.75805978,0.12790161,1.56798563,1.68669770,0.56168021)
  ks.test(ppp,kkk)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.1013, p-value = 0.5895
alternative hypothesis: two.sided

[ which seems correct as the samples come from abs(rnorm()) ]

  ks.test(ppp,jjj)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.9875, p-value  2.2e-16
alternative hypothesis: two.sided

[ which is *incorrect* ]


Note: This example is artificial as I have used a function for which I 
know the integral; my real problem involves a distribution that I can 
only integrate numerically.

How would I go about to solve this problem?



With kind regards,

Theo.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using integrate in a function definition

2007-02-23 Thread Alberto Monteiro
theo borm wrote: 
 
 jjj-function(www) {2*integrate(dnorm,0,www)$value} 
 kkk-function(www) {2*(pnorm(www)-0.5)} 

 xxx-seq(0:5) 
 yyy-jjj(xxx) 
 zzz-kkk(xxx) 
 
 produces no errors, but: 
 yyy 
 [1] 0.6826895 
 zzz 
 [1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000 
 
 Why is this? Is this some R language feature that I've completely missed? 
 
Yes. Some functions work on vectors (and matrices), so 
when you give a vector, it returns a vector. This is true for 
most common functions (sin, cos), arithmetic operations (with 
the caveat that different dimensions for the arguments may cause 
unexpected outcomes) and some internal functions (dnorm, pnorm). 
So, if you write sin(0:10) or dnorm((-3):3), you get a vector. 

Some other functions don't, and this is the case with integrate. 
For example: 

fff - function(x) x 
integrate(fff, 0, 1)  # ok 
integrate(fff, 0, 1:5) # will integrate from 0 to 1 and ignore 2:5 

'plot' will probably fall into some code that uses this 
vector-in-vector-out hypothesis, and then fail when the size 
of x differs from the size of y. 

Alberto Monteiro 

PS: fff - function(x) 1 
integrate(fff, 0, 1)  # error. why?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using integrate in a function definition

2007-02-23 Thread theo borm
Ravi Varadhan wrote:

Your function jjj is not vectorized.

Try this:

jjj - function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value)
plot(jjj, 0, 5)

It should work.
  

Yes it does. Thanks!

Thinking of it, it now starts to make some sort of sense that integrate 
should return a scalar; the result is really a list of which I only 
used the value component. And now I also understand the x,q: vector 
of quantiles. phrases in the documentation of dnorm.

Now if I do something silly:
  jjj-function(www) {2*integrate(dnorm,0,www)$value+sin(www)-sin(www)}
then
  jjj(1:5)
[1] 0.6826895 0.6826895 0.6826895 0.6826895 0.6826895

Evidently the inherritance of being vectorized (which was why my kkk 
function worked) could lead to some unexpected (and probably hard to 
debug) results :-/

Thanks.


with kind regards,

Theo.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using integrate in a function definition

2007-02-23 Thread theo borm
Hi,

Many thanks for the explanation.

Alberto Monteiro wrote:


PS: fff - function(x) 1 
integrate(fff, 0, 1)  # error. why?


Guess: because integrate itself expects a vectorized function ?

  fff(1:5)
[1] 1
  ggg-function(x) { sapply(x, function(x)1) }
  ggg(1:5)
[1] 1 1 1 1 1
  integrate(ggg,0,1)
1 with absolute error  1.1e-14
  hhh-function(x) 1+0*x
  integrate(hhh, 0, 1)
1 with absolute error  1.1e-14

I sense a certain lack of intuitiveness here :-/


regards, Theo

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.