[R] using integrate in a function definition
Dear list members, I'm quite new to R, and though I tried to find the answer to my probably very basic question through the available resources (website, mailing list archives, docs, google), I've not found it. If I try to use the integrate function from within my own functions, my functions seem to misbehave in some contexts. The following example is a bit silly, but illustrates what's happening: The following behaves as expected: jjj-function(www) {2*integrate(dnorm,0,www)$value} kkk-function(www) {2*(pnorm(www)-0.5)} jjj(2) [1] 0.9544997 kkk(2) [1] 0.9544997 However, if I do: plot(jjj,0,5) Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ whereas plot(kkk,0,5) produces a nice plot. xxx-seq(0:5) yyy-jjj(xxx) zzz-kkk(xxx) produces no errors, but: yyy [1] 0.6826895 zzz [1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000 Why is this? Is this some R language feature that I've completely missed? Ultimately my problem is that I have a mathematical function describing a distribution, and I want to use this in a Kolmogorov-Smirnov test where I need a cumulative distribution function. If I use the following (synthetic) dataset with ks.test with either the jjj or kkk function, I get: ppp-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853, 0.72023272,0.40245392,0.16002948,0.24203950,0.44029851, 0.34830446,1.66967152,1.71609574,1.17540830,0.22306379, 0.64382628,0.37382795,0.84361547,1.92138362,0.02844235, 0.11238473,0.21237557,1.01058073,2.33108243,1.36034473, 1.88951679,0.18230647,0.96571916,0.91239760,2.05574766, 0.33681130,2.76006257,0.83952491,1.24038831,1.46199671, 0.24694163,0.01565159,0.94816108,1.04642673,0.36556444, 2.20760578,1.59518590,0.83951030,0.07113652,0.97422886, 0.59835793,0.08383890,1.09180853,0.43508943,0.35368637, 0.75805978,0.12790161,1.56798563,1.68669770,0.56168021) ks.test(ppp,kkk) One-sample Kolmogorov-Smirnov test data: ppp D = 0.1013, p-value = 0.5895 alternative hypothesis: two.sided [ which seems correct as the samples come from abs(rnorm()) ] ks.test(ppp,jjj) One-sample Kolmogorov-Smirnov test data: ppp D = 0.9875, p-value 2.2e-16 alternative hypothesis: two.sided [ which is *incorrect* ] Note: This example is artificial as I have used a function for which I know the integral; my real problem involves a distribution that I can only integrate numerically. How would I go about to solve this problem? With kind regards, Theo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using integrate in a function definition
Your function jjj is not vectorized. Try this: jjj - function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value) plot(jjj, 0, 5) It should work. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of theo borm Sent: Friday, February 23, 2007 1:43 PM To: r-help@stat.math.ethz.ch Subject: [R] using integrate in a function definition Dear list members, I'm quite new to R, and though I tried to find the answer to my probably very basic question through the available resources (website, mailing list archives, docs, google), I've not found it. If I try to use the integrate function from within my own functions, my functions seem to misbehave in some contexts. The following example is a bit silly, but illustrates what's happening: The following behaves as expected: jjj-function(www) {2*integrate(dnorm,0,www)$value} kkk-function(www) {2*(pnorm(www)-0.5)} jjj(2) [1] 0.9544997 kkk(2) [1] 0.9544997 However, if I do: plot(jjj,0,5) Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ whereas plot(kkk,0,5) produces a nice plot. xxx-seq(0:5) yyy-jjj(xxx) zzz-kkk(xxx) produces no errors, but: yyy [1] 0.6826895 zzz [1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000 Why is this? Is this some R language feature that I've completely missed? Ultimately my problem is that I have a mathematical function describing a distribution, and I want to use this in a Kolmogorov-Smirnov test where I need a cumulative distribution function. If I use the following (synthetic) dataset with ks.test with either the jjj or kkk function, I get: ppp-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853, 0.72023272,0.40245392,0.16002948,0.24203950,0.44029851, 0.34830446,1.66967152,1.71609574,1.17540830,0.22306379, 0.64382628,0.37382795,0.84361547,1.92138362,0.02844235, 0.11238473,0.21237557,1.01058073,2.33108243,1.36034473, 1.88951679,0.18230647,0.96571916,0.91239760,2.05574766, 0.33681130,2.76006257,0.83952491,1.24038831,1.46199671, 0.24694163,0.01565159,0.94816108,1.04642673,0.36556444, 2.20760578,1.59518590,0.83951030,0.07113652,0.97422886, 0.59835793,0.08383890,1.09180853,0.43508943,0.35368637, 0.75805978,0.12790161,1.56798563,1.68669770,0.56168021) ks.test(ppp,kkk) One-sample Kolmogorov-Smirnov test data: ppp D = 0.1013, p-value = 0.5895 alternative hypothesis: two.sided [ which seems correct as the samples come from abs(rnorm()) ] ks.test(ppp,jjj) One-sample Kolmogorov-Smirnov test data: ppp D = 0.9875, p-value 2.2e-16 alternative hypothesis: two.sided [ which is *incorrect* ] Note: This example is artificial as I have used a function for which I know the integral; my real problem involves a distribution that I can only integrate numerically. How would I go about to solve this problem? With kind regards, Theo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using integrate in a function definition
theo borm wrote: jjj-function(www) {2*integrate(dnorm,0,www)$value} kkk-function(www) {2*(pnorm(www)-0.5)} xxx-seq(0:5) yyy-jjj(xxx) zzz-kkk(xxx) produces no errors, but: yyy [1] 0.6826895 zzz [1] 0.6826895 0.9544997 0.9973002 0.367 0.994 1.000 Why is this? Is this some R language feature that I've completely missed? Yes. Some functions work on vectors (and matrices), so when you give a vector, it returns a vector. This is true for most common functions (sin, cos), arithmetic operations (with the caveat that different dimensions for the arguments may cause unexpected outcomes) and some internal functions (dnorm, pnorm). So, if you write sin(0:10) or dnorm((-3):3), you get a vector. Some other functions don't, and this is the case with integrate. For example: fff - function(x) x integrate(fff, 0, 1) # ok integrate(fff, 0, 1:5) # will integrate from 0 to 1 and ignore 2:5 'plot' will probably fall into some code that uses this vector-in-vector-out hypothesis, and then fail when the size of x differs from the size of y. Alberto Monteiro PS: fff - function(x) 1 integrate(fff, 0, 1) # error. why? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using integrate in a function definition
Ravi Varadhan wrote: Your function jjj is not vectorized. Try this: jjj - function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value) plot(jjj, 0, 5) It should work. Yes it does. Thanks! Thinking of it, it now starts to make some sort of sense that integrate should return a scalar; the result is really a list of which I only used the value component. And now I also understand the x,q: vector of quantiles. phrases in the documentation of dnorm. Now if I do something silly: jjj-function(www) {2*integrate(dnorm,0,www)$value+sin(www)-sin(www)} then jjj(1:5) [1] 0.6826895 0.6826895 0.6826895 0.6826895 0.6826895 Evidently the inherritance of being vectorized (which was why my kkk function worked) could lead to some unexpected (and probably hard to debug) results :-/ Thanks. with kind regards, Theo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using integrate in a function definition
Hi, Many thanks for the explanation. Alberto Monteiro wrote: PS: fff - function(x) 1 integrate(fff, 0, 1) # error. why? Guess: because integrate itself expects a vectorized function ? fff(1:5) [1] 1 ggg-function(x) { sapply(x, function(x)1) } ggg(1:5) [1] 1 1 1 1 1 integrate(ggg,0,1) 1 with absolute error 1.1e-14 hhh-function(x) 1+0*x integrate(hhh, 0, 1) 1 with absolute error 1.1e-14 I sense a certain lack of intuitiveness here :-/ regards, Theo __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.