[R] puzzle with integrate over infinite range
Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
You could try pnorm also: shiftedGaussR - function(x0 = 500) { sd - 100/sqrt(2) int - pnorm(0, x0, sd, lower.tail=FALSE, log.p=TRUE) exp(int + log(sd) + 0.5 * log(2*pi)) } shiftedGaussR(500) [1] 177.2454 shiftedGauss(500) [1] 177.2454 -Matt On Tue, 2010-09-21 at 09:38 -0400, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
You are dealing with functions that are non-zero over a very small interval, so you have to be very careful. There is no method that is going to be totally foolproof. Having said that, I have always felt that the default tolerance in integrate is too liberal (i.e. too large). I always use rel.tol of 1.e-08 (roughly, sqrt(machine epsilon)) in my computations, and I also increase subdivisions to 500. Ravi. From: baptiste Auguié [mailto:baptiste.aug...@googlemail.com] Sent: Tuesday, September 21, 2010 9:58 AM To: Ravi Varadhan Cc: 'baptiste auguie'; 'r-help' Subject: Re: [R] puzzle with integrate over infinite range Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
Thanks, I'll do that too from now on. It strikes me that in a case such as this one it may be safer to use a truncated, finite interval around the region where the integrand is non-zero, rather than following the advice of ?integrate to use Inf as integration limit. At least one wouldn't risk to get an entirely wrong result depending on a choice of rel.tol. Regarding this parameter, is there a simple interpretation of how it affected the result in the context of my example? Thanks again, baptiste On Sep 21, 2010, at 4:04 PM, Ravi Varadhan wrote: You are dealing with functions that are non-zero over a very small interval, so you have to be very careful. There is no method that is going to be totally foolproof. Having said that, I have always felt that the default tolerance in integrate is too liberal (i.e. too large). I always use rel.tol of 1.e-08 (roughly, sqrt(machine epsilon)) in my computations, and I also increase subdivisions to 500. Ravi. From: baptiste Auguié [mailto:baptiste.aug...@googlemail.com] Sent: Tuesday, September 21, 2010 9:58 AM To: Ravi Varadhan Cc: 'baptiste auguie'; 'r-help' Subject: Re: [R] puzzle with integrate over infinite range Hi, Thanks for the tip, but it's still mysterious to me. Reading ?integrate did not give me much hint as to what relative accuracy means in this context. I looked at the source of integrate.c but it's still not clear to me how the default value of rel.tol (10^-4 for me) is not enough to prevent a completely wrong answer (the error is much larger than this). Obviously, I'm worried now that I may not always choose a good value of ref.tol, if picked arbitrarily without my understanding of what it means. Thanks, baptiste On Sep 21, 2010, at 3:38 PM, Ravi Varadhan wrote: There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance: # I scaled your function to a proper Gaussian density shiftedGauss - function(x0=500){ integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Hope this helps, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following, if I shift the Gauss function by some amount the integral should not be affected, shiftedGauss - function(x0=500){ integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value } shift - seq(500, 800, by=10) plot(shift, sapply(shift, shiftedGauss)) Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome? Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
On Tue, 21 Sep 2010, baptiste Auguié wrote: Thanks, I'll do that too from now on. It strikes me that in a case such as this one it may be safer to use a truncated, finite interval around the region where the integrand is non-zero, rather than following the advice of ?integrate to use Inf as integration limit. At least one wouldn't risk to get an entirely wrong result depending on a choice of rel.tol. Regarding this parameter, is there a simple interpretation of how it affected the result in the context of my example? Not really. If you know where the integrand is non-zero then you can shift it so that integrate() can handle it. If you don't know then you can't get the truncated interval right. The truncation approach works well for the Normal density because it it is non-negative, symmetric, and has nearly bounded support. The truncation error goes down extremely fast and if the mode of the density is in the center of the interval then all the mass can easily be found. If you have a function with multiple modes and heavier tails it is harder to get an interval that is large enough to make the truncation error small, and still allows the integrate() function to find all the mass. -thomas Thomas Lumley Professor of Biostatistics University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
I see, thank you. I'm still worried by the very dramatic error I obtained just from shifting so slightly the support of the integrand, it took me a while to figure what happened even with this basic example (I knew the integral couldn't be so small!). For a general integration in [0, infty), there must be a link between the number of quadrature points, the transformation applied to the integrand, and the region (position, width) where the measurable integrand needs to be in order to be sampled by the quadrature. I guess this kind of info lies deep inside the source code though. Thanks, baptiste On 21 September 2010 19:00, Thomas Lumley tlum...@u.washington.edu wrote: On Tue, 21 Sep 2010, baptiste Auguié wrote: Thanks, I'll do that too from now on. It strikes me that in a case such as this one it may be safer to use a truncated, finite interval around the region where the integrand is non-zero, rather than following the advice of ?integrate to use Inf as integration limit. At least one wouldn't risk to get an entirely wrong result depending on a choice of rel.tol. Regarding this parameter, is there a simple interpretation of how it affected the result in the context of my example? Not really. If you know where the integrand is non-zero then you can shift it so that integrate() can handle it. If you don't know then you can't get the truncated interval right. The truncation approach works well for the Normal density because it it is non-negative, symmetric, and has nearly bounded support. The truncation error goes down extremely fast and if the mode of the density is in the center of the interval then all the mass can easily be found. If you have a function with multiple modes and heavier tails it is harder to get an interval that is large enough to make the truncation error small, and still allows the integrate() function to find all the mass. -thomas Thomas Lumley Professor of Biostatistics University of Washington, Seattle -- Dr. Baptiste Auguié Departamento de Química Física, Universidade de Vigo, Campus Universitario, 36310, Vigo, Spain tel: +34 9868 18617 http://webs.uvigo.es/coloides __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzle with integrate over infinite range
On 21/09/2010 1:29 PM, baptiste auguie wrote: I see, thank you. I'm still worried by the very dramatic error I obtained just from shifting so slightly the support of the integrand, it took me a while to figure what happened even with this basic example (I knew the integral couldn't be so small!). For a general integration in [0, infty), there must be a link between the number of quadrature points, the transformation applied to the integrand, and the region (position, width) where the measurable integrand needs to be in order to be sampled by the quadrature. I guess this kind of info lies deep inside the source code though. You probably do have to look at the source code to find out the rule, but it is very easy to find out how it applies for any particular set of parameters: Just have your objective function print or save all the points where it is asked to evaluate itself. Plotting those will be very informative. Duncan Murdoch Thanks, baptiste On 21 September 2010 19:00, Thomas Lumleytlum...@u.washington.edu wrote: On Tue, 21 Sep 2010, baptiste Auguié wrote: Thanks, I'll do that too from now on. It strikes me that in a case such as this one it may be safer to use a truncated, finite interval around the region where the integrand is non-zero, rather than following the advice of ?integrate to use Inf as integration limit. At least one wouldn't risk to get an entirely wrong result depending on a choice of rel.tol. Regarding this parameter, is there a simple interpretation of how it affected the result in the context of my example? Not really. If you know where the integrand is non-zero then you can shift it so that integrate() can handle it. If you don't know then you can't get the truncated interval right. The truncation approach works well for the Normal density because it it is non-negative, symmetric, and has nearly bounded support. The truncation error goes down extremely fast and if the mode of the density is in the center of the interval then all the mass can easily be found. If you have a function with multiple modes and heavier tails it is harder to get an interval that is large enough to make the truncation error small, and still allows the integrate() function to find all the mass. -thomas Thomas Lumley Professor of Biostatistics University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.