[Rd] [r-devel] integrate over an infinite region produces wrong results depending on scaling

Andreï V . Kostyrka Sun, 14 Apr 2019 05:14:31 -0700

Dear all,

This is the first time I am posting to the r-devel list. On Stack Overflow, they suggested that the strange behaviour of integrate() was more bug-like. I am providing a short version of the question (full one with plots: https://stackoverflow.com/q/55639401).

Suppose one wants to integrate a function that is just a product of two density functions (like gamma). The support of the random variable is (-Inf, 0]. The scale parameter of the distribution is quite small (around 0.01), so the standard integration routine would often fail to integrate a function that is non-zero only on a very small section of the negative line (like [-0.02, -0.01], where it takes huge values, and almost 0 everywhere else). R’s integrate would often return the machine epsilon as a result. So I stretch the function around zero by the inverse of the scale parameter, compute the integral, and then divide it by the scale. Sometimes this rescaling also failed, so I did both if the first result was very small.
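
For reference, the rescaling described above is a change of variable u = s * x, under which the integral of g(u) over (-Inf, 0] equals the integral of g(s * x) * s over the same range. A minimal sketch of that identity, using a single gamma density mirrored onto the negative line; the names `g`, `g_scaled`, `shape0` and `scale0` are hypothetical and chosen only for illustration:

```r
# Hypothetical parameters, of the same order as those in this post
shape0 <- 5.704
scale0 <- 0.0027

# A gamma density mirrored onto (-Inf, 0]; its integral is exactly 1
g <- function(u) dgamma(-u, shape = shape0, scale = scale0)

# Change of variable u = s * x, with the Jacobian factor s folded
# into the integrand so that no correction is needed afterwards
g_scaled <- function(x, s) g(x * s) * s

# With s = scale0 the integrand varies on a scale of order 1
res <- integrate(g_scaled, -Inf, 0, s = scale0)
res$value
```

With `s = scale0` the stretched integrand is a gamma density with unit scale, so the result can be compared against the known value of 1.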

Today, integration of the rescaled function suddenly yielded a value of 1.5 instead of 3.5 (not even zero). The MWE is below.


cons <- -0.020374721416129591
sc <- 0.00271245601724757383
sh <- 5.704

f <- function(x, numstab = 1) {
  dgamma(cons - x * numstab, shape = sh, scale = sc) *
    dgamma(-x * numstab, shape = sh, scale = sc) * numstab
}


curve(f, -0.06, 0, n = 501, main = "Unscaled f", bty = "n")
curve(f(x, sc), -0.06 / sc, 0, n = 501, main = "Scaled f", bty = "n")

sum(f(seq(-0.08, 0, 1e-6))) * 1e-6 #  Checking by summation: 3.575294
sum(f(seq(-30, 0, 1e-4), numstab = sc)) * 1e-4 # True value, 3.575294
str(integrate(f, -Inf, 0)) # Gives 3.575294
# $ value       : num 3.58
# $ abs.error   : num 1.71e-06
# $ subdivisions: int 10
str(integrate(f, -Inf, 0, numstab = sc))
# $ value       : num 1.5 # What?!
# $ abs.error   : num 0.000145 # What?!
# $ subdivisions: int 2

It stops at just two subdivisions! The problem is, I cannot try various stabilising multipliers for the function because I have to compute this integral thousands of times for thousands of parameter values on thousands of sample windows for hundreds of models, so even on the supercomputer cluster, this takes weeks. Besides that, reducing rel.tol to just 1e-5 or 1e-6 helped a bit, but I am not sure whether this guarantees success (and reducing it to 1e-7 slowed down the computations in some cases). I have also looked at the Fortran code of the quadrature just to see the integration rule, and was wondering.
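
One cheap consistency check, sketched here as an assumption and not as a tested solution: the integrand is zero for x > cons and is bounded by a gamma tail on the left, so a finite interval covering essentially all the mass can be integrated and compared with the infinite-range result. The cutoff probability 1e-12 and the names `lower` and `fin` are illustrative choices, and nothing guarantees that this behaves well for other parameter values:

```r
# Constants and integrand as in the MWE
cons <- -0.020374721416129591
sc <- 0.00271245601724757383
sh <- 5.704
f <- function(x, numstab = 1) {
  dgamma(cons - x * numstab, shape = sh, scale = sc) *
    dgamma(-x * numstab, shape = sh, scale = sc) * numstab
}

# f vanishes for x > cons because dgamma(cons - x, ...) is zero there,
# and the same factor is negligible once cons - x exceeds an extreme
# upper quantile (the 1e-12 cutoff is an arbitrary illustrative choice)
lower <- cons - qgamma(1e-12, shape = sh, scale = sc, lower.tail = FALSE)

# Finite-range quadrature on the unscaled function
fin <- integrate(f, lower, cons)
fin$value
```

A large disagreement between `fin$value` and the value from `integrate(f, -Inf, 0, ...)` would flag a suspicious result, at the cost of one extra quadrature call per integral.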

How can I make sure that the integration routine will not produce such wrong results for such a function, and the integration will still be fast?


Yours sincerely,
Andreï V. Kostyrka

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
