On Dec 14, 2011, at 16:19, John C Nash wrote:

> 
> Following this thread, I wondered why nobody tried cumsum to see where
> the integer overflow occurs. On the shorter xx vector in the little
> script below I get a message:
> 
> Warning message:
> Integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
> 
> But sum() does not give such a warning, which I believe is the point of
> contention. Since cumsum() does manage to give such a warning, and show
> where the overflow occurs, should sum() not be able to do so? For the
> record, I don't class the non-zero answer as an error in itself. I
> regard the failure to warn as the issue.

It (sum) does warn if you take the two "halves" separately. The issue is that 
the overflow is only detected at the end of the summation, when the result is 
saved back into an integer (which of course happens for every intermediate sum 
in cumsum):

> x <- c(rep(1800000003L, 10000000), -rep(1200000002L, 15000000))
> sum(x[1:10000000])
[1] NA
Warning message:
In sum(x[1:1e+07]) : Integer overflow - use sum(as.numeric(.))
> sum(x[10000001:25000000])
[1] NA
Warning message:
In sum(x[10000001:2.5e+07]) : Integer overflow - use sum(as.numeric(.))
> sum(x)
[1] 4996000
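
(Tying this back to the cumsum() point: since cumsum() checks every running
total, something like which(is.na(cumsum(x)))[1] should be able to show where
the overflow first happens on the x above; presumably position 2, since
1800000003 + 1800000003 already exceeds INT_MAX.)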

There's a pretty easy fix, essentially to move

    if(s > INT_MAX || s < R_INT_MIN){
        warningcall(call, _("Integer overflow - use sum(as.numeric(.))"));
        *value = NA_INTEGER;
    }

inside the summation loop. Obviously, there's a speed penalty from two FP 
comparisons per element, but I wouldn't know whether it matters in practice for 
anyone.
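
To make that concrete, here is a rough sketch of what the patched loop could
look like. This is not the actual R source: the function name and signature
are invented for illustration, na.rm handling is left out, and R_INT_MIN and
_() normally come from R's private headers (fallback definitions are given so
the sketch stands on its own).

    #include <limits.h>        /* INT_MAX, INT_MIN */
    #include <Rinternals.h>    /* SEXP, R_len_t, NA_INTEGER, warningcall() */

    #ifndef R_INT_MIN
    # define R_INT_MIN (1 + INT_MIN)   /* normally from R's private Defn.h */
    #endif
    #ifndef _
    # define _(String) (String)        /* stand-in for R's gettext macro */
    #endif

    /* Illustrative only: an isum()-style loop with the overflow check moved
       inside, so the warning fires as soon as the running total leaves the
       representable integer range. */
    static void isum_checked(const int *x, R_len_t n, int *value, SEXP call)
    {
        double s = 0.0;        /* FP accumulator, deliberately wider than int */
        for (R_len_t i = 0; i < n; i++) {
            if (x[i] == NA_INTEGER) {          /* na.rm handling omitted */
                *value = NA_INTEGER;
                return;
            }
            s += x[i];
            /* the two extra FP comparisons per element */
            if (s > INT_MAX || s < R_INT_MIN) {
                warningcall(call, _("Integer overflow - use sum(as.numeric(.))"));
                *value = NA_INTEGER;
                return;
            }
        }
        *value = (int) s;
    }

As sketched, it bails out at the first overflow; one could instead keep
accumulating and warn only once, but either way the sum(x) case above would
produce a warning rather than passing silently.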

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
