Re: [Rd] code for sum function

2019-02-21 Thread Gabriel Becker
Hi all,

>From what I can see from my checkout of the Rsources (in src/main/summary.c
as pointed out by others) sums are calculated the "naive" way (see rsum  c
function) but  means are actually calculated something akin to the Neumaier
way (see real_mean c function).

Just an fyi.
~G

On Thu, Feb 21, 2019 at 9:03 AM Ben Bolker  wrote:

> Specifically: https://svn.r-project.org/R/trunk/src/main/summary.c
>
> And if you don't want to deal with Subversion, you can look at the
> read-only github mirror:
>
>
> https://github.com/wch/r-source/blob/e5b21d0397c607883ff25cca379687b86933d730/src/main/summary.c#L115-L131
>
> On Thu, Feb 21, 2019 at 11:57 AM David Winsemius 
> wrote:
> >
> >
> > On 2/20/19 2:55 PM, Rampal Etienne wrote:
> > > Dear Tomas,
> > >
> > > Where do I find these files? Do they contain the code for the sum
> function?
> >
> > Yes.
> >
> > https://svn.r-project.org/R/trunk/
> >
> >
> > David
> >
> > >
> > > What do you mean exactly with your point on long doubles? Where can I
> find
> > > documentation on this?
> > >
> > > Cheers, Rampal
> > >
> > > On Mon, Feb 18, 2019, 15:38 Tomas Kalibera  wrote:
> > >
> > >> See do_summary() in summary.c, rsum() for doubles. R uses long double
> > >> type as accumulator on systems where available.
> > >>
> > >> Best,
> > >> Tomas
> > >>
> > >> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> > >>> Hello,
> > >>>
> > >>> I am trying to write FORTRAN code to do the same as some R code I
> > >>> have. I get (small) differences when using the sum function in R. I
> > >>> know there are numerical routines to improve precision, but I have
> not
> > >>> been able to figure out what algorithm R is using. Does anyone know
> > >>> this? Or where can I find the code for the sum function?
> > >>>
> > >>> Regards,
> > >>>
> > >>> Rampal Etienne
> > >>>
> > >>> __
> > >>> R-devel@r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> > >>
> > >>
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-21 Thread Ben Bolker
Specifically: https://svn.r-project.org/R/trunk/src/main/summary.c

And if you don't want to deal with Subversion, you can look at the
read-only github mirror:

https://github.com/wch/r-source/blob/e5b21d0397c607883ff25cca379687b86933d730/src/main/summary.c#L115-L131

On Thu, Feb 21, 2019 at 11:57 AM David Winsemius  wrote:
>
>
> On 2/20/19 2:55 PM, Rampal Etienne wrote:
> > Dear Tomas,
> >
> > Where do I find these files? Do they contain the code for the sum function?
>
> Yes.
>
> https://svn.r-project.org/R/trunk/
>
>
> David
>
> >
> > What do you mean exactly with your point on long doubles? Where can I find
> > documentation on this?
> >
> > Cheers, Rampal
> >
> > On Mon, Feb 18, 2019, 15:38 Tomas Kalibera  >
> >> See do_summary() in summary.c, rsum() for doubles. R uses long double
> >> type as accumulator on systems where available.
> >>
> >> Best,
> >> Tomas
> >>
> >> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> >>> Hello,
> >>>
> >>> I am trying to write FORTRAN code to do the same as some R code I
> >>> have. I get (small) differences when using the sum function in R. I
> >>> know there are numerical routines to improve precision, but I have not
> >>> been able to figure out what algorithm R is using. Does anyone know
> >>> this? Or where can I find the code for the sum function?
> >>>
> >>> Regards,
> >>>
> >>> Rampal Etienne
> >>>
> >>> __
> >>> R-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >>
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-21 Thread David Winsemius



On 2/20/19 2:55 PM, Rampal Etienne wrote:

Dear Tomas,

Where do I find these files? Do they contain the code for the sum function?


Yes.

https://svn.r-project.org/R/trunk/


David



What do you mean exactly with your point on long doubles? Where can I find
documentation on this?

Cheers, Rampal

On Mon, Feb 18, 2019, 15:38 Tomas Kalibera 
See do_summary() in summary.c, rsum() for doubles. R uses long double
type as accumulator on systems where available.

Best,
Tomas

On 2/14/19 2:08 PM, Rampal Etienne wrote:

Hello,

I am trying to write FORTRAN code to do the same as some R code I
have. I get (small) differences when using the sum function in R. I
know there are numerical routines to improve precision, but I have not
been able to figure out what algorithm R is using. Does anyone know
this? Or where can I find the code for the sum function?

Regards,

Rampal Etienne

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-21 Thread Rampal Etienne
Dear Will,

This is exactly what I find.
My point is thus that the sum function in R is not a naive sum nor a
Kahansum (in all cases), but what algorithm is it using then?

Cheers, Rampal


On Tue, Feb 19, 2019, 19:08 William Dunlap  The algorithm does make a differece.  You can use Kahan's summation
> algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to
> reduce the error compared to the naive summation algorithm.  E.g., in R
> code:
>
> naiveSum <-
> function(x) {
>s <- 0.0
>for(xi in x) s <- s + xi
>s
> }
> kahanSum <- function (x)
> {
>s <- 0.0
>c <- 0.0 # running compensation for lost low-order bits
>for(xi in x) {
>   y <- xi - c
>   t <- s + y # low-order bits of y may be lost here
>   c <- (t - s) - y
>   s <- t
>}
>s
> }
>
> > rSum <- vapply(c(1:20,10^(2:7)), function(n) sum(rep(1/7,n)), 0)
> > rNaiveSum <- vapply(c(1:20,10^(2:7)), function(n) naiveSum(rep(1/7,n)),
> 0)
> > rKahanSum <- vapply(c(1:20,10^(2:7)), function(n) kahanSum(rep(1/7,n)),
> 0)
> >
> > table(rSum == rNaiveSum)
>
> FALSE  TRUE
>21 5
> > table(rSum == rKahanSum)
>
> FALSE  TRUE
> 323
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Tue, Feb 19, 2019 at 10:36 AM Paul Gilbert 
> wrote:
>
>> (I didn't see anyone else answer this, so ...)
>>
>> You can probably find the R code in src/main/ but I'm not sure. You are
>> talking about a very simple calculation, so it seems unlike that the
>> algorithm is the cause of the difference. I have done much more
>> complicated things and usually get machine precision comparisons. There
>> are four possibilities I can think of that could cause (small)
>> differences.
>>
>> 0/ Your code is wrong, but that seems unlikely on such a simple
>> calculations.
>>
>> 1/ You are summing a very large number of numbers, in which case the sum
>> can become very large compared to numbers being added, then things can
>> get a bit funny.
>>
>> 2/ You are using single precision in fortran rather than double. Double
>> is needed for all floating point numbers you use!
>>
>> 3/ You have not zeroed the double precision numbers in fortran. (Some
>> compilers do not do this automatically and you have to specify it.) Then
>> if you accidentally put singles, like a constant 0.0 rather than a
>> constant 0.0D+0, into a double you will have small junk in the lower
>> precision part.
>>
>> (I am assuming you are talking about a sum of reals, not integer or
>> complex.)
>>
>> HTH,
>> Paul Gilbert
>>
>> On 2/14/19 2:08 PM, Rampal Etienne wrote:
>> > Hello,
>> >
>> > I am trying to write FORTRAN code to do the same as some R code I have.
>> > I get (small) differences when using the sum function in R. I know
>> there
>> > are numerical routines to improve precision, but I have not been able
>> to
>> > figure out what algorithm R is using. Does anyone know this? Or where
>> > can I find the code for the sum function?
>> >
>> > Regards,
>> >
>> > Rampal Etienne
>> >
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-21 Thread Rampal Etienne
Dear Paul,

Thank you for thinking with me. I will respond to your options:

>
> 0/ Your code is wrong, but that seems unlikely on such a simple
> calculations.
>

It's really just a comparison of the sum function in Fortran with that of
R. If instead I use the naive summation with a for loop in both languages I
get the same answer.

>
> 1/ You are summing a very large number of numbers, in which case the sum
> can become very large compared to numbers being added, then things can
> get a bit funny.
>
Yes, this is what's happening and why I need to know what algorithm R uses
to overcome or reduce these precision issues.

>
> 2/ You are using single precision in fortran rather than double. Double
> is needed for all floating point numbers you use!
>
I use doubles.

>
> 3/ You have not zeroed the double precision numbers in fortran. (Some
> compilers do not do this automatically and you have to specify it.) Then
> if you accidentally put singles, like a constant 0.0 rather than a
> constant 0.0D+0, into a double you will have small junk in the lower
> precision part.
>
It doesn't matter if I double them in this way or not (they are declared as
doubles and I think the compiler treats then as doubles).

So my question remains what algorithm R uses.

Cheers, Rampal

>
> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> > Hello,
> >
> > I am trying to write FORTRAN code to do the same as some R code I have.
> > I get (small) differences when using the sum function in R. I know there
> > are numerical routines to improve precision, but I have not been able to
> > figure out what algorithm R is using. Does anyone know this? Or where
> > can I find the code for the sum function?
> >
> > Regards,
> >
> > Rampal Etienne
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-21 Thread Rampal Etienne
Dear Tomas,

Where do I find these files? Do they contain the code for the sum function?

What do you mean exactly with your point on long doubles? Where can I find
documentation on this?

Cheers, Rampal

On Mon, Feb 18, 2019, 15:38 Tomas Kalibera  See do_summary() in summary.c, rsum() for doubles. R uses long double
> type as accumulator on systems where available.
>
> Best,
> Tomas
>
> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> > Hello,
> >
> > I am trying to write FORTRAN code to do the same as some R code I
> > have. I get (small) differences when using the sum function in R. I
> > know there are numerical routines to improve precision, but I have not
> > been able to figure out what algorithm R is using. Does anyone know
> > this? Or where can I find the code for the sum function?
> >
> > Regards,
> >
> > Rampal Etienne
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-20 Thread Tomas Kalibera
Dear Rampal,

you can download R source code in form of a tarball or from subversion, 
please see
https://cran.r-project.org/doc/manuals/R-admin.html#Obtaining-R
https://cran.r-project.org/doc/manuals/R-admin.html#Using-Subversion-and-rsync

There is also a web access to subversion, so specifically the sum is 
available in
https://svn.r-project.org/R/trunk/src/main/summary.c

The definition of LDOUBLE is here
https://svn.r-project.org/R/trunk/src/include/Defn.h

The index of R manuals is here
https://cran.r-project.org/manuals.html

The online documentation inside R gives for ?sum
"    Loss of accuracy can occur when summing values of different signs:
  this can even occur for sufficiently long integer inputs if the
  partial sums would cause integer overflow.  Where possible
  extended-precision accumulators are used, typically well supported
  with C99 and newer, but possibly platform-dependent.
"

Best,
Tomas


On 2/20/19 11:55 PM, Rampal Etienne wrote:
> Dear Tomas,
>
> Where do I find these files? Do they contain the code for the sum 
> function?
>
> What do you mean exactly with your point on long doubles? Where can I 
> find documentation on this?
>
> Cheers, Rampal
>
> On Mon, Feb 18, 2019, 15:38 Tomas Kalibera   wrote:
>
> See do_summary() in summary.c, rsum() for doubles. R uses long double
> type as accumulator on systems where available.
>
> Best,
> Tomas
>
> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> > Hello,
> >
> > I am trying to write FORTRAN code to do the same as some R code I
> > have. I get (small) differences when using the sum function in R. I
> > know there are numerical routines to improve precision, but I
> have not
> > been able to figure out what algorithm R is using. Does anyone know
> > this? Or where can I find the code for the sum function?
> >
> > Regards,
> >
> > Rampal Etienne
> >
> > __
> > R-devel@r-project.org  mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-20 Thread William Dunlap via R-devel
Someone said it used a possibly platform-dependent
higher-than-double-precision type.

By the way, in my example involving rep(1/3, n) I neglected to include the
most precise
way to calculate the sum: n%/%3 + (n%%3)/3.

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Feb 20, 2019 at 2:45 PM Rampal Etienne 
wrote:

> Dear Will,
>
> This is exactly what I find.
> My point is thus that the sum function in R is not a naive sum nor a
> Kahansum (in all cases), but what algorithm is it using then?
>
> Cheers, Rampal
>
>
> On Tue, Feb 19, 2019, 19:08 William Dunlap 
>> The algorithm does make a differece.  You can use Kahan's summation
>> algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to
>> reduce the error compared to the naive summation algorithm.  E.g., in R
>> code:
>>
>> naiveSum <-
>> function(x) {
>>s <- 0.0
>>for(xi in x) s <- s + xi
>>s
>> }
>> kahanSum <- function (x)
>> {
>>s <- 0.0
>>c <- 0.0 # running compensation for lost low-order bits
>>for(xi in x) {
>>   y <- xi - c
>>   t <- s + y # low-order bits of y may be lost here
>>   c <- (t - s) - y
>>   s <- t
>>}
>>s
>> }
>>
>> > rSum <- vapply(c(1:20,10^(2:7)), function(n) sum(rep(1/7,n)), 0)
>> > rNaiveSum <- vapply(c(1:20,10^(2:7)), function(n) naiveSum(rep(1/7,n)),
>> 0)
>> > rKahanSum <- vapply(c(1:20,10^(2:7)), function(n) kahanSum(rep(1/7,n)),
>> 0)
>> >
>> > table(rSum == rNaiveSum)
>>
>> FALSE  TRUE
>>21 5
>> > table(rSum == rKahanSum)
>>
>> FALSE  TRUE
>> 323
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Tue, Feb 19, 2019 at 10:36 AM Paul Gilbert 
>> wrote:
>>
>>> (I didn't see anyone else answer this, so ...)
>>>
>>> You can probably find the R code in src/main/ but I'm not sure. You are
>>> talking about a very simple calculation, so it seems unlike that the
>>> algorithm is the cause of the difference. I have done much more
>>> complicated things and usually get machine precision comparisons. There
>>> are four possibilities I can think of that could cause (small)
>>> differences.
>>>
>>> 0/ Your code is wrong, but that seems unlikely on such a simple
>>> calculations.
>>>
>>> 1/ You are summing a very large number of numbers, in which case the sum
>>> can become very large compared to numbers being added, then things can
>>> get a bit funny.
>>>
>>> 2/ You are using single precision in fortran rather than double. Double
>>> is needed for all floating point numbers you use!
>>>
>>> 3/ You have not zeroed the double precision numbers in fortran. (Some
>>> compilers do not do this automatically and you have to specify it.) Then
>>> if you accidentally put singles, like a constant 0.0 rather than a
>>> constant 0.0D+0, into a double you will have small junk in the lower
>>> precision part.
>>>
>>> (I am assuming you are talking about a sum of reals, not integer or
>>> complex.)
>>>
>>> HTH,
>>> Paul Gilbert
>>>
>>> On 2/14/19 2:08 PM, Rampal Etienne wrote:
>>> > Hello,
>>> >
>>> > I am trying to write FORTRAN code to do the same as some R code I
>>> have.
>>> > I get (small) differences when using the sum function in R. I know
>>> there
>>> > are numerical routines to improve precision, but I have not been able
>>> to
>>> > figure out what algorithm R is using. Does anyone know this? Or where
>>> > can I find the code for the sum function?
>>> >
>>> > Regards,
>>> >
>>> > Rampal Etienne
>>> >
>>> > __
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-20 Thread Berend Hasselman
> 
> On 2019-02-19 2:08 p.m., William Dunlap via R-devel wrote:
>> The algorithm does make a differece.  You can use Kahan's summation
>> algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to
>> reduce the error compared to the naive summation algorithm.  E.g., in R
>> code:
>> 
>> naiveSum <-
>> function(x) {
>>   s <- 0.0
>>   for(xi in x) s <- s + xi
>>   s
>> }
>> kahanSum <- function (x)
>> {
>>   s <- 0.0
>>   c <- 0.0 # running compensation for lost low-order bits
>>   for(xi in x) {
>>  y <- xi - c
>>  t <- s + y # low-order bits of y may be lost here
>>  c <- (t - s) - y
>>  s <- t
>>   }
>>   s
>> }
>> 
>>> rSum <- vapply(c(1:20,10^(2:7)), function(n) sum(rep(1/7,n)), 0)
>>> rNaiveSum <- vapply(c(1:20,10^(2:7)), function(n) naiveSum(rep(1/7,n)), 0)
>>> rKahanSum <- vapply(c(1:20,10^(2:7)), function(n) kahanSum(rep(1/7,n)), 0)
>>> 
>>> table(rSum == rNaiveSum)
>> 
>> FALSE  TRUE
>>   21 5
>>> table(rSum == rKahanSum)
>> 
>> FALSE  TRUE
>>323


If you use the vector  c(1,10^100,1,-10^100) as input then
sum, naiveSum or kahanSum will all give an incorrect answer.
All return 0 instead of 2.

>From the wikipedia page we can try the pseudocode given of the modification by 
>Neumaier.
My R version (with a small correction to avoid cancellation?) is

neumaierSum <- function (x)
{
  s <- 0.0
  z <- 0.0 # running compensation for lost low-order bits
  for(xi in x) {
 t <- s + xi
 if( abs(s) >= abs(xi) ){
 b <- (s-t)+xi #  intermediate step needed  in R otherwise cancellation
 z <- z+b  # If sum is bigger, low-order digits of xi are lost.
 } else {
 b <- (xi-t)+s #  intermediate step needed in R otherwise cancellation
 z <- z+b  # else low-order digits of sum are lost
 }
 s <- t
  }
  s+z   # correction only applied once in the very end
}

testx <-  c(1,10^100,1,-10^100)
neumaierSum(testx)

gives 2 as answer.

Berend Hasselman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-19 Thread Ben Bolker


This SO question may be of interest:

https://stackoverflow.com/questions/38589705/difference-between-rs-sum-and-armadillos-accu/

  which points out that sum() isn't doing anything fancy *except* using
extended-precision registers when available.  (Using Kahan's algorithm
does come at a computational cost ...)

On 2019-02-19 2:08 p.m., William Dunlap via R-devel wrote:
> The algorithm does make a differece.  You can use Kahan's summation
> algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to
> reduce the error compared to the naive summation algorithm.  E.g., in R
> code:
> 
> naiveSum <-
> function(x) {
>s <- 0.0
>for(xi in x) s <- s + xi
>s
> }
> kahanSum <- function (x)
> {
>s <- 0.0
>c <- 0.0 # running compensation for lost low-order bits
>for(xi in x) {
>   y <- xi - c
>   t <- s + y # low-order bits of y may be lost here
>   c <- (t - s) - y
>   s <- t
>}
>s
> }
> 
>> rSum <- vapply(c(1:20,10^(2:7)), function(n) sum(rep(1/7,n)), 0)
>> rNaiveSum <- vapply(c(1:20,10^(2:7)), function(n) naiveSum(rep(1/7,n)), 0)
>> rKahanSum <- vapply(c(1:20,10^(2:7)), function(n) kahanSum(rep(1/7,n)), 0)
>>
>> table(rSum == rNaiveSum)
> 
> FALSE  TRUE
>21 5
>> table(rSum == rKahanSum)
> 
> FALSE  TRUE
> 323
> 
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> 
> On Tue, Feb 19, 2019 at 10:36 AM Paul Gilbert  wrote:
> 
>> (I didn't see anyone else answer this, so ...)
>>
>> You can probably find the R code in src/main/ but I'm not sure. You are
>> talking about a very simple calculation, so it seems unlike that the
>> algorithm is the cause of the difference. I have done much more
>> complicated things and usually get machine precision comparisons. There
>> are four possibilities I can think of that could cause (small) differences.
>>
>> 0/ Your code is wrong, but that seems unlikely on such a simple
>> calculations.
>>
>> 1/ You are summing a very large number of numbers, in which case the sum
>> can become very large compared to numbers being added, then things can
>> get a bit funny.
>>
>> 2/ You are using single precision in fortran rather than double. Double
>> is needed for all floating point numbers you use!
>>
>> 3/ You have not zeroed the double precision numbers in fortran. (Some
>> compilers do not do this automatically and you have to specify it.) Then
>> if you accidentally put singles, like a constant 0.0 rather than a
>> constant 0.0D+0, into a double you will have small junk in the lower
>> precision part.
>>
>> (I am assuming you are talking about a sum of reals, not integer or
>> complex.)
>>
>> HTH,
>> Paul Gilbert
>>
>> On 2/14/19 2:08 PM, Rampal Etienne wrote:
>>> Hello,
>>>
>>> I am trying to write FORTRAN code to do the same as some R code I have.
>>> I get (small) differences when using the sum function in R. I know there
>>> are numerical routines to improve precision, but I have not been able to
>>> figure out what algorithm R is using. Does anyone know this? Or where
>>> can I find the code for the sum function?
>>>
>>> Regards,
>>>
>>> Rampal Etienne
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-19 Thread William Dunlap via R-devel
The algorithm does make a differece.  You can use Kahan's summation
algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to
reduce the error compared to the naive summation algorithm.  E.g., in R
code:

naiveSum <-
function(x) {
   s <- 0.0
   for(xi in x) s <- s + xi
   s
}
kahanSum <- function (x)
{
   s <- 0.0
   c <- 0.0 # running compensation for lost low-order bits
   for(xi in x) {
  y <- xi - c
  t <- s + y # low-order bits of y may be lost here
  c <- (t - s) - y
  s <- t
   }
   s
}

> rSum <- vapply(c(1:20,10^(2:7)), function(n) sum(rep(1/7,n)), 0)
> rNaiveSum <- vapply(c(1:20,10^(2:7)), function(n) naiveSum(rep(1/7,n)), 0)
> rKahanSum <- vapply(c(1:20,10^(2:7)), function(n) kahanSum(rep(1/7,n)), 0)
>
> table(rSum == rNaiveSum)

FALSE  TRUE
   21 5
> table(rSum == rKahanSum)

FALSE  TRUE
323


Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Feb 19, 2019 at 10:36 AM Paul Gilbert  wrote:

> (I didn't see anyone else answer this, so ...)
>
> You can probably find the R code in src/main/ but I'm not sure. You are
> talking about a very simple calculation, so it seems unlike that the
> algorithm is the cause of the difference. I have done much more
> complicated things and usually get machine precision comparisons. There
> are four possibilities I can think of that could cause (small) differences.
>
> 0/ Your code is wrong, but that seems unlikely on such a simple
> calculations.
>
> 1/ You are summing a very large number of numbers, in which case the sum
> can become very large compared to numbers being added, then things can
> get a bit funny.
>
> 2/ You are using single precision in fortran rather than double. Double
> is needed for all floating point numbers you use!
>
> 3/ You have not zeroed the double precision numbers in fortran. (Some
> compilers do not do this automatically and you have to specify it.) Then
> if you accidentally put singles, like a constant 0.0 rather than a
> constant 0.0D+0, into a double you will have small junk in the lower
> precision part.
>
> (I am assuming you are talking about a sum of reals, not integer or
> complex.)
>
> HTH,
> Paul Gilbert
>
> On 2/14/19 2:08 PM, Rampal Etienne wrote:
> > Hello,
> >
> > I am trying to write FORTRAN code to do the same as some R code I have.
> > I get (small) differences when using the sum function in R. I know there
> > are numerical routines to improve precision, but I have not been able to
> > figure out what algorithm R is using. Does anyone know this? Or where
> > can I find the code for the sum function?
> >
> > Regards,
> >
> > Rampal Etienne
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-19 Thread Paul Gilbert

(I didn't see anyone else answer this, so ...)

You can probably find the R code in src/main/ but I'm not sure. You are 
talking about a very simple calculation, so it seems unlike that the 
algorithm is the cause of the difference. I have done much more 
complicated things and usually get machine precision comparisons. There 
are four possibilities I can think of that could cause (small) differences.


0/ Your code is wrong, but that seems unlikely on such a simple 
calculations.


1/ You are summing a very large number of numbers, in which case the sum 
can become very large compared to numbers being added, then things can 
get a bit funny.


2/ You are using single precision in fortran rather than double. Double 
is needed for all floating point numbers you use!


3/ You have not zeroed the double precision numbers in fortran. (Some 
compilers do not do this automatically and you have to specify it.) Then 
if you accidentally put singles, like a constant 0.0 rather than a 
constant 0.0D+0, into a double you will have small junk in the lower 
precision part.


(I am assuming you are talking about a sum of reals, not integer or 
complex.)


HTH,
Paul Gilbert

On 2/14/19 2:08 PM, Rampal Etienne wrote:

Hello,

I am trying to write FORTRAN code to do the same as some R code I have. 
I get (small) differences when using the sum function in R. I know there 
are numerical routines to improve precision, but I have not been able to 
figure out what algorithm R is using. Does anyone know this? Or where 
can I find the code for the sum function?


Regards,

Rampal Etienne

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] code for sum function

2019-02-18 Thread Tomas Kalibera
See do_summary() in summary.c, rsum() for doubles. R uses long double 
type as accumulator on systems where available.


Best,
Tomas

On 2/14/19 2:08 PM, Rampal Etienne wrote:

Hello,

I am trying to write FORTRAN code to do the same as some R code I 
have. I get (small) differences when using the sum function in R. I 
know there are numerical routines to improve precision, but I have not 
been able to figure out what algorithm R is using. Does anyone know 
this? Or where can I find the code for the sum function?


Regards,

Rampal Etienne

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] code for sum function

2019-02-18 Thread Rampal Etienne

Hello,

I am trying to write FORTRAN code to do the same as some R code I have. 
I get (small) differences when using the sum function in R. I know there 
are numerical routines to improve precision, but I have not been able to 
figure out what algorithm R is using. Does anyone know this? Or where 
can I find the code for the sum function?


Regards,

Rampal Etienne

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel