On Fri, 30 May 2008, Prof Brian Ripley wrote:
> On Fri, 30 May 2008, Duncan Murdoch wrote:
>
> > On 5/30/2008 1:55 PM, Prof Brian Ripley wrote:
> >> Well, R has no unsigned quantities, so ultimately you can't actually do
> >> this. But using what="int" and an appropriate 'size' (likely to be 8)
> >> should read the numbers, wrapping around very large ones to be negative.
> >> (The usual trick of storing integers in numeric will lose accuracy, but
> >> might be better than nothing.)
> >
> > I think reading size 8 integers on 32 bit Windows returns signed 32 bit
> > integers, with values outside that range losing the high order bits, not
> > just accuracy. At least that's what I see when I write the numbers 1:10
> > out as 4 byte integers, and read them as 8 byte integers: I get 1 3 5 7 9.
>
> Yes, that's true for even larger ones.
>
> So to clarify: up to 2^31-1 should work, thereafter you will get the lower
> 32 bits and hence possibly a signed number.

When we wrote a version of readBin() for Splus 8.0 we added an
extra argument, output=, that specifies the type of S object
to put the result into. The what= argument says what sort
of data is in the input file and by default output=what.
output="double" can be useful in this case, as a double can
store a 53 bit signed or unsigned integer without loss of
precision. If the integer is bigger than 2^53-1, the double
stores its most significant 53 bits, which may be better
than truncating the thing.
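
The 53-bit limit can be checked directly in C. The following is only a minimal sketch, assuming IEEE 754 doubles and a compiler that rounds to double on conversion; the function name u64_fits_double_exactly is made up for illustration and is not part of Splus or R.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when x survives a round trip through double unchanged.
   Hypothetical helper, for illustration only. */
bool u64_fits_double_exactly(uint64_t x)
{
    double d = (double)x;

    /* (double)x can round up to 2^64, which is out of range for
       uint64_t; converting that back would be undefined behaviour. */
    if (d >= 18446744073709551616.0)
        return false;
    return (uint64_t)d == x;
}
```

Under those assumptions, (1ULL << 52) + 1 should round-trip exactly, while (1ULL << 54) + 1 should not, because its lowest bit does not fit in the 53-bit significand.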
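E.g., I wrote a C program to write some unsigned long longs to
a file:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        unsigned long long data[7], one = 1ULL;

        data[0] = one;               /* 1 */
        data[1] = (one << 31) - 1;   /* largest signed 32-bit value */
        data[2] = (one << 31) + 1;   /* needs an unsigned 32-bit integer */
        data[3] = (one << 32) - 1;   /* largest unsigned 32-bit value */
        data[4] = (one << 32) + 1;   /* needs more than 32 bits */
        data[5] = (one << 52) + 1;   /* still exact in a double */
        data[6] = (one << 54) + 1;   /* needs more than 53 bits */

        /* Write all seven values to stdout in native byte order. */
        (void)fwrite((void *)data, sizeof(data[0]),
                     sizeof(data) / sizeof(data[0]), stdout);
        return 0;
    }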
od shows what it writes, as unsigned, signed, and hex
8 byte integers:
% ./a.out|od --format u8
0000000 1 2147483647
0000020 2147483649 4294967295
0000040 4294967297 4503599627370497
0000060 18014398509481985
0000070
% ./a.out | od --format d8
0000000 1 2147483647
0000020 2147483649 4294967295
0000040 4294967297 4503599627370497
0000060 18014398509481985
0000070
% ./a.out | od --format x8
0000000 0000000000000001 000000007fffffff
0000020 0000000080000001 00000000ffffffff
0000040 0000000100000001 0010000000000001
0000060 0040000000000001
0000070
and in 32-bit Splus I can read it with:
> z<-readBin(pipe("./a.out", open="br"), what="integer", n=7,
size=8, signed=FALSE, output="double")
> print(z, digits=16)
[1] 1 2147483647 2147483649 4294967295
[5] 4294967297 4503599627370497 18014398509481984
Note that it loses precision in z[7], which is greater than 2^53:
the trailing 1 is rounded away.
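
For comparison, here is a rough C sketch of the same idea: read unsigned 8-byte integers and store each one in a double. It assumes the file is in the machine's native byte order (as with the pipe above) and IEEE 754 doubles. The function name read_u64_as_double is hypothetical and this is not how Splus implements output="double".

```c
#include <stdint.h>
#include <stdio.h>

/* Read up to n unsigned 8-byte integers in native byte order from fp,
   storing each as a double. Returns the number of values read.
   Hypothetical helper, for illustration only. */
size_t read_u64_as_double(FILE *fp, double *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t v;

        if (fread(&v, sizeof v, 1, fp) != 1)
            break;              /* short read or error: stop early */
        out[i] = (double)v;     /* exact up to 2^53, rounded beyond */
    }
    return i;
}
```

Values up to 2^53 come back unchanged; larger ones are rounded to the nearest representable double rather than truncated to their low 32 bits.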
Without output="double", the numbers above 2^32 are truncated to
their low 32 bits, and the signs are wrong on the ones between 2^31
and 2^32:
> readBin(pipe("./a.out", open="br"), what="integer", n=7,
size=8, signed=FALSE)
[1] 1 2147483647 -2147483647 -1 1 1
[7] 1
(That one gives the same result in R and Splus.)
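
The truncated values above can be reproduced by keeping the low 32 bits of each number and reinterpreting them as a two's-complement signed integer. A minimal sketch in C follows; the function name low32_as_signed is made up for illustration, and this is an assumption about the effect, not the actual readBin() source.

```c
#include <stdint.h>

/* Keep only the low 32 bits of x and reinterpret them as a
   two's-complement signed integer.
   Hypothetical helper, for illustration only. */
int32_t low32_as_signed(uint64_t x)
{
    uint32_t low = (uint32_t)(x & 0xFFFFFFFFu);

    /* Avoid the implementation-defined cast of an out-of-range value. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}
```

Applied to the seven values written by the C program, this gives 1, 2147483647, -2147483647, -1, 1, 1, 1, matching the output shown.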
What do folks think about having this option in R?
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."
______________________________________________
R-devel mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel