Re: [R] rounding down with as.integer

2015-01-02 Thread Duncan Murdoch
On 01/01/2015 10:05 PM, Mike Miller wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:
 
 On 01/01/2015 1:21 PM, Mike Miller wrote:

 I understand that it's all about the problem of representing decimal 
 numbers in binary, but I still find some of the results a little 
 surprising, like that list of numbers from the table() output.  For 
 another example:

 1000+3 - 1000*(1+3/1000)
 [1] 1.136868e-13

 3 - 1000*(0+3/1000)
 [1] 0

 2000+3 - 1000*(2+3/1000)
 [1] 0

 See what I mean?  So there is something special about the numbers 
 around 1000.

 I think it's really that there is something special about the numbers 
 near 1, and you're multiplying that by 1000.

 Numbers from 1 to just below 2 are stored as their fractional part, with 
 52 bit precision.  Some intermediate calculations will store them with 
 64 bit precision.  52 bits gives about 15 or 16 decimal places.
 
 
 This is how big those errors are:
 
 512*.Machine$double.eps
 [1] 1.136868e-13
 
 Under other conditions you also were seeing errors of twice that, or 
 1024*.Machine$double.eps.  It might not be a coincidence that the largest 
 number giving me an error was 1023.
 
 2^-43
 [1] 1.136868e-13
 
 .Machine$double.eps
 [1] 2.220446e-16
 
 2^-52
 [1] 2.220446e-16
 
 I guess the 52 comes from the IEEE floating point spec...
 
 http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64
 
 ...but why are we seeing errors so much bigger than the machine precision? 

You are multiplying by 1000.  That magnifies the error.

 Why does it change at 2?

Because (most) floating point numbers are stored as (1 + x) * 2^y, where
x is a number between 0 and 1, and y is an integer between -1022
and 1023. The value of y steps up at 2, and this means errors in x become
twice as big.  (The exceptions are 0, Inf, NaN, etc., as well as
denormals, where the number is stored as x * 2^-1022 with no leading 1.)
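
A quick sanity check of that doubling, sketched as an R session (the
commented values are what IEEE-754 doubles give):

52 * log10(2)                     # 15.65: 52 bits is about 15-16 decimal digits
(1 + .Machine$double.eps) - 1     # 2.220446e-16: spacing of doubles just above 1
(2 + 2*.Machine$double.eps) - 2   # 4.440892e-16: the spacing doubles just above 2
512 * .Machine$double.eps         # 1.136868e-13: spacing in [512, 1024), where 1003 lives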

Duncan Murdoch

 It doesn't really matter to my work, but it is a curious thing, so I would 
 be interested to learn about it.
 
 Mike




Re: [R] rounding down with as.integer

2015-01-02 Thread peter dalgaard

On 02 Jan 2015, at 04:05 , Mike Miller mbmille...@gmail.com wrote:

 
 ...but why are we seeing errors so much bigger than the machine precision? 
 Why does it change at 2?


Because relative errors in the thousandths part are roughly a thousand times 
bigger than relative errors in the number itself? Put differently: the 1.136868e-13 
error is an absolute error on 1003, not on 3.
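
As a sketch in session terms (err is just an illustrative name): the
absolute error sits on 1003, and dividing it back out lands at the scale
of .Machine$double.eps again:

err <- (1000 + 3) - 1000*(1 + 3/1000)
err         # 1.136868e-13: absolute error on 1003
err / 1003  # about 1.1e-16: relative error, back at the eps scale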

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] rounding down with as.integer

2015-01-02 Thread Mike Miller

On Fri, 2 Jan 2015, Duncan Murdoch wrote:


On 01/01/2015 10:05 PM, Mike Miller wrote:



This is how big those errors are:


512*.Machine$double.eps

[1] 1.136868e-13

Under other conditions you also were seeing errors of twice that, or 
1024*.Machine$double.eps.  It might not be a coincidence that the 
largest number giving me an error was 1023.



2^-43

[1] 1.136868e-13


.Machine$double.eps

[1] 2.220446e-16


2^-52

[1] 2.220446e-16

I guess the 52 comes from the IEEE floating point spec...

http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64

...but why are we seeing errors so much bigger than the machine precision?


You are multiplying by 1000.  That magnifies the error.


Why does it change at 2?


 Because (most) floating point numbers are stored as (1 + x) * 2^y, where 
 x is a number between 0 and 1, and y is an integer between -1022 
 and 1023.  The value of y steps up at 2, and this means errors in x become 
 twice as big.  (The exceptions are 0, Inf, NaN, etc., as well as 
 denormals, where the number is stored as x * 2^-1022 with no leading 1.)



That is a great explanation.  Thanks very much!

Mike



Re: [R] rounding down with as.integer

2015-01-01 Thread Duncan Murdoch
On 01/01/2015 2:43 PM, Mike Miller wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:
 
 On 01/01/2015 1:21 PM, Mike Miller wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:

 On 31/12/2014 8:44 PM, David Winsemius wrote:

 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but 
 I just ran across this in something I was working on:

 as.integer(1000*1.003)
 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the 
 machine precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )
 [1] 1003

 ...but this seems to always give the same answer and it is a little 
 faster in my application:

 as.integer( 1000*1.003 + .1 )
 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no 
 more than three digits to the right of the decimal.  I'm converting them 
 to integers and saving them in binary format.


 So just add 0.0001 or even .001 to all of them and coerce to integer.

 I don't think the original problem was stated clearly, so I'm not sure
 whether this is a solution, but it looks wrong to me.  If you want to
 round to the nearest integer, why not use round() (without the
 as.integer afterwards)?  Or if you really do want an integer, why add
 0.1 or 0.0001, why not add 0.5 before calling as.integer()?  This is the
 classical way to implement round().

 To state the problem clearly, I'd like to know what result is expected
 for any real number x.  Since R's numeric type only approximates the
 real numbers we might not be able to get a perfect match, but at least
 we could quantify how close we get.  Or is the input really character
 data?  The original post mentioned reading numbers from a text file.


 Maybe you'd like to know what I'm really doing.  I have 1600 text files
 each with up to 16,000 lines with 3100 numbers per line, delimited by a
 single space.  The numbers are between 0 and 2, inclusive, and they have
 up to three digits to the right of the decimal.  Every possible value in
 that range will occur in the data.  Some example numbers: 0 1 2 0.325
 1.12 1.9.  I want to multiply by 1000 and store them as 16-bit integers
 (uint16).

 I've been reading in the data like so:

 data <- scan( file=FILE, what=double(), nmax=3100*16000)

 At first I tried making the integers like so:

 ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm
 user  system elapsed
0.187   0.387   0.574

 I decided I should compare with the result I got using round():

 ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ; 
 proc.time()-ptm
 user  system elapsed
1.595   0.757   2.352

 It is a curious fact that only a few of the values from 0 to 2000 disagree
 between the two methods:

 table( ints2[ ints2 != ints ] )

   1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
 35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

 I understand that it's all about the problem of representing decimal
 numbers in binary, but I still find some of the results a little
 surprising, like that list of numbers from the table() output.  For
 another example:

 1000+3 - 1000*(1+3/1000)
 [1] 1.136868e-13

 3 - 1000*(0+3/1000)
 [1] 0

 2000+3 - 1000*(2+3/1000)
 [1] 0

 See what I mean?  So there is something special about the numbers around
 1000.

 I think it's really that there is something special about the numbers
 near 1, and you're multiplying that by 1000.

 Numbers from 1 to just below 2 are stored as their fractional part, with
 52 bit precision.  Some intermediate calculations will store them with
 64 bit precision.  52 bits gives about 15 or 16 decimal places.

 If your number x is close to 3/1000, it is stored as the fractional part
 of 2^9 * x.  This gives it an extra 2 or 3 decimal digits of precision,
 so that's why these values are accurate.
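
 A quick check of that scaling (a sketch using base R's log2):

 log2(3/1000)                # -8.38: so 3/1000 is stored with binary exponent -9
 2^-9 * .Machine$double.eps  # about 4.3e-19: the spacing of doubles near 0.003
 # even scaled up by 1000 the error stays near 2e-16, which is why
 # 1000*(0+3/1000) lands exactly on 3, matching the 0 seen above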

 If your number x is close to 2.003, it is stored as the fractional part
 of x/2, i.e. with errors like 1.0015 would have.  So I would have
 guessed that 2.006 would have the same problems as 1.003, but I thought
 you didn't see that.  So I tried it myself, and I do see that:

 1000+3 - 1000*(1+3/1000)
 [1] 1.136868e-13
 2000+6 - 1000*(2+6/1000)
 [1] 2.273737e-13

 Reading more closely, I see that you didn't test this particular case,
 so there's no contradiction here.

 The one thing I couldn't think of an explanation for is why other
 numbers between 1 and 2 don't have the same sorts of problems.  So I
 tried the following:

 # Set data to 1.000 thru 1.999
 data <- 1 + 0:999/1000

 # Find the errors
 errors <- 1000 + 0:999 - 1000*data

 # Plot them
 plot(data, errors)

 The plot doesn't show a uniform distribution, but much more uniform than 
 yours:  so I think your data doesn't really cover all possible values 
 from 0.000 to 1.999.  (I get a similar plot if I look at cases where 
 ints != ints2.)

Re: [R] rounding down with as.integer

2015-01-01 Thread Mike Miller

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 01/01/2015 1:21 PM, Mike Miller wrote:

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 31/12/2014 8:44 PM, David Winsemius wrote:


On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:


This is probably a FAQ, and I don't really have a question about it, but I just 
ran across this in something I was working on:


as.integer(1000*1.003)

[1] 1002

I didn't expect it, but maybe I should have.  I guess it's about the machine 
precision added to the fact that as.integer always rounds down:



as.integer(1000*1.003 + 255 * .Machine$double.eps)

[1] 1002


as.integer(1000*1.003 + 256 * .Machine$double.eps)

[1] 1003


This does it right...


as.integer( round( 1000*1.003 ) )

[1] 1003

...but this seems to always give the same answer and it is a little faster in 
my application:


as.integer( 1000*1.003 + .1 )

[1] 1003


FYI - I'm reading in a long vector of numbers from a text file with no more 
than three digits to the right of the decimal.  I'm converting them to integers 
and saving them in binary format.



So just add 0.0001 or even .001 to all of them and coerce to integer.


I don't think the original problem was stated clearly, so I'm not sure
whether this is a solution, but it looks wrong to me.  If you want to
round to the nearest integer, why not use round() (without the
as.integer afterwards)?  Or if you really do want an integer, why add
0.1 or 0.0001, why not add 0.5 before calling as.integer()?  This is the
classical way to implement round().

To state the problem clearly, I'd like to know what result is expected
for any real number x.  Since R's numeric type only approximates the
real numbers we might not be able to get a perfect match, but at least
we could quantify how close we get.  Or is the input really character
data?  The original post mentioned reading numbers from a text file.



Maybe you'd like to know what I'm really doing.  I have 1600 text files
each with up to 16,000 lines with 3100 numbers per line, delimited by a
single space.  The numbers are between 0 and 2, inclusive, and they have
up to three digits to the right of the decimal.  Every possible value in
 that range will occur in the data.  Some example numbers: 0 1 2 0.325
1.12 1.9.  I want to multiply by 1000 and store them as 16-bit integers
(uint16).

I've been reading in the data like so:


data <- scan( file=FILE, what=double(), nmax=3100*16000)


At first I tried making the integers like so:


ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm

user  system elapsed
   0.187   0.387   0.574

I decided I should compare with the result I got using round():


ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ; 
proc.time()-ptm

user  system elapsed
   1.595   0.757   2.352

It is a curious fact that only a few of the values from 0 to 2000 disagree
between the two methods:


table( ints2[ ints2 != ints ] )


  1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

I understand that it's all about the problem of representing decimal
numbers in binary, but I still find some of the results a little
surprising, like that list of numbers from the table() output.  For
another example:


1000+3 - 1000*(1+3/1000)

[1] 1.136868e-13


3 - 1000*(0+3/1000)

[1] 0


2000+3 - 1000*(2+3/1000)

[1] 0

See what I mean?  So there is something special about the numbers around
1000.


I think it's really that there is something special about the numbers
near 1, and you're multiplying that by 1000.

Numbers from 1 to just below 2 are stored as their fractional part, with
52 bit precision.  Some intermediate calculations will store them with
64 bit precision.  52 bits gives about 15 or 16 decimal places.

If your number x is close to 3/1000, it is stored as the fractional part
of 2^9 * x.  This gives it an extra 2 or 3 decimal digits of precision,
so that's why these values are accurate.

If your number x is close to 2.003, it is stored as the fractional part
of x/2, i.e. with errors like 1.0015 would have.  So I would have
guessed that 2.006 would have the same problems as 1.003, but I thought
you didn't see that.  So I tried it myself, and I do see that:


1000+3 - 1000*(1+3/1000)

[1] 1.136868e-13

2000+6 - 1000*(2+6/1000)

[1] 2.273737e-13

Reading more closely, I see that you didn't test this particular case,
so there's no contradiction here.

The one thing I couldn't think of an explanation for is why other
numbers between 1 and 2 don't have the same sorts of problems.  So I
tried the following:

# Set data to 1.000 thru 1.999
data <- 1 + 0:999/1000

# Find the errors
errors <- 1000 + 0:999 - 1000*data

# Plot them
plot(data, errors)

The plot doesn't show a uniform distribution, but much more uniform than 
yours:  so I think your data doesn't really cover all possible values 
from 0.000 to 1.999.  (I get a similar plot if I look at cases where 
ints != ints2.)

Re: [R] rounding down with as.integer

2015-01-01 Thread Richard M. Heiberger
Interesting.  Following someone on this list today, the goal is to input
the data correctly.  My inclination would be to read the file as text,
pad each number to the right, drop the decimal point, and then read it
as an integer.
0 1 2 0.325 1.12 1.9
0.000 1.000 2.000 0.325 1.120 1.900
0000 1000 2000 0325 1120 1900

The pad step is the interesting step.

## 0 1 2 0.325 1.12 1.9
## 0.000 1.000 2.000 0.325 1.120 1.900
## 0000 1000 2000 0325 1120 1900

x.in <- scan(text=
"0 1 2 0.325 1.12 1.9 1.003"
, what="")

padding <- c(".000", "000", "00", "0", "")

x.pad <- paste(x.in, padding[nchar(x.in)], sep="")

x.nodot <- sub(".", "", x.pad, fixed=TRUE)

x <- as.integer(x.nodot)


Rich
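
If one trusts sprintf() to re-print each value with exactly three
decimals, the padding table can be skipped.  A sketch (x.alt is just an
illustrative name; it assumes, as above, at most three digits after the
decimal point):

x.alt <- as.integer(sub(".", "", sprintf("%.3f", as.numeric(x.in)), fixed=TRUE))
identical(x.alt, x)
## [1] TRUE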


On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:

 On 31/12/2014 8:44 PM, David Winsemius wrote:


 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but
 I just ran across this in something I was working on:

 as.integer(1000*1.003)

 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the
 machine precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)

 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)

 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )

 [1] 1003

 ...but this seems to always give the same answer and it is a little
 faster in my application:

 as.integer( 1000*1.003 + .1 )

 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no
 more than three digits to the right of the decimal.  I'm converting them to
 integers and saving them in binary format.


 So just add 0.0001 or even .001 to all of them and coerce to integer.


 I don't think the original problem was stated clearly, so I'm not sure
 whether this is a solution, but it looks wrong to me.  If you want to round
 to the nearest integer, why not use round() (without the as.integer
 afterwards)?  Or if you really do want an integer, why add 0.1 or 0.0001,
 why not add 0.5 before calling as.integer()?  This is the classical way to
 implement round().

 To state the problem clearly, I'd like to know what result is expected for
 any real number x.  Since R's numeric type only approximates the real
 numbers we might not be able to get a perfect match, but at least we could
 quantify how close we get.  Or is the input really character data?  The
 original post mentioned reading numbers from a text file.



 Maybe you'd like to know what I'm really doing.  I have 1600 text files each
 with up to 16,000 lines with 3100 numbers per line, delimited by a single
 space.  The numbers are between 0 and 2, inclusive, and they have up to
 three digits to the right of the decimal.  Every possible value in that
 range will occur in the data.  Some example numbers: 0 1 2 0.325 1.12 1.9.
 I want to multiply by 1000 and store them as 16-bit integers (uint16).

 I've been reading in the data like so:

 data <- scan( file=FILE, what=double(), nmax=3100*16000)


 At first I tried making the integers like so:

 ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm

user  system elapsed
   0.187   0.387   0.574

 I decided I should compare with the result I got using round():

 ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ;
 proc.time()-ptm

user  system elapsed
   1.595   0.757   2.352

 It is a curious fact that only a few of the values from 0 to 2000 disagree
 between the two methods:

 table( ints2[ ints2 != ints ] )


  1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
 35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

 I understand that it's all about the problem of representing decimal numbers
 in binary, but I still find some of the results a little surprising, like
 that list of numbers from the table() output.  For another example:

 1000+3 - 1000*(1+3/1000)

 [1] 1.136868e-13

 3 - 1000*(0+3/1000)

 [1] 0

 2000+3 - 1000*(2+3/1000)

 [1] 0

 See what I mean?  So there is something special about the numbers around
 1000.

 Back to the question at hand:  I can avoid use of round() and speed things up
 a little bit by just adding a small number after multiplying by 1000:

 ptm <- proc.time() ; R3 <- as.integer( 1000 * data + .1 ) ;
 proc.time()-ptm

user  system elapsed
   0.224   0.594   0.818

 You point out that adding .5 makes sense.  That is probably a better idea
 and I should take that approach under most conditions, but in this case we
 can add anything between 2e-13 and about 0.999 and always get the
 same answer.  We also have to remember that if a number might be negative
 (not a problem for me in this application), we need to subtract 0.5 instead
 of adding it.

 Anyway, right now this is what I'm actually doing:

 con <- file( paste0(FILE, ".uint16"), "wb" )
 ptm <- proc.time() ; writeBin( as.integer( 1000 * scan( file=FILE, 
 what=double(), nmax=3100*16000 ) + .1 ), con, size=2 ) ; proc.time()-ptm

Re: [R] rounding down with as.integer

2015-01-01 Thread Mike Miller

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 31/12/2014 8:44 PM, David Winsemius wrote:


On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:


This is probably a FAQ, and I don't really have a question about it, but I just 
ran across this in something I was working on:


as.integer(1000*1.003)

[1] 1002

I didn't expect it, but maybe I should have.  I guess it's about the machine 
precision added to the fact that as.integer always rounds down:



as.integer(1000*1.003 + 255 * .Machine$double.eps)

[1] 1002


as.integer(1000*1.003 + 256 * .Machine$double.eps)

[1] 1003


This does it right...


as.integer( round( 1000*1.003 ) )

[1] 1003

...but this seems to always give the same answer and it is a little faster in 
my application:


as.integer( 1000*1.003 + .1 )

[1] 1003


FYI - I'm reading in a long vector of numbers from a text file with no more 
than three digits to the right of the decimal.  I'm converting them to integers 
and saving them in binary format.



So just add 0.0001 or even .001 to all of them and coerce to integer.


I don't think the original problem was stated clearly, so I'm not sure 
whether this is a solution, but it looks wrong to me.  If you want to 
round to the nearest integer, why not use round() (without the 
as.integer afterwards)?  Or if you really do want an integer, why add 
0.1 or 0.0001, why not add 0.5 before calling as.integer()?  This is the 
classical way to implement round().


To state the problem clearly, I'd like to know what result is expected 
for any real number x.  Since R's numeric type only approximates the 
real numbers we might not be able to get a perfect match, but at least 
we could quantify how close we get.  Or is the input really character 
data?  The original post mentioned reading numbers from a text file.



Maybe you'd like to know what I'm really doing.  I have 1600 text files 
each with up to 16,000 lines with 3100 numbers per line, delimited by a 
single space.  The numbers are between 0 and 2, inclusive, and they have 
up to three digits to the right of the decimal.  Every possible value in 
that range will occur in the data.  Some example numbers: 0 1 2 0.325 
1.12 1.9.  I want to multiply by 1000 and store them as 16-bit integers 
(uint16).


I've been reading in the data like so:


data <- scan( file=FILE, what=double(), nmax=3100*16000)


At first I tried making the integers like so:


ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm

   user  system elapsed
  0.187   0.387   0.574

I decided I should compare with the result I got using round():


ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ; 
proc.time()-ptm

   user  system elapsed
  1.595   0.757   2.352

It is a curious fact that only a few of the values from 0 to 2000 disagree 
between the two methods:



table( ints2[ ints2 != ints ] )


 1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

I understand that it's all about the problem of representing decimal 
numbers in binary, but I still find some of the results a little 
surprising, like that list of numbers from the table() output.  For 
another example:



1000+3 - 1000*(1+3/1000)

[1] 1.136868e-13


3 - 1000*(0+3/1000)

[1] 0


2000+3 - 1000*(2+3/1000)

[1] 0

See what I mean?  So there is something special about the numbers around 
1000.


Back to the question at hand:  I can avoid use of round() and speed things 
up a little bit by just adding a small number after multiplying by 1000:



ptm <- proc.time() ; R3 <- as.integer( 1000 * data + .1 ) ; proc.time()-ptm

   user  system elapsed
  0.224   0.594   0.818

You point out that adding .5 makes sense.  That is probably a better idea 
and I should take that approach under most conditions, but in this case we 
can add anything between 2e-13 and about 0.999 and always get the 
same answer.  We also have to remember that if a number might be negative 
(not a problem for me in this application), we need to subtract 0.5 
instead of adding it.
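
One way to fold the sign in is a sketch like this (it assumes exact .5
ties never occur, which holds here because the true values are multiples
of 0.001):

x <- c(-1.003, 0, 1.003)
as.integer( 1000 * x + sign(x) * 0.5 )
[1] -1003     0  1003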


Anyway, right now this is what I'm actually doing:


con <- file( paste0(FILE, ".uint16"), "wb" )
ptm <- proc.time() ; writeBin( as.integer( 1000 * scan( file=FILE, 
what=double(), nmax=3100*16000 ) + .1 ), con, size=2 ) ; proc.time()-ptm

Read 48013406 items
   user  system elapsed
 10.263   0.733  10.991

close(con)
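
Reading the file back later would be the mirror image; a sketch, with the
count and file name assumed as above, the same byte order, and ints.back
as an illustrative name:

con <- file( paste0(FILE, ".uint16"), "rb" )
ints.back <- readBin( con, what="integer", n=48013406, size=2, signed=FALSE )
close(con)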


By the way, writeBin() is something that I learned about here, from you, 
Duncan.  Thanks for that, too.


Mike

--
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4J



Re: [R] rounding down with as.integer

2015-01-01 Thread Duncan Murdoch
On 31/12/2014 8:44 PM, David Winsemius wrote:
 
 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:
 
 This is probably a FAQ, and I don't really have a question about it, but I 
 just ran across this in something I was working on:

 as.integer(1000*1.003)
 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the machine 
 precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )
 [1] 1003

 ...but this seems to always give the same answer and it is a little faster 
 in my application:

 as.integer( 1000*1.003 + .1 )
 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no more 
 than three digits to the right of the decimal.  I'm converting them to 
 integers and saving them in binary format.

 
 So just add 0.0001 or even .001 to all of them and coerce to integer.

I don't think the original problem was stated clearly, so I'm not sure
whether this is a solution, but it looks wrong to me.  If you want to
round to the nearest integer, why not use round() (without the
as.integer afterwards)?  Or if you really do want an integer, why add
0.1 or 0.0001, why not add 0.5 before calling as.integer()?  This is the
classical way to implement round().

To state the problem clearly, I'd like to know what result is expected
for any real number x.  Since R's numeric type only approximates the
real numbers we might not be able to get a perfect match, but at least
we could quantify how close we get.  Or is the input really character
data?  The original post mentioned reading numbers from a text file.

Duncan Murdoch



Re: [R] rounding down with as.integer

2015-01-01 Thread Duncan Murdoch
On 01/01/2015 1:21 PM, Mike Miller wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:
 
 On 31/12/2014 8:44 PM, David Winsemius wrote:

 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but I 
 just ran across this in something I was working on:

 as.integer(1000*1.003)
 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the 
 machine precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )
 [1] 1003

 ...but this seems to always give the same answer and it is a little faster 
 in my application:

 as.integer( 1000*1.003 + .1 )
 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no 
 more than three digits to the right of the decimal.  I'm converting them 
 to integers and saving them in binary format.


 So just add 0.0001 or even .001 to all of them and coerce to integer.

 I don't think the original problem was stated clearly, so I'm not sure 
 whether this is a solution, but it looks wrong to me.  If you want to 
 round to the nearest integer, why not use round() (without the 
 as.integer afterwards)?  Or if you really do want an integer, why add 
 0.1 or 0.0001, why not add 0.5 before calling as.integer()?  This is the 
 classical way to implement round().

 To state the problem clearly, I'd like to know what result is expected 
 for any real number x.  Since R's numeric type only approximates the 
 real numbers we might not be able to get a perfect match, but at least 
 we could quantify how close we get.  Or is the input really character 
 data?  The original post mentioned reading numbers from a text file.
 
 
 Maybe you'd like to know what I'm really doing.  I have 1600 text files 
 each with up to 16,000 lines with 3100 numbers per line, delimited by a 
 single space.  The numbers are between 0 and 2, inclusive, and they have 
 up to three digits to the right of the decimal.  Every possible value in 
 that range will occur in the data.  Some example numbers: 0 1 2 0.325 
 1.12 1.9.  I want to multiply by 1000 and store them as 16-bit integers 
 (uint16).
 
 I've been reading in the data like so:
 
 data <- scan( file=FILE, what=double(), nmax=3100*16000)
 
 At first I tried making the integers like so:
 
 ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm
 user  system elapsed
0.187   0.387   0.574
 
 I decided I should compare with the result I got using round():
 
 ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ; 
 proc.time()-ptm
 user  system elapsed
1.595   0.757   2.352
 
 It is a curious fact that only a few of the values from 0 to 2000 disagree 
 between the two methods:
 
 table( ints2[ ints2 != ints ] )
 
   1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
 35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112
 
 I understand that it's all about the problem of representing decimal 
 numbers in binary, but I still find some of the results a little 
 surprising, like that list of numbers from the table() output.  For 
 another example:
 
 1000+3 - 1000*(1+3/1000)
 [1] 1.136868e-13
 
 3 - 1000*(0+3/1000)
 [1] 0
 
 2000+3 - 1000*(2+3/1000)
 [1] 0
 
 See what I mean?  So there is something special about the numbers around 
 1000.

I think it's really that there is something special about the numbers
near 1, and you're multiplying that by 1000.

Numbers from 1 to just below 2 are stored as their fractional part, with
52 bit precision.  Some intermediate calculations will store them with
64 bit precision.  52 bits gives about 15 or 16 decimal places.

If your number x is close to 3/1000, it is stored as the fractional part
of 2^9 * x.  This gives it an extra 2 or 3 decimal digits of precision,
so that's why these values are accurate.

If your number x is close to 2.003, it is stored as the fractional part
of x/2, i.e. with errors like 1.0015 would have.  So I would have
guessed that 2.006 would have the same problems as 1.003, but I thought
you didn't see that.  So I tried it myself, and I do see that:

 1000+3 - 1000*(1+3/1000)
[1] 1.136868e-13
 2000+6 - 1000*(2+6/1000)
[1] 2.273737e-13

Reading more closely, I see that you didn't test this particular case,
so there's no contradiction here.

The one thing I couldn't think of an explanation for is why other
numbers between 1 and 2 don't have the same sorts of problems.  So I
tried the following:

# Set data to 1.000 thru 1.999
data <- 1 + 0:999/1000

# Find the errors
errors <- 1000 + 0:999 - 1000*data

# Plot them
plot(data, errors)

The plot doesn't show a uniform distribution, but much more uniform than
yours:  so I think your data doesn't really cover all possible values
from 0.000 to 1.999.  (I get a similar plot if I look at cases where
ints != ints2.)

Re: [R] rounding down with as.integer

2015-01-01 Thread Mike Miller

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 01/01/2015 1:21 PM, Mike Miller wrote:

I understand that it's all about the problem of representing decimal 
numbers in binary, but I still find some of the results a little 
surprising, like that list of numbers from the table() output.  For 
another example:



1000+3 - 1000*(1+3/1000)

[1] 1.136868e-13


3 - 1000*(0+3/1000)

[1] 0


2000+3 - 1000*(2+3/1000)

[1] 0

See what I mean?  So there is something special about the numbers 
around 1000.


I think it's really that there is something special about the numbers 
near 1, and you're multiplying that by 1000.


Numbers from 1 to just below 2 are stored as their fractional part, with 
52 bit precision.  Some intermediate calculations will store them with 
64 bit precision.  52 bits gives about 15 or 16 decimal places.



This is how big those errors are:


512*.Machine$double.eps

[1] 1.136868e-13

Under other conditions you also were seeing errors of twice that, or 
1024*.Machine$double.eps.  It might not be a coincidence that the largest 
number giving me an error was 1023.



2^-43

[1] 1.136868e-13


.Machine$double.eps

[1] 2.220446e-16


2^-52

[1] 2.220446e-16

I guess the 52 comes from the IEEE floating point spec...

http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64

...but why are we seeing errors so much bigger than the machine precision? 
Why does it change at 2?


It doesn't really matter to my work, but it is a curious thing, so I would 
be interested to learn about it.


Mike



Re: [R] rounding down with as.integer

2015-01-01 Thread Mike Miller
I'd have to say thanks, but no thanks, to that one!  ;-)  The problem is 
that it will take a long time and it will give the same answer.


The first time I did this kind of thing, a year or two ago, I manipulated 
the text data to produce integers before putting the data into R.  The 
data were a little different -- already zero padded with three digits to 
the right of the decimal and one to the left, so all I had to do was drop 
the decimal point.  The as.integer(1000*x+.5) method is very fast and it 
works great.


I could have done that this time, but I was also saving to other formats, 
so I had the data already in the format I described.


Mike


On Thu, 1 Jan 2015, Richard M. Heiberger wrote:


Interesting.  Following someone on this list today, the goal is to input
the data correctly.  My inclination would be to read the file as text,
pad each number to the right, drop the decimal point, and then read it
as an integer.
0 1 2 0.325 1.12 1.9
0.000 1.000 2.000 0.325 1.120 1.900
0000 1000 2000 0325 1120 1900

The pad step is the interesting step.

## 0 1 2 0.325 1.12 1.9
## 0.000 1.000 2.000 0.325 1.120 1.900
## 0000 1000 2000 0325 1120 1900

x.in <- scan(text=
"0 1 2 0.325 1.12 1.9 1.003"
, what="")

padding <- c(".000", "000", "00", "0", "")

x.pad <- paste(x.in, padding[nchar(x.in)], sep="")

x.nodot <- sub(".", "", x.pad, fixed=TRUE)

x <- as.integer(x.nodot)


Rich


On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote:

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 31/12/2014 8:44 PM, David Winsemius wrote:



On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:


This is probably a FAQ, and I don't really have a question about it, but
I just ran across this in something I was working on:


as.integer(1000*1.003)


[1] 1002

I didn't expect it, but maybe I should have.  I guess it's about the
machine precision added to the fact that as.integer always rounds down:



as.integer(1000*1.003 + 255 * .Machine$double.eps)


[1] 1002


as.integer(1000*1.003 + 256 * .Machine$double.eps)


[1] 1003


This does it right...


as.integer( round( 1000*1.003 ) )


[1] 1003

...but this seems to always give the same answer and it is a little
faster in my application:


as.integer( 1000*1.003 + .1 )


[1] 1003


FYI - I'm reading in a long vector of numbers from a text file with no
more than three digits to the right of the decimal.  I'm converting them to
integers and saving them in binary format.



So just add 0.0001 or even .001 to all of them and coerce to integer.



I don't think the original problem was stated clearly, so I'm not sure
whether this is a solution, but it looks wrong to me.  If you want to round
to the nearest integer, why not use round() (without the as.integer
afterwards)?  Or if you really do want an integer, why add 0.1 or 0.0001,
why not add 0.5 before calling as.integer()?  This is the classical way to
implement round().

To state the problem clearly, I'd like to know what result is expected for
any real number x.  Since R's numeric type only approximates the real
numbers we might not be able to get a perfect match, but at least we could
quantify how close we get.  Or is the input really character data?  The
original post mentioned reading numbers from a text file.




Maybe you'd like to know what I'm really doing.  I have 1600 text files each
with up to 16,000 lines with 3100 numbers per line, delimited by a single
space.  The numbers are between 0 and 2, inclusive, and they have up to
three digits to the right of the decimal.  Every possible value in that
range will occur in the data.  Some example numbers: 0 1 2 0.325 1.12 1.9.
I want to multiply by 1000 and store them as 16-bit integers (uint16).

I've been reading in the data like so:


data <- scan( file=FILE, what=double(), nmax=3100*16000)



At first I tried making the integers like so:


ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm


   user  system elapsed
  0.187   0.387   0.574

I decided I should compare with the result I got using round():


ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ;
proc.time()-ptm


   user  system elapsed
  1.595   0.757   2.352

It is a curious fact that only a few of the values from 0 to 2000 disagree
between the two methods:


table( ints2[ ints2 != ints ] )



 1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

I understand that it's all about the problem of representing decimal numbers
in binary, but I still find some of the results a little surprising, like
that list of numbers from the table() output.  For another example:


1000+3 - 1000*(1+3/1000)


[1] 1.136868e-13


3 - 1000*(0+3/1000)


[1] 0


2000+3 - 1000*(2+3/1000)


[1] 0

See what I mean?  So there is something special about the numbers around
1000.

Back to the question at hand:  I can avoid use of round() and speed things up
a little bit by just adding a small number after multiplying by 1000:


ptm <- proc.time() ; R3 <- as.integer( 1000 * data + .1 ) ; proc.time()-ptm

Re: [R] rounding down with as.integer

2015-01-01 Thread Ted Harding
I've been following this little tour round the murkier bistros
in the back-streets of R with interest! Then it occurred to me:
What is wrong with [using example data]:

  x0 <- c(0,1,2,0.325,1.12,1.9,1.003)
  x1 <- as.integer(as.character(1000*x0))
  n1 <- c(0,1000,2000,325,1120,1900,1003)

  x1 - n1
  ## [1] 0 0 0 0 0 0 0

  ## But, of course:
  1000*x0 - n1
  ## [1]  0.00e+00  0.00e+00  0.00e+00  0.00e+00
  ## [5]  0.00e+00  0.00e+00 -1.136868e-13

Or am I missing something else in what Mike Miller is seeking to do?
Ted.
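
The reason the as.character() route works at all: R's default conversion
prints at most 15 significant digits, which rounds the 1e-13 error away
before as.integer() ever sees it.  A one-line check (sketch):

as.character(1000*1.003)
[1] "1003"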

On 01-Jan-2015 19:58:02 Mike Miller wrote:
 I'd have to say thanks, but no thanks, to that one!  ;-)  The problem is 
 that it will take a long time and it will give the same answer.
 
 The first time I did this kind of thing, a year or two ago, I manipulated 
 the text data to produce integers before putting the data into R.  The 
 data were a little different -- already zero padded with three digits to 
 the right of the decimal and one to the left, so all I had to do was drop 
 the decimal point.  The as.integer(1000*x+.5) method is very fast and it 
 works great.
 
 I could have done that this time, but I was also saving to other formats, 
 so I had the data already in the format I described.
 
 Mike
 
 
 On Thu, 1 Jan 2015, Richard M. Heiberger wrote:
 
 Interesting.  Following someone on this list today, the goal is to input
 the data correctly.  My inclination would be to read the file as text,
 pad each number to the right, drop the decimal point, and then read it
 as an integer.
 0 1 2 0.325 1.12 1.9
 0.000 1.000 2.000 0.325 1.120 1.900
 0000 1000 2000 0325 1120 1900

 The pad step is the interesting step.

 ## 0 1 2 0.325 1.12 1.9
 ## 0.000 1.000 2.000 0.325 1.120 1.900
 ## 0000 1000 2000 0325 1120 1900

 x.in <- scan(text=
 "0 1 2 0.325 1.12 1.9 1.003"
 , what="")

 padding <- c(".000", "000", "00", "0", "")

 x.pad <- paste(x.in, padding[nchar(x.in)], sep="")

 x.nodot <- sub(".", "", x.pad, fixed=TRUE)

 x <- as.integer(x.nodot)


 Rich


 On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote:
 On Thu, 1 Jan 2015, Duncan Murdoch wrote:

 On 31/12/2014 8:44 PM, David Winsemius wrote:


 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but
 I just ran across this in something I was working on:

 as.integer(1000*1.003)

 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the
 machine precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)

 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)

 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )

 [1] 1003

 ...but this seems to always give the same answer and it is a little
 faster in my application:

 as.integer( 1000*1.003 + .1 )

 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no
 more than three digits to the right of the decimal.  I'm converting them
 to
 integers and saving them in binary format.


 So just add 0.0001 or even .001 to all of them and coerce to integer.


 I don't think the original problem was stated clearly, so I'm not sure
 whether this is a solution, but it looks wrong to me.  If you want to
 round
 to the nearest integer, why not use round() (without the as.integer
 afterwards)?  Or if you really do want an integer, why add 0.1 or 0.0001,
 why not add 0.5 before calling as.integer()?  This is the classical way to
 implement round().

 To state the problem clearly, I'd like to know what result is expected for
 any real number x.  Since R's numeric type only approximates the real
 numbers we might not be able to get a perfect match, but at least we could
 quantify how close we get.  Or is the input really character data?  The
 original post mentioned reading numbers from a text file.



 Maybe you'd like to know what I'm really doing.  I have 1600 text files
 each
 with up to 16,000 lines with 3100 numbers per line, delimited by a single
 space.  The numbers are between 0 and 2, inclusive, and they have up to
 three digits to the right of the decimal.  Every possible value in that
 range will occur in the data.  Some example numbers: 0 1 2 0.325 1.12 1.9.
 I want to multiply by 1000 and store them as 16-bit integers (uint16).

 I've been reading in the data like so:

 data <- scan( file=FILE, what=double(), nmax=3100*16000)


 At first I tried making the integers like so:

 ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm

user  system elapsed
   0.187   0.387   0.574

 I decided I should compare with the result I got using round():

 ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ;
 proc.time()-ptm

user  system elapsed
   1.595   0.757   2.352

 It is a curious fact that only a few of the values from 0 to 2000 disagree
 between the two methods:

 table( ints2[ ints2 != ints ] )


  1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
 35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112

Re: [R] rounding down with as.integer

2015-01-01 Thread Mike Miller

Yes, Ted, that also works, but it's very slow:

# read in values:

data <- scan( file=RECIP_IN, what=double(), nmax=recip_N*16000)

Read 48013406 items

# convert to integer by adding .5 and rounding down:

ptm <- proc.time() ; ints <- as.integer( 1000 * data + .5 ) ; proc.time()-ptm

   user  system elapsed
  0.221   1.008   1.227

# convert to character, then to integer:

ptm <- proc.time() ; ints2 <- as.integer( as.character( 1000 * data ) ) ; 
proc.time()-ptm

   user  system elapsed
 32.110   0.485  32.578

# the results are the same:

identical(ints,ints2)

[1] TRUE

So they give the same answer, but converting to character takes about 25 
times longer.


Mike


On Thu, 1 Jan 2015, ted.hard...@wlandres.net wrote:


I've been following this little tour round the murkier bistros
in the back-streets of R with interest! Then it occurred to me:
What is wrong with [using example data]:

 x0 <- c(0,1,2,0.325,1.12,1.9,1.003)
 x1 <- as.integer(as.character(1000*x0))
 n1 <- c(0,1000,2000,325,1120,1900,1003)

 x1 - n1
 ## [1] 0 0 0 0 0 0 0

 ## But, of course:
 1000*x0 - n1
 ## [1]  0.00e+00  0.00e+00  0.00e+00  0.00e+00
 ## [5]  0.00e+00  0.00e+00 -1.136868e-13

Or am I missing something else in what Mike Miller is seeking to do?
Ted.

On 01-Jan-2015 19:58:02 Mike Miller wrote:

I'd have to say thanks, but no thanks, to that one!  ;-)  The problem is
that it will take a long time and it will give the same answer.

The first time I did this kind of thing, a year or two ago, I manipulated
the text data to produce integers before putting the data into R.  The
data were a little different -- already zero padded with three digits to
the right of the decimal and one to the left, so all I had to do was drop
the decimal point.  The as.integer(1000*x+.5) method is very fast and it
works great.

I could have done that this time, but I was also saving to other formats,
so I had the data already in the format I described.

Mike


On Thu, 1 Jan 2015, Richard M. Heiberger wrote:


Interesting.  Following someone on this list today, the goal is to input
the data correctly.  My inclination would be to read the file as text,
pad each number to the right, drop the decimal point, and then read it
as an integer.
0 1 2 0.325 1.12 1.9
0.000 1.000 2.000 0.325 1.120 1.900
0000 1000 2000 0325 1120 1900

The pad step is the interesting step.

## 0 1 2 0.325 1.12 1.9
## 0.000 1.000 2.000 0.325 1.120 1.900
## 0000 1000 2000 0325 1120 1900

x.in <- scan(text=
"0 1 2 0.325 1.12 1.9 1.003"
, what="")

padding <- c(".000", "000", "00", "0", "")

x.pad <- paste(x.in, padding[nchar(x.in)], sep="")

x.nodot <- sub(".", "", x.pad, fixed=TRUE)

x <- as.integer(x.nodot)


Rich


On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote:

On Thu, 1 Jan 2015, Duncan Murdoch wrote:


On 31/12/2014 8:44 PM, David Winsemius wrote:



On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:


This is probably a FAQ, and I don't really have a question about it, but
I just ran across this in something I was working on:


as.integer(1000*1.003)


[1] 1002

I didn't expect it, but maybe I should have.  I guess it's about the
machine precision added to the fact that as.integer always rounds down:



as.integer(1000*1.003 + 255 * .Machine$double.eps)


[1] 1002


as.integer(1000*1.003 + 256 * .Machine$double.eps)


[1] 1003


This does it right...


as.integer( round( 1000*1.003 ) )


[1] 1003

...but this seems to always give the same answer and it is a little
faster in my application:


as.integer( 1000*1.003 + .1 )


[1] 1003


FYI - I'm reading in a long vector of numbers from a text file with no
more than three digits to the right of the decimal.  I'm converting them
to
integers and saving them in binary format.



So just add 0.0001 or even .001 to all of them and coerce to integer.



I don't think the original problem was stated clearly, so I'm not sure
whether this is a solution, but it looks wrong to me.  If you want to
round
to the nearest integer, why not use round() (without the as.integer
afterwards)?  Or if you really do want an integer, why add 0.1 or 0.0001,
why not add 0.5 before calling as.integer()?  This is the classical way to
implement round().

To state the problem clearly, I'd like to know what result is expected for
any real number x.  Since R's numeric type only approximates the real
numbers we might not be able to get a perfect match, but at least we could
quantify how close we get.  Or is the input really character data?  The
original post mentioned reading numbers from a text file.




Maybe you'd like to know what I'm really doing.  I have 1600 text files
each
with up to 16,000 lines with 3100 numbers per line, delimited by a single
space.  The numbers are between 0 and 2, inclusive, and they have up to
three digits to the right of the decimal.  Every possible value in that
range will occur in the data.  Some example numbers: 0 1 2 0.325 1.12 1.9.
I want to multiply by 1000 and store them as 16-bit integers (uint16).

I've been reading in the data like so:

[R] rounding down with as.integer

2014-12-31 Thread Mike Miller
This is probably a FAQ, and I don't really have a question about it, but I 
just ran across this in something I was working on:



as.integer(1000*1.003)

[1] 1002

I didn't expect it, but maybe I should have.  I guess it's about the 
machine precision added to the fact that as.integer always rounds down:




as.integer(1000*1.003 + 255 * .Machine$double.eps)

[1] 1002


as.integer(1000*1.003 + 256 * .Machine$double.eps)

[1] 1003


This does it right...


as.integer( round( 1000*1.003 ) )

[1] 1003

...but this seems to always give the same answer and it is a little faster 
in my application:



as.integer( 1000*1.003 + .1 )

[1] 1003


FYI - I'm reading in a long vector of numbers from a text file with no 
more than three digits to the right of the decimal.  I'm converting them 
to integers and saving them in binary format.


Best,
Mike



Re: [R] rounding down with as.integer

2014-12-31 Thread David Winsemius

On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but I 
 just ran across this in something I was working on:
 
 as.integer(1000*1.003)
 [1] 1002
 
 I didn't expect it, but maybe I should have.  I guess it's about the machine 
 precision added to the fact that as.integer always rounds down:
 
 
 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002
 
 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003
 
 
 This does it right...
 
 as.integer( round( 1000*1.003 ) )
 [1] 1003
 
 ...but this seems to always give the same answer and it is a little faster in 
 my application:
 
 as.integer( 1000*1.003 + .1 )
 [1] 1003
 
 
 FYI - I'm reading in a long vector of numbers from a text file with no more 
 than three digits to the right of the decimal.  I'm converting them to 
 integers and saving them in binary format.
 

So just add 0.0001 or even .001 to all of them and coerce to integer.


 Best,
 Mike
 

David Winsemius
Alameda, CA, USA



Re: [R] rounding down with as.integer

2014-12-31 Thread Zaid Bhatti
How can I unsubscribe so that I no longer receive these e-mails?

Sent from my Huawei Mobile

David Winsemius dwinsem...@comcast.net wrote:


On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:

 This is probably a FAQ, and I don't really have a question about it, but I 
 just ran across this in something I was working on:

 as.integer(1000*1.003)
 [1] 1002

 I didn't expect it, but maybe I should have.  I guess it's about the machine 
 precision added to the fact that as.integer always rounds down:


 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002

 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003


 This does it right...

 as.integer( round( 1000*1.003 ) )
 [1] 1003

 ...but this seems to always give the same answer and it is a little faster in 
 my application:

 as.integer( 1000*1.003 + .1 )
 [1] 1003


 FYI - I'm reading in a long vector of numbers from a text file with no more 
 than three digits to the right of the decimal.  I'm converting them to 
 integers and saving them in binary format.


So just add 0.0001 or even .001 to all of them and coerce to integer.


 Best,
 Mike


David Winsemius
Alameda, CA, USA







Re: [R] rounding down with as.integer

2014-12-31 Thread David Winsemius
Read the message at the bottom of every message from rhelp.

-- 
David.

On Dec 31, 2014, at 8:09 PM, Zaid Bhatti wrote:

 How can I unsubscribe so that I no longer receive these e-mails?
 
 Sent from my Huawei Mobile
 
 David Winsemius dwinsem...@comcast.net wrote:
 
 
 On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:
 
 This is probably a FAQ, and I don't really have a question about it, but I 
 just ran across this in something I was working on:
 
 as.integer(1000*1.003)
 [1] 1002
 
 I didn't expect it, but maybe I should have.  I guess it's about the machine 
 precision added to the fact that as.integer always rounds down:
 
 
 as.integer(1000*1.003 + 255 * .Machine$double.eps)
 [1] 1002
 
 as.integer(1000*1.003 + 256 * .Machine$double.eps)
 [1] 1003
 
 
 This does it right...
 
 as.integer( round( 1000*1.003 ) )
 [1] 1003
 
 ...but this seems to always give the same answer and it is a little faster 
 in my application:
 
 as.integer( 1000*1.003 + .1 )
 [1] 1003
 
 
 FYI - I'm reading in a long vector of numbers from a text file with no more 
 than three digits to the right of the decimal.  I'm converting them to 
 integers and saving them in binary format.
 
 
 So just add 0.0001 or even .001 to all of them and coerce to integer.
 
 
 Best,
 Mike
 
 
 David Winsemius
 Alameda, CA, USA
 
 
 
 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.