Re: [R] rounding down with as.integer
On 01/01/2015 10:05 PM, Mike Miller wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 01/01/2015 1:21 PM, Mike Miller wrote: I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. I think it's really that there is something special about the numbers near 1, and you're multiplying that by 1000. Numbers from 1 to just below 2 are stored as their fractional part, with 52 bit precision. Some intermediate calculations will store them with 64 bit precision. 52 bits gives about 15 or 16 decimal places. This is how big those errors are: 512*.Machine$double.eps [1] 1.136868e-13 Under other conditions you also were seeing errors of twice that, or 1024*.Machine$double.eps. It might not be a coincidence that the largest number giving me an error was 1023. 2^-43 [1] 1.136868e-13 .Machine$double.eps [1] 2.220446e-16 2^-52 [1] 2.220446e-16 I guess the 52 comes from the IEEE floating point spec... http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64 ...but why are we seeing errors so much bigger than the machine precision? You are multiplying by 1000. That magnifies the error. Why does it change at 2? Because (most) floating point numbers are stored as (1 + x) * 2^y, where x is a number between 0 and 1, and y is an integer value between -1023 and 1023. The value of y changes at 2, and this means errors in x become twice as big. (The exceptions are 0, Inf, NaN, etc., as well as denormals, where y is -1024 and the format changes to x * 2^(y+1).) Duncan Murdoch It doesn't really matter to my work, but it is a curious thing, so I would be interested to learn about it. Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
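A short sketch of Duncan's point, assuming a standard IEEE-754 double platform (the exact printed digits may differ slightly): the error is already present in the stored value of 1.003 and is merely magnified by the factor of 1000, and the spacing between representable doubles doubles at each power of 2.

print(1.003, digits = 22)                  # the stored double is a hair below 1.003
1000 * 1.003 - 1003                        # about -1.1e-13: the same tiny error, scaled by 1000
# spacing between adjacent doubles near x is roughly 2^floor(log2(x)) * eps,
# so it doubles each time x crosses a power of 2:
2^floor(log2(1.5)) * .Machine$double.eps   # ~2.2e-16 near 1.5
2^floor(log2(2.5)) * .Machine$double.eps   # ~4.4e-16 near 2.5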
Re: [R] rounding down with as.integer
On 02 Jan 2015, at 04:05 , Mike Miller mbmille...@gmail.com wrote: ...but why are we seeing errors so much bigger than the machine precision? Why does it change at 2? Because relative errors in the one-thousands part are roughly a thousand times bigger than errors in the number itself? Put differently: the 1.136868e-13 error is an absolute error on 1003, not on 3 -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
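Peter's point in code form (values approximate): the error only looks large next to .Machine$double.eps because it is an absolute error on a number near 1003; as a relative error it is still at the level of machine precision.

err <- 1003 - 1000 * (1 + 3/1000)
err                      # ~1.14e-13, the absolute error on a value near 1003
err / 1003               # ~1.13e-16, the relative error
.Machine$double.eps      # ~2.22e-16, for comparison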
Re: [R] rounding down with as.integer
On Fri, 2 Jan 2015, Duncan Murdoch wrote: On 01/01/2015 10:05 PM, Mike Miller wrote: This is how big those errors are: 512*.Machine$double.eps [1] 1.136868e-13 Under other conditions you also were seeing errors of twice that, or 1024*.Machine$double.eps. It might not be a coincidence that the largest number giving me an error was 1023. 2^-43 [1] 1.136868e-13 .Machine$double.eps [1] 2.220446e-16 2^-52 [1] 2.220446e-16 I guess the 52 comes from the IEEE floating point spec... http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64 ...but why are we seeing errors so much bigger than the machine precision? You are multiplying by 1000. That magnifies the error. Why does it change at 2? Because (most) floating point numbers are stored as (1 + x) * 2^y, where x is a number between 0 and 1, and y is an integer value between -1023 and 1023. The value of y changes at 2, and this means errors in x become twice as big. (The exceptions are 0, Inf, NaN, etc., as well as denormals, where y is -1024 and the format changes to x * 2^(y+1).) That is a great explanation. Thanks very much! Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rounding down with as.integer
On 01/01/2015 2:43 PM, Mike Miller wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 01/01/2015 1:21 PM, Mike Miller wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. I think it's really that there is something special about the numbers near 1, and you're multiplying that by 1000. 
Numbers from 1 to just below 2 are stored as their fractional part, with 52 bit precision. Some intermediate calculations will store them with 64 bit precision. 52 bits gives about 15 or 16 decimal places. If your number x is close to 3/1000, it is stored as the fractional part of 2^9 * x. This gives it an extra 2 or 3 decimal digits of precision, so that's why these values are accurate. If your number x is close to 2.003, it is stored as the fractional part of x/2, i.e. with errors like 1.0015 would have. So I would have guessed that 2.006 would have the same problems as 1.003, but I thought you didn't see that. So I tried it myself, and I do see that:

1000+3 - 1000*(1+3/1000)
[1] 1.136868e-13
2000+6 - 1000*(2+6/1000)
[1] 2.273737e-13

Reading more closely, I see that you didn't test this particular case, so there's no contradiction here. The one thing I couldn't think of an explanation for is why other numbers between 1 and 2 don't have the same sorts of problems. So I tried the following:

# Set data to 1.000 thru 1.999
data <- 1 + 0:999/1000
# Find the errors
errors <- 1000 + 0:999 - 1000*data
# Plot them
plot(data, errors)

The plot doesn't show a uniform distribution, but it is much more uniform than yours, so I think your data doesn't really cover all possible values from 0.000 to 1.999.
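One way to test Duncan's conjecture on a grid that does cover every three-decimal value (a hypothetical stand-in for Mike's data; going through sprintf()/as.numeric() mimics the doubles that scan() would produce from text):

x <- as.numeric(sprintf("%.3f", seq(0, 1.999, by = 0.001)))  # every value 0.000 .. 1.999
ints  <- as.integer(1000 * x)              # truncation
ints2 <- as.integer(round(1000 * x))       # rounding
table(ints2[ints2 != ints])                # which values the two methods disagree on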
Re: [R] rounding down with as.integer
On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 01/01/2015 1:21 PM, Mike Miller wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. I think it's really that there is something special about the numbers near 1, and you're multiplying that by 1000. Numbers from 1 to just below 2 are stored as their fractional part, with 52 bit precision. 
Some intermediate calculations will store them with 64 bit precision. 52 bits gives about 15 or 16 decimal places. If your number x is close to 3/1000, it is stored as the fractional part of 2^9 * x. This gives it an extra 2 or 3 decimal digits of precision, so that's why these values are accurate. If your number x is close to 2.003, it is stored as the fractional part of x/2, i.e. with errors like 1.0015 would have. So I would have guessed that 2.006 would have the same problems as 1.003, but I thought you didn't see that. So I tried it myself, and I do see that: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 2000+6 - 1000*(2+6/1000) [1] 2.273737e-13 Reading more closely, I see that you didn't test this particular case, so there's no contradiction here. The one thing I couldn't think of an explanation for is why other numbers between 1 and 2 don't have the same sorts of problems. So I tried the following: # Set data to 1.000 thru 1.999 data - 1 + 0:999/1000 # Find the errors errors - 1000 + 0:999 - 1000*data # Plot them plot(data, errors) The plot doesn't show a uniform distribution, but much more uniform than yours: so I think your data doesn't really cover all possible values from 0.000 to 1.999. (I get a similar plot if I look at cases where ints != ints2 with
Re: [R] rounding down with as.integer
Interesting. Following someone on this list today the goal is input the data correctly. My inclination would be to read the file as text, pad each number to the right, drop the decimal point, and then read it as an integer. 0 1 2 0.325 1.12 1.9 0.000 1.000 2.000 0.325 1.120 1.900 1000 2000 0325 1120 1900 The pad step is the interesting step. ## 0 1 2 0.325 1.12 1.9 ## 0.000 1.000 2.000 0.325 1.120 1.900 ## 1000 2000 0325 1120 1900 x.in - scan(text= 0 1 2 0.325 1.12 1.9 1. , what=) padding - c(.000, 000, 00, 0, ) x.pad - paste(x.in, padding[nchar(x.in)], sep=) x.nodot - sub(., , x.pad, fixed=TRUE) x - as.integer(x.nodot) Rich On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). 
I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. Back to the quesion at hand: I can avoid use of round() and speed things up a little bit by just adding a small number after multiplying by 1000: ptm - proc.time() ; R3 - as.integer( 1000 * data + .1 ) ; proc.time()-ptm user system elapsed 0.224 0.594 0.818 You point out that adding .5 makes sense. That is probably a better idea and I should take that approach under most conditions, but in this case we can add anything between 2e-13 and about 0.999 and always get the same answer. We also have to remember that if a number might be negative (not a problem for me in this application), we need to subtract 0.5 instead of adding it. Anyway, right now this is what I'm actually doing: con - file( paste0(FILE, .uint16), wb ) ptm - proc.time() ; writeBin( as.integer( 1000 * scan( file=FILE,
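Rich's pad-and-drop-the-decimal code above lost its assignment arrows and string quotes in the archive; a reconstruction (the padding strings are as given, looked up by the character count of each input token) would be roughly:

x.in <- scan(text = "0 1 2 0.325 1.12 1.9 1.", what = "")
padding <- c(".000", "000", "00", "0", "")        # lookup by nchar(x.in), 1 to 5 characters
x.pad <- paste(x.in, padding[nchar(x.in)], sep = "")
x.nodot <- sub(".", "", x.pad, fixed = TRUE)
x <- as.integer(x.nodot)
x
## expected: 0 1000 2000 325 1120 1900 1000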
Re: [R] rounding down with as.integer
On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. 
Back to the question at hand: I can avoid use of round() and speed things up a little bit by just adding a small number after multiplying by 1000:

ptm <- proc.time() ; R3 <- as.integer( 1000 * data + .1 ) ; proc.time()-ptm
   user  system elapsed
  0.224   0.594   0.818

You point out that adding .5 makes sense. That is probably a better idea and I should take that approach under most conditions, but in this case we can add anything between 2e-13 and about 0.999 and always get the same answer. We also have to remember that if a number might be negative (not a problem for me in this application), we need to subtract 0.5 instead of adding it. Anyway, right now this is what I'm actually doing:

con <- file( paste0(FILE, ".uint16"), "wb" )
ptm <- proc.time() ; writeBin( as.integer( 1000 * scan( file=FILE, what=double(), nmax=3100*16000 ) + .1 ), con, size=2 ) ; proc.time()-ptm
Read 48013406 items
   user  system elapsed
 10.263   0.733  10.991
close(con)

By the way, writeBin() is something that I learned about here, from you, Duncan. Thanks for that, too.

Mike

--
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4J

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
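A sanity check one could add after the write (hypothetical; it assumes the converted integers are also kept in a vector, here called ints): read the unsigned 16-bit values back and compare.

con <- file(paste0(FILE, ".uint16"), "rb")
back <- readBin(con, what = "integer", n = length(ints), size = 2, signed = FALSE)
close(con)
identical(back, ints)   # TRUE if the round trip preserved every value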
Re: [R] rounding down with as.integer
On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Duncan Murdoch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
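The classical trick Duncan refers to, sketched on a few of the example values (negative inputs, if they ever occurred, would need the sign handled, e.g. via sign(x)):

x <- c(0, 0.325, 1.003, 1.12, 1.9, 2)
as.integer(1000 * x + 0.5)            # truncation after adding 0.5 rounds non-negative values
## expected: 0 325 1003 1120 1900 2000
as.integer(1000 * x + sign(x) * 0.5)  # variant that also handles negative values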
Re: [R] rounding down with as.integer
On 01/01/2015 1:21 PM, Mike Miller wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. I think it's really that there is something special about the numbers near 1, and you're multiplying that by 1000. Numbers from 1 to just below 2 are stored as their fractional part, with 52 bit precision. 
Some intermediate calculations will store them with 64 bit precision. 52 bits gives about 15 or 16 decimal places. If your number x is close to 3/1000, it is stored as the fractional part of 2^9 * x. This gives it an extra 2 or 3 decimal digits of precision, so that's why these values are accurate. If your number x is close to 2.003, it is stored as the fractional part of x/2, i.e. with errors like 1.0015 would have. So I would have guessed that 2.006 would have the same problems as 1.003, but I thought you didn't see that. So I tried it myself, and I do see that: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 2000+6 - 1000*(2+6/1000) [1] 2.273737e-13 Reading more closely, I see that you didn't test this particular case, so there's no contradiction here. The one thing I couldn't think of an explanation for is why other numbers between 1 and 2 don't have the same sorts of problems. So I tried the following: # Set data to 1.000 thru 1.999 data - 1 + 0:999/1000 # Find the errors errors - 1000 + 0:999 - 1000*data # Plot them plot(data, errors) The plot doesn't show a uniform distribution, but much more uniform than yours: so I think your data doesn't really cover all possible values from 0.000 to 1.999. (I get a similar plot if I look at cases
Re: [R] rounding down with as.integer
On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 01/01/2015 1:21 PM, Mike Miller wrote: I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. I think it's really that there is something special about the numbers near 1, and you're multiplying that by 1000. Numbers from 1 to just below 2 are stored as their fractional part, with 52 bit precision. Some intermediate calculations will store them with 64 bit precision. 52 bits gives about 15 or 16 decimal places. This is how big those errors are: 512*.Machine$double.eps [1] 1.136868e-13 Under other conditions you also were seeing errors of twice that, or 1024*.Machine$double.eps. It might not be a coincidence that the largest number giving me an error was 1023. 2^-43 [1] 1.136868e-13 .Machine$double.eps [1] 2.220446e-16 2^-52 [1] 2.220446e-16 I guess the 52 comes from the IEEE floating point spec... http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64 ...but why are we seeing errors so much bigger than the machine precision? Why does it change at 2? It doesn't really matter to my work, but it is a curious thing, so I would be interested to learn about it. Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
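Mike's observation in one line: the error he sees is about 2^9 ulps, i.e. the half-ulp representation error of a value near 1 magnified by the factor of 1000 (assuming IEEE-754 doubles):

(1003 - 1000 * (1 + 3/1000)) / .Machine$double.eps   # about 512 = 2^9, and 2^9 * 2^-52 = 2^-43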
Re: [R] rounding down with as.integer
I'd have to say thanks, but no thanks, to that one! ;-) The problem is that it will take a long time and it will give the same answer. The first time I did this kind of thing, a year or two ago, I manipulated the text data to produce integers before putting the data into R. The data were a little different -- already zero padded with three digits to the right of the decimal and one to the left, so all I had to do was drop the decimal point. The as.integer(1000*x+.5) method is very fast and it works great. I could have done that this time, but I was also saving to other formats, so I had the data already in the format I described. Mike On Thu, 1 Jan 2015, Richard M. Heiberger wrote: Interesting. Following someone on this list today the goal is input the data correctly. My inclination would be to read the file as text, pad each number to the right, drop the decimal point, and then read it as an integer. 0 1 2 0.325 1.12 1.9 0.000 1.000 2.000 0.325 1.120 1.900 1000 2000 0325 1120 1900 The pad step is the interesting step. ## 0 1 2 0.325 1.12 1.9 ## 0.000 1.000 2.000 0.325 1.120 1.900 ## 1000 2000 0325 1120 1900 x.in - scan(text= 0 1 2 0.325 1.12 1.9 1. , what=) padding - c(.000, 000, 00, 0, ) x.pad - paste(x.in, padding[nchar(x.in)], sep=) x.nodot - sub(., , x.pad, fixed=TRUE) x - as.integer(x.nodot) Rich On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. 
I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015 1017 1019 1021 1023 35651 27020 15993 11505 8967 7549 6885 6064 5512 4828 4533 4112 I understand that it's all about the problem of representing digital numbers in binary, but I still find some of the results a little surprising, like that list of numbers from the table() output. For another example: 1000+3 - 1000*(1+3/1000) [1] 1.136868e-13 3 - 1000*(0+3/1000) [1] 0 2000+3 - 1000*(2+3/1000) [1] 0 See what I mean? So there is something special about the numbers around 1000. Back to the quesion at hand: I can avoid use of round() and speed things up a little bit by just adding a small number after multiplying by 1000: ptm -
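A self-contained version of the speed comparison on simulated data (hypothetical size, not Mike's files; timings will of course differ by machine):

data <- sample(0:2000, 1e7, replace = TRUE) / 1000
system.time(ints  <- as.integer(1000 * data + 0.5))
system.time(ints2 <- as.integer(round(1000 * data)))
identical(ints, ints2)   # expected to match; the +0.5 version is the faster of the two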
Re: [R] rounding down with as.integer
I've been followeing this little tour round the murkier bistros in the back-streets of R with interest! Then it occurred to me: What is wrong with [using example data]: x0 - c(0,1,2,0.325,1.12,1.9,1.003) x1 - as.integer(as.character(1000*x0)) n1 - c(0,1000,2000,325,1120,1900,1003) x1 - n1 ## [1] 0 0 0 0 0 0 0 ## But, of course: 1000*x0 - n1 ## [1] 0.00e+00 0.00e+00 0.00e+00 0.00e+00 ## [5] 0.00e+00 0.00e+00 -1.136868e-13 Or am I missing somthing else in what Mike Miller is seeking to do? Ted. On 01-Jan-2015 19:58:02 Mike Miller wrote: I'd have to say thanks, but no thanks, to that one! ;-) The problem is that it will take a long time and it will give the same answer. The first time I did this kind of thing, a year or two ago, I manipulated the text data to produce integers before putting the data into R. The data were a little different -- already zero padded with three digits to the right of the decimal and one to the left, so all I had to do was drop the decimal point. The as.integer(1000*x+.5) method is very fast and it works great. I could have done that this time, but I was also saving to other formats, so I had the data already in the format I described. Mike On Thu, 1 Jan 2015, Richard M. Heiberger wrote: Interesting. Following someone on this list today the goal is input the data correctly. My inclination would be to read the file as text, pad each number to the right, drop the decimal point, and then read it as an integer. 0 1 2 0.325 1.12 1.9 0.000 1.000 2.000 0.325 1.120 1.900 1000 2000 0325 1120 1900 The pad step is the interesting step. ## 0 1 2 0.325 1.12 1.9 ## 0.000 1.000 2.000 0.325 1.120 1.900 ## 1000 2000 0325 1120 1900 x.in - scan(text= 0 1 2 0.325 1.12 1.9 1. , what=) padding - c(.000, 000, 00, 0, ) x.pad - paste(x.in, padding[nchar(x.in)], sep=) x.nodot - sub(., , x.pad, fixed=TRUE) x - as.integer(x.nodot) Rich On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. 
Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been reading in the data like so: data - scan( file=FILE, what=double(), nmax=3100*16000) At first I tried making the integers like so: ptm - proc.time() ; ints - as.integer( 1000 * data ) ; proc.time()-ptm user system elapsed 0.187 0.387 0.574 I decided I should compare with the result I got using round(): ptm - proc.time() ; ints2 - as.integer( round( 1000 * data ) ) ; proc.time()-ptm user system elapsed 1.595 0.757 2.352 It is a curious fact that only a few of the values from 0 to 2000 disagree between the two methods: table( ints2[ ints2 != ints ] ) 1001 1003 1005 1007 1009 1011 1013 1015
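Ted's example with the assignment arrows the archive dropped restored (as.character() rounds to about 15 significant digits, which is what scrubs the ~1e-13 error before the integer conversion):

x0 <- c(0, 1, 2, 0.325, 1.12, 1.9, 1.003)
x1 <- as.integer(as.character(1000 * x0))
n1 <- c(0, 1000, 2000, 325, 1120, 1900, 1003)
x1 - n1
## [1] 0 0 0 0 0 0 0
1000 * x0 - n1
## last element is about -1.1e-13; the others are 0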
Re: [R] rounding down with as.integer
Yes, Ted, that also works, but it's very slow: # read in values: data - scan( file=RECIP_IN, what=double(), nmax=recip_N*16000) Read 48013406 items # convert to integer by adding .5 and rounding down: ptm - proc.time() ; ints - as.integer( 1000 * data + .5 ) ; proc.time()-ptm user system elapsed 0.221 1.008 1.227 # convert to character, then to integer: ptm - proc.time() ; ints2 - as.integer( as.character( 1000 * data ) ) ; proc.time()-ptm user system elapsed 32.110 0.485 32.578 # the results are the same: identical(ints,ints2) [1] TRUE So they give the same answer, but converting to character takes about 25 times longer. Mike On Thu, 1 Jan 2015, ted.hard...@wlandres.net wrote: I've been followeing this little tour round the murkier bistros in the back-streets of R with interest! Then it occurred to me: What is wrong with [using example data]: x0 - c(0,1,2,0.325,1.12,1.9,1.003) x1 - as.integer(as.character(1000*x0)) n1 - c(0,1000,2000,325,1120,1900,1003) x1 - n1 ## [1] 0 0 0 0 0 0 0 ## But, of course: 1000*x0 - n1 ## [1] 0.00e+00 0.00e+00 0.00e+00 0.00e+00 ## [5] 0.00e+00 0.00e+00 -1.136868e-13 Or am I missing somthing else in what Mike Miller is seeking to do? Ted. On 01-Jan-2015 19:58:02 Mike Miller wrote: I'd have to say thanks, but no thanks, to that one! ;-) The problem is that it will take a long time and it will give the same answer. The first time I did this kind of thing, a year or two ago, I manipulated the text data to produce integers before putting the data into R. The data were a little different -- already zero padded with three digits to the right of the decimal and one to the left, so all I had to do was drop the decimal point. The as.integer(1000*x+.5) method is very fast and it works great. I could have done that this time, but I was also saving to other formats, so I had the data already in the format I described. Mike On Thu, 1 Jan 2015, Richard M. Heiberger wrote: Interesting. Following someone on this list today the goal is input the data correctly. My inclination would be to read the file as text, pad each number to the right, drop the decimal point, and then read it as an integer. 0 1 2 0.325 1.12 1.9 0.000 1.000 2.000 0.325 1.120 1.900 1000 2000 0325 1120 1900 The pad step is the interesting step. ## 0 1 2 0.325 1.12 1.9 ## 0.000 1.000 2.000 0.325 1.120 1.900 ## 1000 2000 0325 1120 1900 x.in - scan(text= 0 1 2 0.325 1.12 1.9 1. , what=) padding - c(.000, 000, 00, 0, ) x.pad - paste(x.in, padding[nchar(x.in)], sep=) x.nodot - sub(., , x.pad, fixed=TRUE) x - as.integer(x.nodot) Rich On Thu, Jan 1, 2015 at 1:21 PM, Mike Miller mbmille...@gmail.com wrote: On Thu, 1 Jan 2015, Duncan Murdoch wrote: On 31/12/2014 8:44 PM, David Winsemius wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. 
I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. I don't think the original problem was stated clearly, so I'm not sure whether this is a solution, but it looks wrong to me. If you want to round to the nearest integer, why not use round() (without the as.integer afterwards)? Or if you really do want an integer, why add 0.1 or 0.0001, why not add 0.5 before calling as.integer()? This is the classical way to implement round(). To state the problem clearly, I'd like to know what result is expected for any real number x. Since R's numeric type only approximates the real numbers we might not be able to get a perfect match, but at least we could quantify how close we get. Or is the input really character data? The original post mentioned reading numbers from a text file. Maybe you'd like to know what I'm really doing. I have 1600 text files each with up to 16,000 lines with 3100 numbers per line, delimited by a single space. The numbers are between 0 and 2, inclusive, and they have up to three digits to the right of the decimal. Every possible value in that range will occur in the data. Some examples numbers: 0 1 2 0.325 1.12 1.9. I want to multiply by 1000 and store them as 16-bit integers (uint16). I've been
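The two conversions Mike is timing above, with the arrows the archive dropped (data is the vector returned by scan()):

ints  <- as.integer(1000 * data + 0.5)
ints2 <- as.integer(as.character(1000 * data))
identical(ints, ints2)   # TRUE in Mike's run, but the as.character() route is ~25x slower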
[R] rounding down with as.integer
This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. Best, Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
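What as.integer() is actually seeing in the opening example (printed digits assume IEEE-754 doubles): the product lands just under 1003, and coercion to integer truncates toward zero, exactly like trunc():

print(1000 * 1.003, digits = 22)   # just below 1003
trunc(1000 * 1.003)                # 1002
as.integer(1000 * 1.003)           # 1002, the same truncation
round(1000 * 1.003)                # 1003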
Re: [R] rounding down with as.integer
On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. Best, Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
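David's suggestion in action; it works here because the inputs have at most three decimals, so the true products are never more than about 1e-13 below a whole number:

as.integer(1000 * 1.003 + 0.001)
## [1] 1003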
Re: [R] rounding down with as.integer
How can I unsubscribe to not receive loop e mails? Sent from my Huawei Mobile David Winsemius dwinsem...@comcast.net wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. Best, Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rounding down with as.integer
Read the message at the bottom of every message from rhelp. -- David. On Dec 31, 2014, at 8:09 PM, Zaid Bhatti wrote: How can I unsubscribe to not receive loop e mails? Sent from my Huawei Mobile David Winsemius dwinsem...@comcast.net wrote: On Dec 31, 2014, at 3:24 PM, Mike Miller wrote: This is probably a FAQ, and I don't really have a question about it, but I just ran across this in something I was working on: as.integer(1000*1.003) [1] 1002 I didn't expect it, but maybe I should have. I guess it's about the machine precision added to the fact that as.integer always rounds down: as.integer(1000*1.003 + 255 * .Machine$double.eps) [1] 1002 as.integer(1000*1.003 + 256 * .Machine$double.eps) [1] 1003 This does it right... as.integer( round( 1000*1.003 ) ) [1] 1003 ...but this seems to always give the same answer and it is a little faster in my application: as.integer( 1000*1.003 + .1 ) [1] 1003 FYI - I'm reading in a long vector of numbers from a text file with no more than three digits to the right of the decimal. I'm converting them to integers and saving them in binary format. So just add 0.0001 or even .001 to all of them and coerce to integer. Best, Mike __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.