On Tue, Aug 11, 2009 at 10:04:20AM +0200, Martin Maechler wrote:
> >>>>> "DM" == Duncan Murdoch <murd...@stats.uwo.ca>
> >>>>>     on Mon, 10 Aug 2009 11:51:53 -0400 writes:
>
>     DM> For people who want to play with these, here are some functions that let
>     DM> you get or set the "payload" value in a NaN.  NaN and NA, Inf and -Inf
>     DM> are stored quite similarly; these functions don't distinguish which of
>     DM> those you're working with.  Regular finite values give NA for the
>     DM> payload value, and elements of x are unchanged if you try to set their
>     DM> payload to NA.
>
>     DM> By the way, this also shows that R *can* distinguish different NaN
>     DM> values, but you need some byte-level manipulations.
>
> yes; very nice code, indeed!
>
> I propose a version of the showBytes() utility should be added
> either as an example e.g. in writeBin() or even an exported
> function in package 'utils'
>
> [.........]
>
> > Example:
>
> >> x <- c(NA, NaN, 0, 1, Inf)
> >> NaNpayload(x)
> > [1]  0.5 -0.5   NA   NA  0.0
>
> Interestingly, on 64-bit, I get a slightly different answer above,
> (when all the following code gives exactly the same results,
>  and of course, that was your main point !), namely
> 4.338752e-13 instead of 0.5 for 'NA',
> see below.
>
> .. and your nice tools also let me detect an even simpler way
> to get *two* versions of NA, and NaN, each :
> Conclusion: Both NaN and NA (well NA_real_) have a sign, too !
>
> NaNpayload(NA_real_)
> ##[1] 4.338752e-13
> NaNpayload(-NA_real_)
> ##[1] -4.338752e-13  ## !! different
>
> str(NApm <- c(1[2], -1[2]))
> t(sapply(NApm, showBytes))
> ## [1,] a2 07 00 00 00 00 f0* 7f
> ## [2,] a2 07 00 00 00 00 f0* ff
>
> ## or summarizing things :
>
> ## Or, "in summary" -- Duncan's original example slightly extended:
> x <- c(NaN, -NaN, NA, -NA_real_, 0, 0.1, Inf, -Inf)
> x
> names(x) <- format(x)
> sapply(x, showBytes)
> ##      NaN NaN  NA  NA 0.0 0.1 Inf -Inf
> ## [1,]  00  00  a2  a2  00  9a  00   00
> ## [2,]  00  00  07  07  00  99  00   00
> ## [3,]  00  00  00  00  00  99  00   00
> ## [4,]  00  00  00  00  00  99  00   00
> ## [5,]  00  00  00  00  00  99  00   00
> ## [6,]  00  00  00  00  00  99  00   00
> ## [7,]  f8  f8 f8* f8*  00  b9  f0   f0
> ## [8,]  ff  7f  7f  ff  00  3f  7f   ff
>
> ## (*) NOTE: the 'f0*' or 'f8*' above are
> ##     ---   'f8' on 32-bit, 'f0' on 64-bit
>
>
> >> NaNpayload(x) <- -0.4
> >> x
> > [1] NaN NaN NaN NaN NaN
> >> y <- x
> >> NaNpayload(y) <- 0.6
> >> y
> > [1] NaN NaN NaN NaN NaN
> >> NaNpayload(x)
> > [1] -0.4 -0.4 -0.4 -0.4 -0.4
> >> NaNpayload(y)
> > [1] 0.6 0.6 0.6 0.6 0.6
> >> identical(x, y)
> > [1] TRUE
>
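
For readers who want to reproduce byte tables like the ones quoted above: the definition of showBytes() is elided in the quote, so the following is only a minimal sketch of how such a helper might look, not Duncan's code. The name showBytesSketch is my own, and I assume that writing to a raw vector with writeBin() and forcing little-endian order is an acceptable substitute for whatever the original did.

```r
# Hypothetical stand-in for the elided showBytes(); not the original code.
# Returns the 8 bytes of a single double, least significant byte first.
showBytesSketch <- function(x) {
  stopifnot(is.double(x), length(x) == 1L)
  # Passing a raw vector as 'con' makes writeBin() return the bytes.
  writeBin(x, raw(), endian = "little")
}

vals <- c(NaN, NA_real_, -NA_real_, 0, 0.1, Inf, -Inf)
bytes <- sapply(vals, showBytesSketch)  # raw matrix, one column per value
colnames(bytes) <- format(vals)
print(bytes)

# The two lowest bytes of NA_real_ hold R's NA marker, 1954 = 0x07a2.
showBytesSketch(NA_real_)[1:2]
```

The sign bit and the NaN payload are visible only in such a byte dump; format() prints both signed variants of NA (and of NaN) the same way.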
The above examples convince me that the default behavior of identical()
should not be based on bit patterns, since the differences between
different NaNs, or even between different NAs, are irrelevant unless we
use the bit manipulations explicitly.

Let me suggest the following short description in ?identical

  The safe and reliable way to test two objects for being equal in
  structure, types of components and their values. It returns 'TRUE' in
  this case, 'FALSE' in every other case.

and replacing the paragraph

  'identical' sees 'NaN' as different from 'NA_real_', but all
  'NaN's are equal (and all 'NA' of the same type are equal).

in ?identical by

  Comparison of objects of numeric type uses '==' for comparison
  of their components. This means that the values of the components,
  rather than their machine representations, are compared. In particular,
  '0' and '-0' are considered equal, all 'NA's of the same type are
  equal and all 'NaN's are equal, although their bit patterns may
  differ in some cases. 'NA' and 'NaN' are always different.
  Note also that 1/0 and 1/(-0) are different.

Petr.
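
P.S. To make the proposed wording concrete, here is a rough sketch of the comparison it describes for double vectors. The helper sameValues is hypothetical and written only for illustration; it is not how identical() is implemented, and it ignores attributes entirely.

```r
# Hypothetical helper illustrating the proposed wording for doubles;
# not the implementation of identical().
sameValues <- function(x, y) {
  stopifnot(is.double(x), is.double(y))
  if (length(x) != length(y)) return(FALSE)
  nan_x <- is.nan(x)
  nan_y <- is.nan(y)
  na_x <- is.na(x) & !nan_x   # NA proper, as opposed to NaN
  na_y <- is.na(y) & !nan_y
  # NaN must match NaN and NA must match NA, whatever their bit patterns.
  if (any(nan_x != nan_y) || any(na_x != na_y)) return(FALSE)
  ok <- !is.na(x)             # positions holding ordinary values or +/-Inf
  all(x[ok] == y[ok])         # plain '==' on the remaining values
}

sameValues(0, -0)           # TRUE: the two zeros compare equal with '=='
sameValues(NaN, -NaN)       # TRUE: the sign of a NaN is ignored
sameValues(NA_real_, NaN)   # FALSE: NA and NaN are always different
sameValues(1/0, 1/(-0))     # FALSE: Inf versus -Inf
```

The last call is the point of the final note in the proposed text: 0 and -0 are equal as values, yet they can still be told apart through the results of computations such as 1/x.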