On Wed, Aug 12, 2009 at 04:02:28PM +0200, Martin Maechler wrote:
> >>>>> "PS" == Petr Savicky <savi...@cs.cas.cz>
> >>>>>     on Wed, 12 Aug 2009 13:50:46 +0200 writes:
>
>     PS> Let me add the following to the discussion of identical(0, -0).
>     PS> I would like to suggest to replace the paragraph
>
>     PS>    'identical' sees 'NaN' as different from 'NA_real_', but all
>     PS>    'NaN's are equal (and all 'NA' of the same type are equal).
>
>     PS> in ?identical by the following text, which is a correction of my previous
>     PS> suggestion for the same paragraph
>
>     > Components of numerical objects are compared as follows. For non-missing
>     > values, "==" is used. In particular, '0' and '-0' are considered equal.
>     > All 'NA's of the same type are equal and all 'NaN's are equal, although
>     > their bit patterns may differ in some cases. 'NA' and 'NaN' are always
>     > different.
>     > Note also that 1/0 and 1/(-0) are different.
>
> the 'numerical' would have to be qualified ('double', 'complex'
> via double), as indeed, memcmp() is used on integers
>
> The last sentence is not necessary and probably even confusing:
> Of course, -Inf and Inf are different.

I agree.

>     PS> The suggestion for the default of identical(0, -0) is TRUE, because the
>     PS> negative zero is much less important than NA and NaN and, possibly,
>     PS> distinguishing 0 and -0 could even be deprecated.
>
> What should that mean?? R *is* using the international floating
> point standards, and 0 and -0 exist there and they *are*
> different!

I am sorry for being too brief. In my opinion, distinguishing 0 and -0 is
not useful enough to make the default behavior of identical() differ from
the behavior of == in this case.

> If R would start --- with a performance penalty, btw ! ---
> to explicitly map all internal '-0' into '+0' we would
> explicitly move away from the international FP standards...
> no way!

Yes, I agree. I did not mean this.

>     PS> Moreover, the argument
>     PS> of efficiency of memcmp cannot be used here, since there are different
>     PS> variants of NaN and NA, which should not be distinguished by default.
>
> your argument is only partly true... as memcmp() can still be
> used instead of '==' *after* the NA-treatments {my current
> patch does so},

OK. In this case, memcmp() could still be faster than ==, although this is
beyond my knowledge.

> and even more as I have been proposing an option "strict" which
> would only use memcmp() {and hence also distinguish different
> NA, NaN's}.

As I understand the previous messages in this thread, there is agreement
that such an option would be very useful and would lead to faster
comparison.

Petr.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
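
To make the difference between the two kinds of comparison concrete, here is a minimal sketch in C, assuming IEEE 754 64-bit doubles. It is not taken from the R sources or from Martin's patch: the names from_bits, same_bits, same_value, identical_double and check_examples are hypothetical, the non-strict branch uses == where the patch reportedly uses memcmp() after the NA treatment, and the distinction between NA and NaN is left out (all NaNs are treated as one class).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Build a double from a raw 64-bit pattern (assumes IEEE 754 binary64). */
static double from_bits(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Bitwise comparison: what a memcmp()-only "strict" mode would do. */
static bool same_bits(double x, double y) {
    return memcmp(&x, &y, sizeof(double)) == 0;
}

/* Value comparison: NaNs are handled first, then "==" is used.
   Simplification: all NaNs form one class; R's NA/NaN split is omitted. */
static bool same_value(double x, double y) {
    if (isnan(x) || isnan(y))
        return isnan(x) && isnan(y);
    return x == y;
}

/* Hypothetical front end with a "strict" switch. */
static bool identical_double(double x, double y, bool strict) {
    return strict ? same_bits(x, y) : same_value(x, y);
}

/* Two quiet NaNs with different payloads, and the two signed zeros. */
void check_examples(void) {
    double nan_a = from_bits(UINT64_C(0x7ff8000000000000));
    double nan_b = from_bits(UINT64_C(0x7ff8000000000001));

    assert(identical_double(0.0, -0.0, false));    /* equal under == */
    assert(!identical_double(0.0, -0.0, true));    /* sign bit differs */
    assert(identical_double(nan_a, nan_b, false)); /* both are NaN */
    assert(!identical_double(nan_a, nan_b, true)); /* payloads differ */
}
```

Calling check_examples() from a main() should pass silently on a platform that meets the IEEE 754 assumption. The sketch says nothing about which variant is faster.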