On Fri, Jun 24, 2011 at 11:05, Nathaniel Smith <[email protected]> wrote:
> On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern <[email protected]> wrote:
>> On Fri, Jun 24, 2011 at 10:07, Laurent Gautier <[email protected]> wrote:
>>> May be there is not so much need for reservation over the string NA, when
>>> making the distinction between:
>>> a- the internal representation of a "missing string" (what is stored in
>>> memory, and that C-level code would need to be aware of)
>>> b- the 'external' representation of a missing string (in Python, what would
>>> be returned by repr() )
>>> c- what is assumed to be a missing string value when reading from a file.
>>>
>>> a/ is not 'NA', c/ should be a parameter in the relevant functions, b/ can
>>> be configured as a module-level, class-level, or instance-level variable.
>>
>> In R, a/ happens to be 'NA', unfortunately. :-/
>>
>> I'm not really sure how they handle datasets that use valid 'NA'
>> values. Presumably, their input routines allow one to convert such
>> values to something else such that it can use 'NA'==NA internally.
>
> No, R can distinguish the string "NA" and the value NA-of-type-string:
>
>> c("NA", NA)
> [1] "NA" NA
>
> In R strings are represented as pointers, rather than in-place, and
> the magic NA value has a special globally known pointer value. (This
> pointer might well point to the characters "NA\0", but all of the code
> knows to check whether it has the magic NA pointer before actually
> following the pointer.)
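
The pointer trick described above can be imitated in Python with a sentinel
object that is compared by identity. This is only a minimal sketch of the
idea, not how R or NumPy implements it; the names `_MissingString`, `NA` and
`is_na` are made up for illustration.

```python
class _MissingString:
    """Hypothetical sentinel type; its one instance plays the 'magic NA pointer'."""

    __slots__ = ()

    def __repr__(self):
        # The external representation (point b/ above) is a display choice only.
        return "NA"


# The single, globally known sentinel object (assumed name).
NA = _MissingString()


def is_na(value):
    # Identity check, analogous to comparing against the magic pointer
    # before following it; the string contents are never inspected.
    return value is NA


values = ["NA", NA]
flags = [is_na(v) for v in values]

print(values)  # ['NA', NA]
print(flags)   # [False, True]
```

Because the check is on identity rather than on the characters, the ordinary
string "NA" stays a valid data value, and the printed form of the missing
value can be chosen freely.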

Ah, okay. Well, then we can pick whatever value we like.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
