Robert Bradshaw wrote:
> One of the reasons I was so quick to discard this is
> because I thought the usecase was that null characters needed to be
> embedded, which is completely orthogonal, and I couldn't think of
> anywhere I'd come across unsigned char* used for strings (but clearly
> libxml2 is such a library).

I actually never understood why people use plain char* in the first place (OK, apart from tradition, laziness and non-ASCII unawareness). Every 1-byte encoding table I've ever come across maps characters to the byte values 0-255, or 0x00-0xFF. I've never seen an encoded byte string represented with negative byte values. The habit of using char* for text goes so far that, when I learned C, I wasn't even aware that char* could be pointing to a signed value (whether plain char is signed is left to the implementation, and it is signed on many common platforms). Before I was made aware of it, I just unconsciously considered 'char' a special case in the language (which it actually is when you think about it).

> Just out of curiosity, does it use char* for ASCII and unsigned char*
> for utf-8 as a poor-man's typechecking for encoding?

It's a form of type-checking, yes, but not in that way. It uses unsigned char* for text (tag names, text values, etc.) and char* for data sequences (e.g. file names and serialised XML). It even defines "xmlChar" as an alias for "unsigned char" for that purpose, plus a macro "BAD_CAST" that does exactly what it sounds like: it casts to xmlChar*.

I guess the historical reason for this was that you can (or could?) switch the internal text encoding in libxml2, so you could use Latin-1 instead of UTF-8, for example, and xmlChar* would denote all strings encoded that way. That doesn't make much sense for XML nowadays and only a little more for HTML, but it's still a nice way of documenting the API. And to me, it makes sense to use "unsigned char" anyway.

>> I don't think there's anything wrong with letting Cython do the
>> necessary casting under the hood.
>
> http://trac.cython.org/cython_trac/ticket/359

Thanks.

Stefan
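
As a rough illustration of the signedness point and of the xmlChar/BAD_CAST convention described above, here is a minimal sketch in C. It does not include libxml2: my_xmlChar and MY_BAD_CAST are hypothetical stand-ins that mirror the typedef and cast macro mentioned in the message, and the helper functions are invented for this example only.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring libxml2's xmlChar typedef and
   BAD_CAST macro, so that the sketch compiles without libxml2. */
typedef unsigned char my_xmlChar;
#define MY_BAD_CAST (my_xmlChar *)

/* Returns 1 if plain char is signed with this compiler, 0 otherwise.
   The C standard leaves the choice to the implementation. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Counts the bytes >= 0x80 in a NUL-terminated text string.
   With unsigned char the comparison against 0x80 works directly;
   read through a signed plain char, the same bytes would show up
   as negative values. */
size_t count_non_ascii(const my_xmlChar *text) {
    size_t n = 0;
    for (; *text != 0; ++text) {
        if (*text >= 0x80) {
            ++n;
        }
    }
    return n;
}

/* "e acute" in UTF-8 is the two bytes 0xC3 0xA9. The data starts out
   as plain char and is cast explicitly at the API boundary. */
size_t demo_count(void) {
    char raw[] = "\xC3\xA9";
    return count_non_ascii(MY_BAD_CAST raw);
}
```

The explicit cast is the part that Cython would have to generate under the hood when a char* value is passed to an API that expects unsigned char*.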
