Robert Bradshaw wrote:
> One of the reasons I was so quick to discard this is  
> because I thought the usecase was that null characters needed to be  
> embedded, which is completely orthogonal, and I couldn't think of  
> anywhere I'd come across unsigned char* used for strings (but clearly  
> libxml2 is such a library).

I actually never understood why people use plain char* in the first place
(OK, apart from tradition, laziness and unawareness of non-ASCII text). Any 1-byte
encoding table I've ever come across maps characters to the byte values
0-255 or 0x00-0xFF. I've never seen an encoded byte string represented with
negative byte values. The habit of using char* for text goes so far that I
wasn't even aware that char* could be pointing to a signed value when I
learned C (the signedness of plain 'char' is implementation-defined, and it
is signed on many common platforms). Before I was made aware of it, I just
unconsciously considered 'char' a special case in the language (which it
actually is when you think about it).
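
To illustrate the point about byte values, here is a minimal sketch in C.
The helper names byte_value() and raw_char_value() are made up for this
example. It assumes nothing beyond standard C: whether plain 'char' is
signed is up to the implementation, so reading a Latin-1 'é' (0xE9) through
a char* may give a negative number, while going through unsigned char
always gives the value from the encoding table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the byte at s[i] as a value in 0..255,
 * independent of whether plain char is signed on this platform. */
int byte_value(const char *s, size_t i)
{
    assert(s != NULL);
    return (unsigned char)s[i];
}

/* Hypothetical helper: return the byte at s[i] as plain char widened to
 * int. Where char is signed, bytes >= 0x80 come out negative. */
int raw_char_value(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i];
}
```

With the one-byte string "\xE9", byte_value() returns 233, while
raw_char_value() returns either 233 or a negative value depending on the
platform. That is why table lookups indexed by a plain char are a classic
source of bugs.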


> Just out of curiosity, does it use char* for ASCII and unsigned char*  
> for utf-8 as a poor-man's typechecking for encoding?

It's a form of type-checking, yes, but not in that way. It uses unsigned
char* for text (tag names, text values, etc.) and char* for data sequences
(e.g. file names and serialised XML). It even typedefs "unsigned char" as
"xmlChar" for that purpose, and provides a macro "BAD_CAST" that does
exactly what it sounds like.
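
A rough sketch of how that looks from the caller's side. The typedef and
the macro below are local stand-ins that mirror the definitions in
libxml2's <libxml/xmlstring.h>, repeated here only so that the snippet
compiles without libxml2 installed. text_length() and tag_name_length()
are hypothetical functions for illustration, not part of libxml2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins mirroring libxml2's own definitions. */
typedef unsigned char xmlChar;
#define BAD_CAST (xmlChar *)

/* Hypothetical API function: it deals with "text", so it takes xmlChar*. */
size_t text_length(const xmlChar *text)
{
    assert(text != NULL);
    /* strlen() wants char*, so the cast is needed in this direction too. */
    return strlen((const char *)text);
}

/* Hypothetical caller: a C string literal is char*, so it has to be cast
 * explicitly before it can be passed as text. */
size_t tag_name_length(void)
{
    return text_length(BAD_CAST "root");
}
```

Passing the plain literal "root" without BAD_CAST would make the compiler
complain about mismatched pointer signedness. That complaint is the type
check: each cast marks a place where data is deliberately treated as text.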

I guess the historical reason to do that was that you can (or could?)
switch the internal text encoding in libxml2, so you could use Latin-1
instead of UTF-8, for example, and the xmlChar* would denote all strings
encoded that way. That doesn't make much sense for XML nowadays and only a
little more for HTML, but it's still a nice way of documenting the API. And to me,
it makes sense to use "unsigned char" anyway.


>> I don't think there's anything wrong with letting Cython do the  
>> necessary casting under the hood.
> 
> http://trac.cython.org/cython_trac/ticket/359

Thanks.

Stefan
