[issue20906] Issues in Unicode HOWTO

Antoine Pitrou Sun, 16 Mar 2014 10:36:37 -0700

Antoine Pitrou added the comment:

Do you want to provide a patch?


> In a narrative such as the current article, a code point value is usually 
> written in hexadecimal.

I find the use of the word "narrative" intimidating in the context of 
technical documentation.

In general, I find it disappointing that the Unicode HOWTO only gives 
hexadecimal representations of non-ASCII characters and (almost) never 
represents them in their true form. This makes things more abstract than 
necessary.
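
A small illustration of the difference, using an arbitrary example character 
that is not taken from the HOWTO (the variable names are likewise only for 
this sketch): the escaped and the literal spelling denote the same string, but 
only the second one shows the reader what the character looks like.

```python
import unicodedata

# Two spellings of the same one-character string.
escaped = "\u00e9"   # hexadecimal escape, as the HOWTO tends to write it
literal = "é"        # the character in its true form

same = escaped == literal

# The hexadecimal form can still be given alongside the literal character.
hex_form = f"U+{ord(literal):04X}"
name = unicodedata.name(literal)

print(literal, hex_form, name)
```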

> This is a vague claim. Probably what was intended was: "Many Internet 
> standards define protocols in which the data must contain no zero bytes, or 
> zero bytes have special meaning."  Is this actually true? Are there "many" 
> such standards?

I think it actually means that Internet protocols assume an ASCII-compatible 
encoding (which UTF-8 is, but not UTF-16 or UTF-32 - nor EBCDIC :-)).
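
As a rough sketch of what "ASCII-compatible" means here (the sample request 
line is an assumption chosen for illustration, and cp500 stands in for EBCDIC 
as one of the EBCDIC code pages Python ships): ASCII text encoded as UTF-8 is 
byte-for-byte identical to its ASCII encoding and contains no zero bytes, 
while UTF-16 and UTF-32 pad each character with zero bytes.

```python
# Hypothetical sample: a protocol line made only of ASCII characters.
text = "GET / HTTP/1.1"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
utf32_bytes = text.encode("utf-32-le")
ebcdic_bytes = text.encode("cp500")  # one EBCDIC code page

# UTF-8 leaves ASCII text unchanged; the others do not.
utf8_is_ascii_compatible = utf8_bytes == ascii_bytes
utf16_is_ascii_compatible = utf16_bytes == ascii_bytes
utf32_is_ascii_compatible = utf32_bytes == ascii_bytes
ebcdic_is_ascii_compatible = ebcdic_bytes == ascii_bytes

# Zero bytes appear only in the fixed-width multi-byte encodings.
utf8_has_zero = b"\x00" in utf8_bytes
utf16_has_zero = b"\x00" in utf16_bytes
utf32_has_zero = b"\x00" in utf32_bytes

print(utf8_bytes)
print(utf16_bytes)
```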

> --> "Non-Unicode code systems usually don't handle all of the characters to 
> be found in Unicode."

The term *encoding* is used pervasively when dealing with the transformation of 
Unicode to/from bytes, so I find it confusing to introduce another term here 
("code systems"). I prefer the original sentence.

----------
nosy: +akuchling

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue20906>
_______________________________________
