Re: [Python-Dev] bytes / unicode

Terry Reedy Sun, 20 Jun 2010 16:35:34 -0700

On 6/20/2010 5:55 PM, Benjamin Peterson wrote:

2010/6/20 Antoine Pitrou<solip...@pitrou.net>:

On Sun, 20 Jun 2010 14:40:56 -0400
"P.J. Eby"<p...@telecommunity.com>  wrote:


Actually, I would say that it's more that (in the network protocol
case) we *have* bytes, some of which we would like to *treat* as
text, yet do not wish to constantly convert back and forth to
full-blown unicode


Well, then why don't you just stick with a bytes object?


There are not many tools for treating bytes as text.


If one writes a function (most easily in Python)

1. in terms of the methods and operations shared by unicode and bytes,
which is nearly all of them, and

2. does not gratuitously (and dare I say, unpythonically) do a class
check to unnecessarily exclude one or the other, and

3. does not specialize by assuming only one of the possible values for
type-specific constants, such as number of chars/codes, and

4. does not do something unicode specific such as normalization,

then the function should be agnostic and operate generically.
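
As a rough illustration of the four points above, here is a minimal sketch of such a function. The name split_header and its sep parameter are made up for this example and are not taken from any existing library. It uses only methods that str and bytes share, does no class check, and takes the separator from the caller instead of hard-coding a str or bytes literal.

```python
def split_header(line, sep):
    """Split 'Name: value' into (name, value).

    Hypothetical helper: works for str or bytes, as long as `line`
    and `sep` are of the same type.
    """
    # partition(), strip() and lower() exist on both str and bytes.
    name, _sep, value = line.partition(sep)
    return name.strip().lower(), value.strip()


# The same code path serves both types.
text_result = split_header("Content-Type: text/html\r\n", ":")
bytes_result = split_header(b"Content-Type: text/html\r\n", b":")
```

Passing a str line with a bytes separator (or the reverse) raises TypeError, which seems the right behavior: the function stays agnostic without silently mixing the two types.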

I think there was some temptation to be 'pure' and limit text methods to
str and enforce the decode-manipulate-encode paradigm (which is
extremely common in various forms, and nothing unusual). But for
practicality and efficiency, that was not done.
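
For comparison, the decode-manipulate-encode paradigm mentioned above might look something like this. Again a sketch only: normalize_header_name is a hypothetical name, and the ASCII default is an assumption chosen for the example.

```python
def normalize_header_name(raw, encoding="ascii"):
    """Return a header name in Title-Case, bytes in and bytes out."""
    text = raw.decode(encoding)    # decode: bytes -> str
    text = text.strip().title()    # manipulate as text
    return text.encode(encoding)   # encode: str -> bytes


normalized = normalize_header_name(b" content-type ")
```

Since bytes also has strip() and title(), this particular round trip could be skipped, which is the practicality argument in a nutshell.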

Do you have in mind any tools that could and should operate on both, but
do not? (I realize that at the C level, code is not just specialized to
'unicode', but to 2-byte versus 4-byte representations.)


Terry Jan Reedy


