On Wed, 24 Nov 2010 18:51:49 +0900
"Stephen J. Turnbull" <step...@xemacs.org> wrote:
> James Y Knight writes:
> 
>  > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
>  > superior [...] because it is an ASCII superset, and thus more
>  > easily compatible with other software. That also makes it most
>  > commonly used for internet communication.
> 
> Sure, UTF-8 is very nice as a protocol for communicating text.  So
> what?  If your application involves shoveling octets real fast, don't
> convert; just shovel those octets.  If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing.  It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
> 
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding UTF-8 is so fast that it should be
dwarfed by any actual processing done on the text.
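
One rough way to check this on a given machine is to time a bare
decode against a decode followed by some trivial text processing. The
following is only a minimal sketch: the sample text, the repeat count,
and the "processing" step (upper-casing and splitting on whitespace)
are assumptions picked for illustration, and the numbers will vary
with the interpreter version and the input.

```python
import timeit

# Hypothetical sample: mixed ASCII and non-ASCII text, a bit over 1 MB encoded.
sample = "The quick brown fox, naïve café, 日本語のテキスト. " * 20000
encoded = sample.encode("utf-8")


def roundtrip(data=encoded):
    """Decode UTF-8 bytes to str and encode them straight back."""
    return data.decode("utf-8").encode("utf-8")


def process(data=encoded):
    """Decode, then do some deliberately simple text processing."""
    text = data.decode("utf-8")
    return len(text.upper().split())


# Time each step a few times; only the relative sizes are of interest.
runs = 5
codec_time = timeit.timeit(roundtrip, number=runs)
process_time = timeit.timeit(process, number=runs)

print(f"decode + encode:      {codec_time / runs:.4f} s per run")
print(f"decode + upper/split: {process_time / runs:.4f} s per run")
```

Even upper-casing and splitting is a very light workload; anything
heavier (regex matching, parsing, normalization) would be expected to
widen the gap further.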

Regards

Antoine.

