On Wed, 24 Nov 2010 18:51:49 +0900
"Stephen J. Turnbull" <step...@xemacs.org> wrote:

> James Y Knight writes:
>
> > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
> > superior [...] because it is an ASCII superset, and thus more
> > easily compatible with other software. That also makes it most
> > commonly used for internet communication.
>
> Sure, UTF-8 is very nice as a protocol for communicating text. So
> what? If your application involves shoveling octets real fast, don't
> convert and shovel those octets. If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing. It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
>
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding UTF-8 is so fast that it should be
dwarfed by any actual processing done on the text.

Regards

Antoine.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
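
For anyone who wants to check the cost claim on their own machine, here is a rough sketch in Python 3. It times UTF-8 decoding and encoding against a stand-in processing step. The sample text, the helper names (`make_sample`, `process`, `measure`) and the choice of "processing" (case-folding plus regex tokenizing) are all assumptions made for illustration, not anything from the thread, and the numbers will vary with the interpreter, the hardware and the workload.

```python
import re
import timeit


def make_sample(repeats: int = 2000) -> bytes:
    """Build a UTF-8 payload mixing ASCII and non-ASCII text (made-up data)."""
    text = "The quick brown fox, naïve café, 日本語のテキスト. " * repeats
    return text.encode("utf-8")


def process(text: str) -> int:
    """Stand-in for 'actual text processing': count case-folded word tokens."""
    return len(re.findall(r"\w+", text.casefold()))


def measure(payload: bytes, number: int = 20) -> dict:
    """Return total seconds spent in each step over `number` runs."""
    text = payload.decode("utf-8")
    return {
        "decode": timeit.timeit(lambda: payload.decode("utf-8"), number=number),
        "encode": timeit.timeit(lambda: text.encode("utf-8"), number=number),
        "process": timeit.timeit(lambda: process(text), number=number),
    }


if __name__ == "__main__":
    payload = make_sample()
    for name, seconds in measure(payload).items():
        print(f"{name:8s} {seconds:.4f} s")
```

Comparing the three printed timings shows how the conversion cost relates to the processing cost for this particular workload. A heavier or lighter `process` function will shift the ratio, so treat it as a way to measure rather than as a result.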