On Wed, 15 Jun 2016, Greg Ewing wrote:

> Simon Cross wrote:
>> If we only support one, I would prefer it to be bytes since (bytes ->
>> bytes -> unicode) seems like less overhead and slightly conceptually
>> clearer than (bytes -> unicode -> bytes),
>
> Whereas bytes -> unicode, followed if needed by unicode -> bytes,
> seems conceptually clearer to me. IOW, base64 is conceptually a
> bytes-to-text transformation, and the usual way to represent
> text in Python 3 is unicode.
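
To make the two orderings in the quoted exchange concrete, here is a minimal sketch. It relies on the existing base64.b64encode, which returns bytes. The helper b64encode_text is hypothetical: it only stands in for a text-returning encoder of the kind being discussed and is not a standard library function.

```python
import base64

raw = b"\x00\xffhello"

# Existing stdlib behaviour: the encoder maps bytes to bytes.
encoded_bytes = base64.b64encode(raw)

# bytes -> bytes -> unicode: decode only if text is actually wanted.
encoded_text = encoded_bytes.decode("ascii")


def b64encode_text(data: bytes) -> str:
    """Hypothetical text-returning encoder (an assumption, not a stdlib API)."""
    return base64.b64encode(data).decode("ascii")


# bytes -> unicode -> bytes: get text first, re-encode it for the wire.
wire_bytes = b64encode_text(raw).encode("ascii")

# Both routes end up with the same ASCII payload.
assert wire_bytes == encoded_bytes
```

Either route yields identical bytes. The disagreement is only over which type the encoder should return by default, and so over who pays for the extra conversion.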
In CPython, do I understand correctly that the output text would be
stored using one byte per character, since base64 output is pure ASCII?
If so, is there a way of encoding it into UTF-8 that reuses the raw
memory backing the Unicode object, and therefore avoids almost all of
the inefficiency of going via Unicode? If so, this would be a win:
proper use of Unicode to represent a text string, combined with
instantaneous conversion into a bytes object for the purpose of writing
to the OS.
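
The one-byte-per-character part of the question can be checked empirically. The sketch below assumes CPython 3.3 or later, where PEP 393 gives ASCII-only strings a compact representation. The helper name ascii_bytes_per_char is made up for illustration. Note that str.encode at the Python level always builds a new bytes object, so this sketch does not show any memory reuse. Whether a C-level call such as PyUnicode_AsUTF8AndSize avoids the copy for ASCII strings is something I believe to be the case but have not verified here.

```python
import base64
import sys


def ascii_bytes_per_char(n_small: int = 16, n_large: int = 1016) -> float:
    """Estimate per-character storage of ASCII str objects.

    Compares the sizes of two ASCII strings so that the fixed object
    header cancels out. CPython-specific: relies on sys.getsizeof.
    """
    small = "A" * n_small
    large = "A" * n_large
    return (sys.getsizeof(large) - sys.getsizeof(small)) / (n_large - n_small)


# base64 output is pure ASCII, so its text form qualifies for the
# compact representation.
text = base64.b64encode(b"some binary payload").decode("ascii")

# Python-level conversion back to bytes; this copies the data.
as_bytes = text.encode("utf-8")

print(ascii_bytes_per_char())
print(len(as_bytes) == len(text))
```

If the first value printed is 1.0, the text form costs no more per character than the bytes form, and the remaining overhead of going via Unicode is the fixed object header plus the copy made by encode.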
Isaac Morland CSCF Web Guru
DC 2619, x36650 WWW Software Specialist