Terry J. Reedy added the comment:

>it would probably be reasonable to make these protocols use str objects at the
>heart, and only convert to bytes after the formatting is done.
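To make the quoted suggestion concrete, here is a minimal sketch of "str at the heart, bytes at the edge". The helper name `build_status_line`, the `PY3` flag, and the choice of ASCII are assumptions made for illustration; none of them comes from a stdlib protocol module.

```python
import sys

# Assumed flag for illustration; real code may detect the version differently.
PY3 = sys.version_info[0] >= 3


def build_status_line(code, reason, version="HTTP/1.1"):
    """Hypothetical helper: format with str, convert to bytes once at the end."""
    # All formatting happens on str objects.
    out = "{} {} {}\r\n".format(version, code, reason)
    # Only the finished line is encoded; on Python 2, str is already bytes.
    if PY3:
        out = out.encode("ascii")
    return out


print(build_status_line(200, "OK"))
```

The single `encode()` call at the end is the extra copy whose cost is discussed in this thread.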
I presume this would mean adding 'if py3: out = out.encode()' after the formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some actual numbers:

    from timeit import timeit

    # Time a.encode() for strings of increasing length.
    for size in (0, 100, 1000, 10000, 100000):
        a = 'a' * size
        print(timeit("a.encode()", "from __main__ import a"))
    >>>
    0.19305401378265558
    0.22193721412302575
    0.2783227054755883
    0.677596406192696
    7.124387897799184

Since timeit defaults to number=1000000 executions, these totals in seconds are also microseconds per encoding. Of note: the copying of bytes does not double the total time until there are a few thousand chars. Would protocols be using .format for much more than this?

[If speed is really an issue, we could make binary file/socket write methods unicode implementation aware. They could directly access the ascii (or latin-1) bytes in a unicode object, just as they do with a bytes object, and the extra copy could be skipped.]

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3982>
_______________________________________