Terry J. Reedy added the comment:

>it would probably be reasonable to make these protocols use str objects at the 
>heart, and only convert to bytes after the formatting is done.

I presume this would mean adding 'if py3: out = out.encode()' after the 
formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some 
actual numbers:

from timeit import timeit

for size in (0, 100, 1000, 10000, 100000):
    a = 'a' * size  # ASCII-only string of the given length
    print(timeit("a.encode()", "from __main__ import a"))
>>> 
0.19305401378265558
0.22193721412302575
0.2783227054755883
0.677596406192696
7.124387897799184

Since timeit defaults to number=1000000, each total in seconds is also the 
time in microseconds per encoding. Of note: the copying of bytes does not 
double the total time until there are a few thousand chars. Would protocols 
be using .format for much more than this?

[If speed is really an issue, we could make binary file/socket write methods 
unicode implementation aware. They could directly access the ascii (or latin-1) 
bytes in a unicode object, just as they do with a bytes object, and the extra 
copy could be skipped.]
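
For concreteness, the str-first approach might look something like the sketch 
below. build_command is a hypothetical helper, not anything in the stdlib, and 
the ASCII encoding and CRLF line ending are assumptions chosen only for 
illustration.

```python
def build_command(verb, *args):
    """Hypothetical helper: build a protocol line as str, encode once at the end."""
    # All formatting happens on str objects.
    line = "{}\r\n".format(" ".join((verb,) + args))
    # Single conversion to bytes after formatting is done (ASCII is an assumption).
    return line.encode("ascii")


request = build_command("RETR", "42")
print(request)
```

The only bytes-specific step is the final encode, which is the per-message 
cost the timings above measure.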

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3982>
_______________________________________