2009/1/28 Antoine Pitrou <solip...@pitrou.net>:
> If you look at how utf-8 decoding is implemented (in unicodeobject.c), it's
> quite obvious why it is so :-) There is a (very) fast path for chunks of pure
> ASCII data, and (fast but not blazingly fast) fallback for non ASCII data.

Thanks for the explanation.

> Please don't think of it as a slowdown... It's still much faster than 2.x,
> which manages 130MB/s on the same data.

Don't get me wrong - I'm hugely grateful for this work. And personally, I
don't expect that I/O speed is ever likely to be a real bottleneck in the
type of program I write. But I'm concerned that (much as with the whole
"Python 3.0 is incompatible, and it will be hard to port to" meme) people
will pick up on raw benchmark figures - no matter how much they aren't
comparing like with like - and start making it sound like "Python 3.0 I/O
is slower than 2.x" - which is a great disservice to the good work that's
been done.

I do think it's worth taking care over the default encoding, though. Quite
apart from performance, getting "correct" behaviour is important. I can't
speak for Unix, but on Windows, the following behaviour feels like a bug to
me:

>echo a£b >a1

>python
Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print open("a1").read()
a£b

>>> ^Z

>\Apps\Python30\python.exe
Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print(open("a1").read())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Apps\Python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "D:\Apps\Python30\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0153' in position 1: character maps to <undefined>
>>> ^Z

>chcp
Active code page: 850

Paul.
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
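
A possible reading of the transcript above, sketched in Python: the console writes the file in the OEM code page (850), but Python 3.0 opens text files with the locale's preferred encoding. The '\u0153' in the traceback is consistent with that encoding being cp1252, which is an assumption here, since the session never prints it. The variable names are illustrative only.

```python
# Byte sequence that "echo a£b >a1" would produce under code page 850
# (line ending omitted for brevity).
raw = "a\u00a3b".encode("cp850")        # '£' becomes the single byte 0x9C

# Python 2.x: read() returns these bytes untouched and print sends them
# straight back to the cp850 console, so the round trip looks correct.
round_trip = raw.decode("cp850")

# Python 3.0: open() decodes the file with the locale's preferred encoding.
# ASSUMPTION: that encoding is cp1252, where byte 0x9C means U+0153.
decoded = raw.decode("cp1252")

# print() then has to encode the text for the cp850 console, which has no
# mapping for U+0153; this is the step that raises in the traceback.
try:
    decoded.encode("cp850")
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(repr(raw), repr(decoded), error)
```

If that reading is right, passing the encoding explicitly, as in `open("a1", encoding="cp850")`, should sidestep the mismatch, because the file is then decoded with the same code page the console used to write it.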