[issue4561] Optimize new io library

2009-01-18 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Marking this as a duplicate of #4565, "Rewrite the IO stack in C". -- resolution: -> duplicate; status: open -> closed; superseder: -> Rewrite the IO stack in C

[issue4561] Optimize new io library

2008-12-20 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: We can't solve this for 3.0.1, downgrading to critical. -- priority: release blocker -> critical ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4561

[issue4561] Optimize new io library

2008-12-19 Thread Martin v. Löwis
Changes by Martin v. Löwis mar...@v.loewis.de: -- priority: deferred blocker -> release blocker

[issue4561] Optimize new io library

2008-12-16 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: The previous implementation only returns bytes and does not translate newlines. For this particular case, indeed, the plain old FILE* based object is faster. -- nosy: +amaury.forgeotdarc

[issue4561] Optimize new io library

2008-12-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "I know that as hard as it might be for everyone to believe, there are a lot of people who crank lots of non-Unicode data with Python." But cranking data implies you'll do something useful with it, and therefore spend CPU time doing those

[issue4561] Optimize new io library

2008-12-16 Thread David M. Beazley
David M. Beazley beaz...@users.sourceforge.net added the comment: I wish I shared your optimism about this, but I don't. Here's a short explanation why. The problem of I/O and the associated interface between hardware, the operating system kernel, and user applications is one of the most

[issue4561] Optimize new io library

2008-12-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: I seem to recall one of the design principles of the new IO stack was to avoid relying on the C stdlib's buffered API, which has too many platform-dependent behaviours. In any case, binary reading has acceptable performance in py3k (although

[issue4561] Optimize new io library

2008-12-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "I don't agree that that was a worthy design goal." I don't necessarily agree either, but it's probably too late now. The py3k buffered IO object has additional methods (e.g. peek(), read1()) which can be used by upper layers (text IO) and so
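
For readers unfamiliar with the two methods named above, here is a minimal sketch of what peek() and read1() do on a buffered reader. It uses the `io` module as found in a current Python 3; the in-memory `io.BytesIO` source and the tiny `buffer_size=8` are assumptions chosen only to make the behaviour visible, not anything taken from the thread.

```python
import io

# An in-memory raw stream stands in for a real file (assumption for the demo).
raw = io.BytesIO(b"hello world\nsecond line\n")
buffered = io.BufferedReader(raw, buffer_size=8)

# peek() returns bytes that are already buffered (filling the buffer with at
# most one raw read if needed) WITHOUT advancing the stream position.
peeked = buffered.peek(1)
position_after_peek = buffered.tell()

# read1() returns up to the requested number of bytes using at most one raw
# read, so an upper layer can consume what is available without blocking on
# a full buffer refill.
first = buffered.read1(5)
position_after_read1 = buffered.tell()

print("peeked starts with:", peeked[:1])
print("position after peek:", position_after_peek)
print("read1(5):", first)
print("position after read1:", position_after_read1)
```

The number of bytes peek() returns may be smaller or larger than requested, which is why the sketch only inspects the first byte.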

[issue4561] Optimize new io library

2008-12-16 Thread David M. Beazley
David M. Beazley beaz...@users.sourceforge.net added the comment: Good luck with that. Most people who get bright ideas such as "gee, maybe I'll write my own version of X", where X is some part of the standard C library pertaining to I/O, end up fighting a losing battle. Of course, I'd love

[issue4561] Optimize new io library

2008-12-16 Thread David M. Beazley
David M. Beazley beaz...@users.sourceforge.net added the comment: I agree with Raymond. For binary reads, I'll go farther and say that even a 10% slowdown in performance would be surprising if not unacceptable to some people. I know that as hard as it might be for everyone to believe,

[issue4561] Optimize new io library

2008-12-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: [...] Although I agree all this is important, I'd challenge the assumption it has its place in the buffered IO library rather than in lower-level layers (i.e. kernel userspace unbuffered IO). In any case, it will be difficult to undo the

[issue4561] Optimize new io library

2008-12-16 Thread Christian Heimes
Christian Heimes li...@cheimes.de added the comment: David: Amaury's work is going to be a part of the standard library as soon as his work is done. I'm confident that we can reach the old speed of the 2.x file type by carefully moving code to C modules.

[issue4561] Optimize new io library

2008-12-15 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: I've written a small file IO benchmark, available here: http://svn.python.org/view/sandbox/trunk/iobench/ It runs under both 2.6 and 3.x, so that we can compare speeds of respective implementations.

[issue4561] Optimize new io library

2008-12-15 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Without Christian's patch:
[400KB.txt] read one byte/char at a time... 0.2685 MB/s (100% CPU)
[400KB.txt] read 20 bytes/chars at a time... 4.536 MB/s (98% CPU)
[400KB.txt] read one line at a time... 3.805 MB/s
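
The figures above come from the iobench script, whose source is not reproduced in this thread. As a rough illustration of the kind of measurement involved (read a file N bytes or characters at a time and report MB/s), here is a minimal sketch. The helper names `read_in_chunks`, `throughput_mb_per_s` and `make_sample_file` are hypothetical and are not taken from iobench; it does not measure CPU percentage, and it assumes a current Python 3 with `time.perf_counter`.

```python
import os
import tempfile
import time


def read_in_chunks(path, chunk_size, mode="rb"):
    """Read the whole file chunk_size units at a time; return units read."""
    total = 0
    with open(path, mode) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def throughput_mb_per_s(path, chunk_size, mode="rb"):
    """Return an approximate MB/s figure based on the file size on disk."""
    size_mb = os.path.getsize(path) / (1024.0 * 1024.0)
    start = time.perf_counter()
    read_in_chunks(path, chunk_size, mode)
    elapsed = time.perf_counter() - start
    return size_mb / elapsed if elapsed > 0 else float("inf")


def make_sample_file(size_bytes=400 * 1024):
    """Create a throwaway ASCII file of about size_bytes (made-up data)."""
    line = b"0123456789 abcdefghijklmnopqrstuvwxyz\n"
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(line * (size_bytes // len(line)))
    return path


if __name__ == "__main__":
    path = make_sample_file()
    try:
        for mode in ("rb", "r"):
            for chunk_size in (1, 20, 4096):
                rate = throughput_mb_per_s(path, chunk_size, mode)
                print("mode=%-2s chunk=%-4d %.3f MB/s" % (mode, chunk_size, rate))
    finally:
        os.remove(path)
```

A single run like this is noisy; repeating each measurement and keeping the best time would give steadier numbers.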

[issue4561] Optimize new io library

2008-12-15 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: I'm getting caught-up with the IO changes in 3.0 and am a bit confused. The PEP says, programmers who don't want to muck about in the new I/O world can expect that the open() factory method will produce an object

[issue4561] Optimize new io library

2008-12-13 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Christian, by benchmarks I meant a measurement of text reading with and without the patch.

[issue4561] Optimize new io library

2008-12-10 Thread Martin v. Löwis
Changes by Martin v. Löwis [EMAIL PROTECTED]: -- priority: release blocker -> deferred blocker

[issue4561] Optimize new io library

2008-12-10 Thread Ismail Donmez
Changes by Ismail Donmez [EMAIL PROTECTED]: -- nosy: +cartman

[issue4561] Optimize new io library

2008-12-07 Thread Winfried Plappert
Changes by Winfried Plappert [EMAIL PROTECTED]: -- nosy: +wplappert

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
New submission from Christian Heimes [EMAIL PROTECTED]: The new io library needs some serious profiling and optimization work. I've already fixed a severe slowdown in _fileio.FileIO's read buffer allocation algorithm (#4533). More profiling tests have shown a speed problem in write() files

[issue4561] Optimize new io library

2008-12-06 Thread David M. Beazley
David M. Beazley [EMAIL PROTECTED] added the comment: I've done some profiling and the performance of reading line-by-line is considerably worse in Python 3 than in Python 2. For example, this code:
    for line in open("somefile.txt"): pass
Ran 35 times slower in Python 3.0 than Python 2.6

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Christian Heimes [EMAIL PROTECTED] added the comment: Your issue is most likely caused by #4533. Please download the latest svn version of Python 3.0 (branches/release30_maint) and try again.

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Christian Heimes [EMAIL PROTECTED] added the comment: Here is a patch against the py3k branch that reduces the time for the line ending detection from 0.55s to 0.22s for a 50MB file on my test system. -- keywords: +patch Added file:
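
The patch itself is not shown in this thread, so the following is only a sketch of what "line ending detection" can mean for a universal-newlines text layer: counting how many "\r", "\n" and "\r\n" endings occur in a decoded chunk. Both function names are hypothetical. The first version walks the string character by character; the second leans on `str.count`, which runs in C, to illustrate why moving such a loop out of Python-level code can pay off.

```python
def count_lineendings_slow(text):
    """Return (cr, lf, crlf) counts by scanning one character at a time."""
    cr = lf = crlf = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\r":
            if i + 1 < len(text) and text[i + 1] == "\n":
                crlf += 1
                i += 1  # skip the "\n" that belongs to this "\r\n"
            else:
                cr += 1
        elif ch == "\n":
            lf += 1
        i += 1
    return cr, lf, crlf


def count_lineendings(text):
    """Return (cr, lf, crlf) counts using str.count.

    Every "\r\n" contributes one "\r" and one "\n" to the raw counts, so
    subtracting the "\r\n" count leaves the lone "\r" and lone "\n" totals.
    """
    crlf = text.count("\r\n")
    cr = text.count("\r") - crlf
    lf = text.count("\n") - crlf
    return cr, lf, crlf


if __name__ == "__main__":
    sample = "a\r\nb\nc\rd\r\n"
    print(count_lineendings_slow(sample))
    print(count_lineendings(sample))
```

Neither version handles a "\r" at the very end of one chunk whose "\n" arrives in the next chunk; a real decoder has to carry that state across reads.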

[issue4561] Optimize new io library

2008-12-06 Thread David M. Beazley
David M. Beazley [EMAIL PROTECTED] added the comment: Tried this using projects/python/branches/release30-maint and using the patch that was just attached. With a 66MB input file, here are the results of this code fragment:
    for line in open(BIGFILE): pass
Python 2.6: 0.67s
Python 3.0:

[issue4561] Optimize new io library

2008-12-06 Thread David M. Beazley
David M. Beazley [EMAIL PROTECTED] added the comment: Just as one other followup, if you change the code in the last example to use binary mode like this:
    for line in open(BIG, "rb"): pass
You get the following results:
Python 2.6: 0.64s
Python 3.0: 42.26s (66 times slower)
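
The timings quoted in these messages can be reproduced in spirit with a small harness around the same loop. This is a minimal sketch, not the script used in the thread: `time_line_iteration` and `make_sample_file` are hypothetical helper names, the sample file is far smaller than the 66MB input mentioned above, and it assumes a current Python 3 (`time.perf_counter` did not exist in 3.0).

```python
import os
import tempfile
import time


def time_line_iteration(path, mode="r"):
    """Iterate over the file line by line; return (line_count, seconds)."""
    start = time.perf_counter()
    count = 0
    with open(path, mode) as f:
        for _ in f:
            count += 1
    return count, time.perf_counter() - start


def make_sample_file(num_lines=100000):
    """Write a throwaway text file and return its path (made-up data)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %d of sample data\n" % i)
    return path


if __name__ == "__main__":
    path = make_sample_file()
    try:
        # "r" goes through the text layer (decoding, newline handling);
        # "rb" iterates over the buffered binary layer only.
        for mode in ("r", "rb"):
            count, elapsed = time_line_iteration(path, mode)
            print("mode=%-2s %d lines in %.4fs" % (mode, count, elapsed))
    finally:
        os.remove(path)
```

Running the same file under two interpreters and comparing the printed times gives the kind of ratio reported in the thread.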

[issue4561] Optimize new io library

2008-12-06 Thread Georg Brandl
Georg Brandl [EMAIL PROTECTED] added the comment: David, the reading bug fix/optimization is not (yet?) on release30-maint, only on branches/py3k. -- nosy: +georg.brandl

[issue4561] Optimize new io library

2008-12-06 Thread David M. Beazley
David M. Beazley [EMAIL PROTECTED] added the comment: Just checked it with branches/py3k and the performance is the same.

[issue4561] Optimize new io library

2008-12-06 Thread David M. Beazley
David M. Beazley [EMAIL PROTECTED] added the comment:
bash-3.2$ uname -a
Darwin david-beazleys-macbook.local 9.5.1 Darwin Kernel Version 9.5.1: Fri Sep 19 16:19:24 PDT 2008; root:xnu-1228.8.30~1/RELEASE_I386 i386
bash-3.2$ ./python.exe -c "import sys; print(sys.version)"
3.1a0 (py3k:67609, Dec 6

[issue4561] Optimize new io library

2008-12-06 Thread Giampaolo Rodola'
Changes by Giampaolo Rodola' [EMAIL PROTECTED]: -- nosy: +giampaolo.rodola

[issue4561] Optimize new io library

2008-12-06 Thread Antoine Pitrou
Changes by Antoine Pitrou [EMAIL PROTECTED]: -- nosy: +pitrou

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Changes by Christian Heimes [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file12248/count_linenendings.patch

[issue4561] Optimize new io library

2008-12-06 Thread Antoine Pitrou
Antoine Pitrou [EMAIL PROTECTED] added the comment: I don't think this is a public API, so the function should probably be renamed _count_lineendings. Also, are there some benchmark numbers?

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Christian Heimes [EMAIL PROTECTED] added the comment: I'll come up with some reading benchmarks tomorrow. For now here is a benchmark of write(). You can clearly see the excessive usage of closed, len() and isinstance(). Added file: http://bugs.python.org/file12256/test_write.log
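
The attached log is not included in this thread. A profile of that general kind can be produced with the standard `cProfile` and `pstats` modules, as in the sketch below. To make the Python-level calls visible on a current interpreter, where `io` is implemented in C, it goes through `_pyio`, the pure-Python implementation shipped with CPython; that choice, the helper names `many_small_writes` and `profile_writes`, and the workload size are all assumptions for illustration.

```python
import _pyio  # pure-Python io implementation shipped with CPython
import cProfile
import io
import os
import pstats
import tempfile


def many_small_writes(path, count=2000):
    """Write `count` short lines through the pure-Python text layer."""
    with _pyio.open(path, "w", encoding="utf-8") as f:
        for i in range(count):
            f.write("line %d\n" % i)


def profile_writes(count=2000, top=15):
    """Profile many_small_writes and return the report as a string."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    profiler = cProfile.Profile()
    try:
        profiler.runcall(many_small_writes, path, count)
    finally:
        os.remove(path)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    # Sorting by call count surfaces small helpers that are invoked on
    # every write, which is where per-call overhead accumulates.
    stats.sort_stats("calls").print_stats(top)
    return report.getvalue()


if __name__ == "__main__":
    print(profile_writes())
```

The exact functions that dominate the report depend on the Python version, so the output should be read rather than assumed.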

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Changes by Christian Heimes [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file12256/test_write.log

[issue4561] Optimize new io library

2008-12-06 Thread Christian Heimes
Christian Heimes [EMAIL PROTECTED] added the comment: Roundup doesn't display .log files as plain text files. Added file: http://bugs.python.org/file12257/test_write.txt

[issue4561] Optimize new io library

2008-12-06 Thread Barry A. Warsaw
Changes by Barry A. Warsaw [EMAIL PROTECTED]: -- priority: high -> release blocker