[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: He accepted it already: A small last-minute optimization is not a release-blocker. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18003 ___

[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Larry, do you accept the patch for 3.5?

[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The patch is not so harmless. First, my change in BZ2File is not correct, because reading every line should be guarded with a lock (BZ2File is thread-safe). Second, for now all three compressed file objects are not only iterables, but iterators. iter(f)

[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Larry Hastings
Larry Hastings added the comment: Sounds good to me.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Martin Panter
Martin Panter added the comment: This patch adds an entry to the What’s New for 3.5 (though maybe it will have to be 3.6), and adds three tests to check that next() raises ValueError when the files have been closed. -- Added file: http://bugs.python.org/file39662/decomp-optim.v4.patch
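A check of the kind described in this comment could look roughly like the sketch below. It is not taken from decomp-optim.v4.patch; the helper name next_after_close_raises and the in-memory sample data are assumptions made for illustration only.

```python
import bz2
import gzip
import io
import lzma


def next_after_close_raises(opener, compress):
    """Return True if next() on a closed compressed file raises ValueError.

    Hypothetical helper: `opener` wraps a binary file object in a reader,
    `compress` produces the matching compressed payload.
    """
    payload = compress(b"first\nsecond\n")
    f = opener(io.BytesIO(payload))
    # Reading works while the file is open.
    assert next(f) == b"first\n"
    f.close()
    try:
        next(f)
    except ValueError:
        return True
    return False


# One (opener, compressor) pair per module discussed in the thread.
CASES = [
    (lambda raw: gzip.GzipFile(fileobj=raw, mode="rb"), gzip.compress),
    (lambda raw: bz2.BZ2File(raw, "rb"), bz2.compress),
    (lambda raw: lzma.LZMAFile(raw, "rb"), lzma.compress),
]
```

Calling all(next_after_close_raises(o, c) for o, c in CASES) exercises the three modules in one go.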

[issue18003] lzma module very slow with line-oriented reading.

2015-06-09 Thread Martin Panter
Martin Panter added the comment: The BufferedReader class is documented as being thread safe: https://docs.python.org/dev/library/io.html#multi-threading. Some experimentation suggests that checking the “raw.closed” property is not actually serialized, but that raw.readinto() calls are

[issue18003] lzma module very slow with line-oriented reading.

2015-06-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: This looks good to me. -- stage: patch review -> commit review

[issue18003] lzma module very slow with line-oriented reading.

2015-06-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Perhaps this change is worth mentioning in What's New. Could you add this, Martin? It would also be nice to add tests to ensure that next() after closing the file always raises ValueError.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-06 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- priority: normal -> release blocker

[issue18003] lzma module very slow with line-oriented reading.

2015-06-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: bz2 will gain great benefit from such optimization too. Microbenchmark results: $ ./python -m timeit -s "import gzip" -- "f=gzip.GzipFile('words.gz', 'r')" "for line in f: pass" 2.7: 10 loops, best of 3: 374 msec per loop 3.2: 10
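For readers without a words.gz file at hand, a comparable measurement can be sketched in pure Python. The function names and the generated sample data below are assumptions for illustration; the absolute numbers will not match the figures quoted in this comment.

```python
import gzip
import io
import timeit


def make_sample(n_lines=2000):
    """Build a gzip payload of short lines (a stand-in for words.gz)."""
    text = b"".join(b"word%d\n" % i for i in range(n_lines))
    return gzip.compress(text)


def count_lines(payload):
    """Iterate over the compressed data line by line and count the lines."""
    with gzip.GzipFile(fileobj=io.BytesIO(payload), mode="rb") as f:
        return sum(1 for _ in f)


def time_line_iteration(payload, repeat=3, number=5):
    """Return the best time per pass in seconds, in the spirit of `python -m timeit`."""
    timer = timeit.Timer(lambda: count_lines(payload))
    return min(timer.repeat(repeat=repeat, number=number)) / number
```

Swapping gzip for bz2 or lzma (with BZ2File or LZMAFile as the reader) gives the equivalent measurement for the other two modules.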

[issue18003] lzma module very slow with line-oriented reading.

2015-06-03 Thread Martin Panter
Martin Panter added the comment: Looking at https://bugs.python.org/file39586/decomp-optim.patch, the “closed” property is the first of the three hunks: 1. Adds @property / def closed(self) to Lib/_compression.py 2. Adds def __iter__(self) to Lib/gzip.py 3. Adds def __iter__(self) to

[issue18003] lzma module very slow with line-oriented reading.

2015-06-03 Thread Martin Panter
Martin Panter added the comment: New patch just fixes the spelling error in the comment. -- stage: needs patch -> patch review Added file: http://bugs.python.org/file39604/decomp-optim.v2.patch

[issue18003] lzma module very slow with line-oriented reading.

2015-06-03 Thread Larry Hastings
Larry Hastings added the comment: I don't see anything about closed in the patch you posted.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-03 Thread Antoine Pitrou
Antoine Pitrou added the comment: Yes, this is right.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-03 Thread Martin Panter
Martin Panter added the comment: Yes that’s basically right Larry. The __iter__() was previously inherited; now I am overriding it with a custom version. Similarly for the “closed” property, but that one is only a member of objects internal to the gzip, lzma and bz2 modules. --
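The idea of overriding an inherited __iter__() can be illustrated with a small sketch. The LineStream class below is hypothetical and is not the code from the patch; it only assumes, as the thread suggests, that the file object keeps an internal io.BufferedReader and that iterating over that reader directly avoids a Python-level readline() call per line.

```python
import io


class LineStream(io.BufferedIOBase):
    """Hypothetical reader that keeps an internal BufferedReader."""

    def __init__(self, raw):
        # Assumption: `raw` is a readable binary file object.
        self._buffer = io.BufferedReader(raw)

    def readline(self, size=-1):
        # The inherited __next__() would call this method once per line.
        return self._buffer.readline(size)

    def __iter__(self):
        # Overridden: hand iteration to the internal buffer instead of
        # going through readline() for every line.
        return iter(self._buffer)

    def close(self):
        self._buffer.close()
        super().close()
```

With this override, iter(stream) returns the internal buffer rather than the stream itself, so the stream remains an iterable but is no longer its own iterator.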

[issue18003] lzma module very slow with line-oriented reading.

2015-06-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: We were saying that you would probably have to validate this change, but that we might also be able to sneak it quietly into the code base, since you don't read these messages.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-02 Thread Larry Hastings
Larry Hastings added the comment: What? I only understand French.

[issue18003] lzma module very slow with line-oriented reading.

2015-06-02 Thread Larry Hastings
Larry Hastings added the comment: If I understand this correctly, I can ignore everything up to May 2015, as it has to do with line-reading a compressed binary file (!) being slow. Then, Martin Panter proposes a new optimization in May 2015, which is to simply add __iter__ methods to

[issue18003] lzma module very slow with line-oriented reading.

2015-06-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: This looks good to me. Larry would probably have to validate it for 3.5, although we may try to sneak it in (he isn't reading :-D). -- nosy: +larry

[issue18003] lzma module very slow with line-oriented reading.

2015-06-01 Thread Martin Panter
Martin Panter added the comment: This bug was originally raised against Python 3.3, and the speed has improved a lot since then. Perhaps this bug can be closed as it is, or maybe people would like to consider my decomp-optim.patch which squeezes a bit more speed out. I don’t actually have a

[issue18003] lzma module very slow with line-oriented reading.

2015-01-11 Thread Martin Panter
Martin Panter added the comment: I haven’t done any tests, but my LZMAFile patch to Issue 15955 uses BufferedReader, so it might satisfy this issue. -- nosy: +vadmium

[issue18003] lzma module very slow with line-oriented reading.

2013-09-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: See issue19051. Even a preliminary Python implementation noticeably speeds up the reading of short lines. $ ./python -m timeit -s "import lzma, io" "f=lzma.LZMAFile('words.xz', 'r')" "for line in f: pass" Unpatched: 1.44 sec per loop Patched: 1.06 sec per loop

[issue18003] lzma module very slow with line-oriented reading.

2013-09-20 Thread Antoine Pitrou
Antoine Pitrou added the comment: With a C implementation it should be as fast as with BufferedReader. So why not simply use BufferedReader?

[issue18003] lzma module very slow with line-oriented reading.

2013-09-20 Thread Antoine Pitrou
Antoine Pitrou added the comment: "So why not simply use BufferedReader?" "Because we want good performance from LZMAFile and compatibility with older versions." You're reading me wrong. I'm simply suggesting that users interested in readline() performance wrap LZMAFile in a BufferedReader. The
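The wrapping suggested in this comment can be sketched as follows. The function name iter_xz_lines is an assumption for illustration; io.BufferedReader and lzma.LZMAFile are the standard library classes named in the thread.

```python
import io
import lzma


def iter_xz_lines(fileobj):
    """Yield lines from an .xz stream, reading through an extra BufferedReader.

    Assumption: `fileobj` is a path or a binary file object that
    lzma.LZMAFile accepts.
    """
    # Closing the BufferedReader also closes the wrapped LZMAFile.
    with io.BufferedReader(lzma.LZMAFile(fileobj, "r")) as buffered:
        for line in buffered:
            yield line
```

For example, `for line in iter_xz_lines("words.xz"): ...` iterates over the file from the benchmark in this thread, assuming such a file exists.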

[issue18003] lzma module very slow with line-oriented reading.

2013-09-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: "So why not simply use BufferedReader?" Because we want good performance from LZMAFile and compatibility with older versions. And I guess that it will be even faster than wrapping in BufferedReader (by avoiding double buffering). --

[issue18003] lzma module very slow with line-oriented reading.

2013-05-24 Thread Éric Araujo
Éric Araujo added the comment: A higher-level interface to abstract differences between gzip, xz and others is actually provided in the tarfile module. (zipfile is left out and its file objects have different methods, but that’s another issue. shutil provides even higher-level functions to