[issue10791] Wrapping TextIOWrapper around gzip files
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10791 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10791] Wrapping TextIOWrapper around gzip files
Nadeem Vawda nadeem.va...@gmail.com added the comment:

Here's an implementation of read1() that satisfies that condition, along with some relevant unit tests.

--
keywords: +patch
Added file: http://bugs.python.org/file21531/gzipfile_read1.diff
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

> Here's an implementation of read1() that satisfies that condition, along with some relevant unit tests.

Something looks fishy: what happens if size is -1 and EOFError is not raised?
[issue10791] Wrapping TextIOWrapper around gzip files
Nadeem Vawda nadeem.va...@gmail.com added the comment:

> Something looks fishy: what happens if size is -1 and EOFError is not raised?

You're right - I missed that possibility. In that case, extrasize and offset get updated incorrectly, which will break subsequent calls to seek() and tell(). However, it seems that subsequent reads work fine, because slicing a bytes object with a too-large upper bound doesn't raise an exception.

The attached patch fixes this bug, and updates test_read1() to catch regressions.

--
Added file: http://bugs.python.org/file21532/gzipfile_read1.diff
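The slicing behaviour described in this comment is easy to check in isolation. The following is only a minimal sketch: `buf` and `offset` are illustrative names, not GzipFile's actual attributes, and the patch itself is not reproduced here.

```python
# Illustrative only: `buf` and `offset` are hypothetical names,
# not the real GzipFile internals.
buf = b"abcdef"
offset = 4

# Asking for more bytes than remain does not raise; the slice is truncated.
chunk = buf[offset:offset + 100]

# The risk lies in the bookkeeping: advancing by the requested size
# rather than by len(chunk) leaves the position past the end of the data.
wrong_offset = offset + 100
right_offset = offset + len(chunk)
```

This is why reads can appear to work while position tracking, and therefore seek() and tell(), silently drifts.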
[issue10791] Wrapping TextIOWrapper around gzip files
Roundup Robot devnull@devnull added the comment:

New changeset 9775d67c9af9 by Antoine Pitrou in branch 'default':
Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile
http://hg.python.org/cpython/rev/9775d67c9af9

--
nosy: +python-dev
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

Patch now committed, thank you! Since the patch adds a new API (GzipFile.read1()), I think it's better not to backport it.

--
resolution: -> fixed
stage: needs patch -> committed/rejected
status: open -> closed
type: behavior -> feature request
versions: -Python 2.7, Python 3.2
[issue10791] Wrapping TextIOWrapper around gzip files
Nadeem Vawda nadeem.va...@gmail.com added the comment:

> Is following change in GzipFile class enough:
>
>     def read1(self, n):
>         return self.read(n)
>
> ? This satisfies TextIOWrapper to run readline correctly.

Looks good to me.

By the way, BZ2File now works correctly - the fix for issue5863 adds read1().
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

> Nadeem Vawda nadeem.va...@gmail.com added the comment:
>
> > Is following change in GzipFile class enough:
> >
> >     def read1(self, n):
> >         return self.read(n)
> >
> > ? This satisfies TextIOWrapper to run readline correctly.
>
> Looks good to me.

Well, ideally, read1() should satisfy the condition stated in the BufferedIOBase documentation - namely, that it issues at most one read() call on the underlying stream.
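To illustrate the "at most one read() call" condition, here is a small sketch that uses io.BufferedReader rather than GzipFile. `CountingRaw` is a made-up raw stream that returns only a few bytes per call and counts how often it is asked for data; it is an assumption for demonstration purposes, not code from the patch.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that hands out at most `chunk` bytes per call."""

    def __init__(self, data, chunk=3):
        self._data = data
        self._pos = 0
        self._chunk = chunk
        self.calls = 0  # number of raw reads issued so far

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        n = min(len(b), self._chunk, len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


# read1() may return fewer bytes than requested, but touches the raw stream once.
raw = CountingRaw(b"0123456789ab")
part = io.BufferedReader(raw).read1(10)
read1_calls = raw.calls

# read() keeps calling the raw stream until the request is satisfied (or EOF).
raw = CountingRaw(b"0123456789ab")
full = io.BufferedReader(raw).read(10)
read_calls = raw.calls
```

A read1() that simply delegates to read() returns correct data but does not give this guarantee, which is the point being made above.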
[issue10791] Wrapping TextIOWrapper around gzip files
Changes by Nadeem Vawda nadeem.va...@gmail.com:

--
nosy: +nvawda
[issue10791] Wrapping TextIOWrapper around gzip files
Filip Gruszczyński grusz...@gmail.com added the comment:

Is following change in GzipFile class enough:

    def read1(self, n):
        return self.read(n)

? This satisfies TextIOWrapper to run readline correctly.

--
nosy: +gruszczy
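The effect of such a delegating read1() can be sketched without touching GzipFile. `ReadOnlyBuffer` below is a hypothetical BufferedIOBase subclass backed by a BytesIO; it is an illustration of the idea, not the change that was eventually committed.

```python
import io


class ReadOnlyBuffer(io.BufferedIOBase):
    """Hypothetical binary stream used only to illustrate the proposal."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def read(self, size=-1):
        return self._inner.read(size)

    def read1(self, size=-1):
        # The proposed shortcut: delegate read1() to read().
        return self.read(size)


text = io.TextIOWrapper(ReadOnlyBuffer(b"first\nsecond\n"), encoding="ascii")
line = text.readline()
```

With read1() present, TextIOWrapper can fill its decoding buffer and readline() returns the first line.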
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

Bump. This is still broken in Python 3.2.
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

If a patch had been proposed it probably would have gotten into 3.2. Maybe someone (perhaps you?) will find the time before 3.2.1.

Someone has decided to work on the bz2 rewrite, by the way (issue 5863).
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

If I can find some time, I may take a look at this. I just noticed that similar problems arise when trying to wrap TextIOWrapper around the file-like objects returned by urllib.request.urlopen as well.

In the big picture, some discussion of what it means to be "file-like" might be in order. If something is "file-like" and binary, should that always imply that I be able to wrap a TextIOWrapper object around it in order to encode/decode text? I would argue "yes", but I'd be curious to know what others think.
[issue10791] Wrapping TextIOWrapper around gzip files
STINNER Victor victor.stin...@haypocalc.com added the comment:

What is the problem with Python 3.2? It works correctly here:

    $ cat bla.txt
    bli
    blo
    bla
    $ gzip bla.txt
    $ ./python
    Python 3.3a0 (unknown, Feb 23 2011, 13:03:50)
    >>> import gzip, io
    >>> f = io.TextIOWrapper(gzip.open("bla.txt.gz"),encoding='ascii')
    >>> f.read()
    'bli\nblo\nbla\n'

If someone added Python 3.2 in the Versions field because of an issue with bz2: please open a new issue instead.

--
nosy: +haypo
versions: -Python 3.2
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

    Python 3.2 (r32:88445, Feb 20 2011, 21:51:21)
    [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import gzip
    >>> import io
    >>> f = io.TextIOWrapper(gzip.open("file.gz"),encoding='latin-1')
    >>> f.readline()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    io.UnsupportedOperation: read1
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

Yes, a clear definition of the minimum requirements for being wrapped by TextIOWrapper sounds like a necessary thing to have (and I'd be inclined to agree with your assertion, but I didn't work on the IO library :).

It would be best to open a new issue for that.

--
versions: +Python 3.2, Python 3.3
[issue10791] Wrapping TextIOWrapper around gzip files
STINNER Victor victor.stin...@haypocalc.com added the comment:

> Yes, a clear definition of the minimum requirements for being wrapped by TextIOWrapper sounds like a necessary thing to have

About that: is the read1() argument mandatory or not?

In _pyio, the BufferedIOBase.read1() argument is optional (default: None); the argument of BytesIO.read1(), BufferedReader.read1(), BufferedRWPair.read1() and BufferedRandom.read1() is mandatory.

In _io, BufferedIOBase.read1() raises an exception directly, without checking its arguments; the BufferedReader.read1() argument is mandatory.

In the io doc, the BufferedIOBase.read1() argument is optional (default: -1), BytesIO.read1() has no argument (!) and the BufferedReader.read1() argument is mandatory.
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

It would probably be ok to fall back on read() when read1() isn't implemented. read1() is supposed to be implemented by all BufferedIO-compliant classes, but in all honesty I don't think it's very useful in practice. It's supposed to be an optimization, and I think it's a misguided one; the generalized prefetch() primitive I proposed last year would certainly be more useful: see http://mail.python.org/pipermail/python-dev/2010-September/104194.html
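One way to picture the fallback suggested here, written as caller-side code rather than as a change to TextIOWrapper itself. `read_chunk` and `OnlyRead` are hypothetical names introduced for this sketch; nothing in the thread says the fallback was implemented this way.

```python
import io


def read_chunk(buffer, size):
    """Hypothetical helper: prefer read1(), fall back to read()."""
    try:
        return buffer.read1(size)
    except (AttributeError, io.UnsupportedOperation):
        # The stream either lacks read1() entirely or inherits the
        # BufferedIOBase default, which raises UnsupportedOperation.
        return buffer.read(size)


class OnlyRead(io.BufferedIOBase):
    """Illustrative stream that implements read() but not read1()."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def read(self, size=-1):
        return self._inner.read(size)


chunk = read_chunk(OnlyRead(b"hello world"), 5)
```

Streams that do provide read1(), such as io.BytesIO, take the first branch unchanged.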
[issue10791] Wrapping TextIOWrapper around gzip files
Changes by Alex alex.gay...@gmail.com:

--
nosy: +alex
[issue10791] Wrapping TextIOWrapper around gzip files
Changes by Éric Araujo mer...@netwok.org:

--
nosy: +eric.araujo
[issue10791] Wrapping TextIOWrapper around gzip files
New submission from David Beazley d...@dabeaz.com:

Is something like this supposed to work:

    >>> import gzip
    >>> import io
    >>> f = io.TextIOWrapper(gzip.open("foo.gz"),encoding='ascii')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: readable

In a nutshell--reading a .gz file as text.

--
messages: 124870
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Wrapping TextIOWrapper around gzip files
type: behavior
versions: Python 3.2
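For reference, a self-contained version of the use case in this report. It is a sketch that assumes a Python release containing the fix discussed in this thread; the temporary directory and the sample contents are made up for the example.

```python
import gzip
import io
import os
import tempfile

# Build a small gzip file in a temporary directory (file name is illustrative).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "foo.gz")
with gzip.open(path, "wb") as out:
    out.write(b"bli\nblo\nbla\n")

# Wrap the binary GzipFile in a TextIOWrapper to decode it as text.
with io.TextIOWrapper(gzip.open(path, "rb"), encoding="ascii") as f:
    first = f.readline()
    rest = f.read()
```

On Python 3.3 and later, gzip.open() also accepts a text mode such as "rt", which performs the same wrapping internally.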
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

Since GzipFile inherits from BufferedIOBase, and TextIOWrapper is supposed to be designed to wrap a BufferedIOBase object, I would say yes, it ought to work. On the other hand there may also be a doc error there, since it may be that TextIOWrapper actually needs to wrap one of the subclasses of BufferedIOBase.

--
nosy: +pitrou, r.david.murray
stage: -> needs patch
versions: +Python 2.7, Python 3.1
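The inheritance claim can be checked directly. A minimal sketch, assuming Python 3.2 or later (the thread establishes that GzipFile has this base class from 3.2 onward):

```python
import gzip
import io

# GzipFile derives from BufferedIOBase on Python 3.2 and later.
is_buffered = issubclass(gzip.GzipFile, io.BufferedIOBase)

# BufferedIOBase in turn provides the readable()/writable() protocol
# that TextIOWrapper queries on the object it wraps.
has_readable = hasattr(gzip.GzipFile, "readable")
```

Inheriting from the base class is not sufficient on its own, since methods such as read1() still have to be implemented by the subclass.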
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

Oops. It only has that inheritance in 3.2.

--
versions: -Python 2.7, Python 3.1
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

Heh, and 2.7. Fixing versions yet again.

--
versions: +Python 2.7
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

This should be easy to fix, if only the readable and writable methods are needed. Do you want to try writing a patch?
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

It goes without saying that this also needs to be checked with the bz2 module. A quick check seems to indicate that it has the same problem.

While you're at it, maybe someone could add an 'open' function to bz2 to make it symmetrical with gzip as well :-).
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

bz2 is a pure C module, so that's a very different situation.
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

> While you're at it, maybe someone could add an 'open' function to bz2 to make it symmetrical with gzip as well :-).

That's a nice idea, but quite orthogonal to this issue.
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still seems like something that someone might want to do. I doubt it would take much modification to give BZ2File instances the required set of methods.
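The equivalent bz2 use case looks like this. It is a sketch that assumes a Python release containing the bz2 rewrite mentioned elsewhere in this thread (issue5863); it would not have worked on the versions under discussion at the time of this comment. The file name and contents are made up.

```python
import bz2
import io
import os
import tempfile

# Write a small bz2 file into a temporary directory (name is illustrative).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "foo.bz2")
with bz2.BZ2File(path, "wb") as out:
    out.write(b"alpha\nbeta\n")

# Same pattern as with gzip: wrap the binary stream to read it as text.
with io.TextIOWrapper(bz2.BZ2File(path, "rb"), encoding="ascii") as f:
    lines = f.readlines()
```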
[issue10791] Wrapping TextIOWrapper around gzip files
R. David Murray rdmur...@bitdance.com added the comment:

Right, but in the bz2 case I think it is a feature request rather than a bugfix. In any case it should be a separate issue.
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

> C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still seems like something that someone might want to do. I doubt it would take much modification to give BZ2File instances the required set of methods.

BZ2File uses FILE pointers internally so it may be more complicated than it looks (because the methods may not have the right semantics).
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

Do Python devs really view gzip and bz2 as two totally completely different animals? They both have the same functionality and would be used for the same kinds of things. Maybe I'm missing something.
[issue10791] Wrapping TextIOWrapper around gzip files
Antoine Pitrou pit...@free.fr added the comment:

> Do Python devs really view gzip and bz2 as two totally completely different animals? They both have the same functionality and would be used for the same kinds of things. Maybe I'm missing something.

Well, the reality of divergent implementation strategies trumps the theory of API compatibility :)
The approach taken by bz2 is IMO regrettable, but it's not a ten-minute job to write it again from scratch.
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley d...@dabeaz.com added the comment:

Hmmm. Interesting. In the big picture, it might be an interesting project for someone (not necessarily the core devs) to sit down and refactor both of these modules so that they play nice with the Python 3 I/O system. Obviously that's a project outside the scope of this bug or the 3.2 release for that matter.