[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-11-09 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

And I share Martin's concern about fast-forwarding with an unseekable underlying 
file. If this works in current code, we can't simply break it. This may 
mean that we can't change the implementation of GzipFile.seekable() at all, 
even if it lies in some cases.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-11-09 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I share Martin's opinion that this is a misfeature. User code can check 
seekable() and use seek() if it returns True, or cache the necessary data in 
memory if it returns False, because seek() is expected to be more efficient. But 
in the case of GzipFile it is not efficient: it can lead to decompressing the 
whole content of the file and to much worse performance.
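
The pattern described above can be sketched roughly as follows. This is only 
an illustration: rewindable() is a hypothetical helper name, not something 
from the standard library or from the proposed patch.

```python
import io


def rewindable(stream):
    """Return a stream that can be rewound with seek(0) (hypothetical helper)."""
    if stream.seekable():
        # Trust the stream's own seek(), assuming it is cheap.
        return stream
    # Otherwise fall back to caching the whole content in memory.
    return io.BytesIO(stream.read())
```

Because GzipFile.seekable() returns True, such a helper would hand back the 
GzipFile itself, and every later seek(0) would decompress from the start again.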

--
nosy: +serhiy.storchaka




[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-10-06 Thread Martin Panter

Martin Panter  added the comment:

If a change is made, it would be nice to bring the “gzip”, “bz2” and LZMA 
modules closer together. The current “bz2” and LZMA modules rely on the 
underlying “seekable” method without a fallback implementation, but also have a 
check for read mode.
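
A small sketch of that difference, assuming a hypothetical NonSeekableBytesIO 
class that stands in for a pipe or socket (the class name and payload are made 
up for illustration):

```python
import bz2
import io
import lzma


class NonSeekableBytesIO(io.BytesIO):
    """Hypothetical stand-in for a pipe or socket: readable, not seekable."""

    def seekable(self):
        return False

    def seek(self, *args):
        raise io.UnsupportedOperation("seek")


payload = b"some compressed payload"

bz2_file = bz2.BZ2File(NonSeekableBytesIO(bz2.compress(payload)))
lzma_file = lzma.LZMAFile(NonSeekableBytesIO(lzma.compress(payload)))

# Both readers defer to the underlying stream's seekable() in read mode.
bz2_seekable = bz2_file.seekable()
lzma_seekable = lzma_file.seekable()
print(bz2_seekable, lzma_seekable)
```

Both values are expected to be False here, whereas GzipFile would report True 
for the same kind of stream.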

I think the seeking functionality in these modules is a misfeature. But since 
it is already here, it is probably best to leave it alone, and just document it.

My comment about making “seekable” stricter is at 
. Even 
if the underlying stream is not seekable, GzipFile can still fast-forward. Here 
is a demonstration:

>>> from gzip import GzipFile
>>> from io import BytesIO, UnsupportedOperation
>>> z = BytesIO(bytes.fromhex(
... "1F8B080002FFF348CD29D051F05448CC55282E294DCE56C8CC53485448AFCA"
... "2C5048CBCC490500F44BF0A01F00"
... ))
>>> def seek(*args): raise UnsupportedOperation()
... 
>>> z.seek = seek  # Make the underlying stream not seekable
>>> f = GzipFile(fileobj=z)
>>> f.read(10)
b'Help, I am'
>>> f.seek(20)  # Fast forward
20
>>> f.read()
b'a gzip file'
>>> f.seek(0)  # Rewind
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proj/python/cpython/Lib/gzip.py", line 368, in seek
    return self._buffer.seek(offset, whence)
  File "/home/proj/python/cpython/Lib/_compression.py", line 137, in seek
    self._rewind()
  File "/home/proj/python/cpython/Lib/gzip.py", line 515, in _rewind
    super()._rewind()
  File "/home/proj/python/cpython/Lib/_compression.py", line 115, in _rewind
    self._fp.seek(0)
  File "/home/proj/python/cpython/Lib/gzip.py", line 105, in seek
    return self.file.seek(off)
  File "<stdin>", line 1, in seek
io.UnsupportedOperation
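
Fast-forwarding works without an underlying seek() because a forward seek can 
be emulated by reading and discarding data. The following is a simplified 
sketch of that idea, not the actual code in Lib/_compression.py; skip_forward() 
and its parameters are hypothetical names.

```python
import io


def skip_forward(stream, target, position, chunk_size=io.DEFAULT_BUFFER_SIZE):
    """Advance a readable stream from `position` to `target` by discarding data.

    Returns the position actually reached, which is less than `target`
    if the stream ends first.
    """
    if target < position:
        # Going backwards would need a real seek() on the underlying file.
        raise io.UnsupportedOperation("cannot rewind without seek()")
    remaining = target - position
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # end of stream reached before the target
        remaining -= len(chunk)
    return target - remaining
```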

--




[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-10-05 Thread Cheryl Sabella


Cheryl Sabella  added the comment:

Support for non-seekable files was added in issue1675951.  Under that 
issue, in msg117131, the author of the change wrote:
"The patch creates another problem with is not yet fixed: The implementation of 
.seekable() is becoming wrong. As one can now use non seekable files the 
implementation should check if the file object used for reading is really 
seekable."

issue23529 made significant changes to the code and seekable() is again 
mentioned in msg239245 and subsequent comments.

Nosying the devs who worked on those issues.

--
nosy: +cheryl.sabella, martin.panter, pitrou
versions: +Python 3.8 -Python 3.6




[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-09-14 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak




[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-03-28 Thread Roundup Robot

Change by Roundup Robot :


--
keywords: +patch
pull_requests: +6021
stage:  -> patch review




[issue33173] GzipFile's .seekable() returns True even if underlying buffer is not seekable

2018-03-28 Thread Walt Askew

New submission from Walt Askew :

The seekable method on gzip.GzipFile always returns True, even if the 
underlying buffer is not seekable. However, if seek is called on the GzipFile, 
the seek will fail unless the underlying buffer is seekable. This can cause 
consumers of the GzipFile object to mistakenly believe calling seek on the 
object is safe, when in fact it will lead to an exception.

For example, this led to a bug when I was trying to use requests & boto3 to 
stream & decompress an S3 upload like so:

import gzip
import boto3
import requests
resp = requests.get(uri, stream=True)
decompressed = gzip.GzipFile(fileobj=resp.raw)
boto3.client('s3').upload_fileobj(decompressed, Bucket=bucket, Key=key)

boto3 checks the seekable method on the GzipFile and chooses a code path based 
on the file being seekable, but later raises an exception when the seek call 
fails because the underlying HTTP stream is not seekable.
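
One possible workaround in user code, sketched here under the assumption that 
the wrapped file object is available as the GzipFile's fileobj attribute, is a 
subclass whose seekable() defers to the underlying stream. StrictGzipFile is a 
hypothetical name and this is not the patch attached to this issue.

```python
import gzip


class StrictGzipFile(gzip.GzipFile):
    """Hypothetical subclass whose seekable() defers to the wrapped file object."""

    def seekable(self):
        # fileobj may lack seekable(), or be None once the file is closed.
        underlying = getattr(self.fileobj, "seekable", None)
        return bool(underlying and underlying())
```

Passing StrictGzipFile(fileobj=resp.raw) instead of gzip.GzipFile would then 
let a consumer that checks seekable() pick its non-seekable code path. Forward 
seeks on such an object would still work even though seekable() reports False.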

--
components: Library (Lib)
messages: 314613
nosy: Walt Askew
priority: normal
severity: normal
status: open
title: GzipFile's .seekable() returns True even if underlying buffer is not 
seekable
versions: Python 3.6
