[issue29094] Regression in zipfile writing in 2.7.13

2017-05-18 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> fixed
stage: backport needed -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset c8faabce6ef318f3b425c6defd846e274d61e2ef by Serhiy Storchaka in 
branch '3.5':
[3.5] Revert bpo-26293 for zipfile breakage. See also bpo-29094. (GH-1484). 
(#1486)
https://github.com/python/cpython/commit/c8faabce6ef318f3b425c6defd846e274d61e2ef


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset 70dc6a7a0b7f104d87512556fca242c2ca96a010 by Serhiy Storchaka in 
branch '3.6':
[3.6] Revert bpo-26293 for zipfile breakage. See also bpo-29094. (GH-1484). 
(#1485)
https://github.com/python/cpython/commit/70dc6a7a0b7f104d87512556fca242c2ca96a010


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +1589

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +1587

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset 3763ea865cee5bbabcce11cd577811135e0fc747 by Serhiy Storchaka in 
branch 'master':
Revert bpo-26293 for zipfile breakage. See also bpo-29094. (#1484)
https://github.com/python/cpython/commit/3763ea865cee5bbabcce11cd577811135e0fc747


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +1585

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-05 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
stage: commit review -> backport needed
versions:  -Python 2.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-05 Thread Benjamin Peterson

Benjamin Peterson added the comment:


New changeset ef4c6ba169ff59215442fc4047d83f7a3d39bf39 by Benjamin Peterson in 
branch '2.7':
Revert "Issue #29094: Offsets in a ZIP file created with extern file object and 
modes" (#1467)
https://github.com/python/cpython/commit/ef4c6ba169ff59215442fc4047d83f7a3d39bf39


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-05 Thread Benjamin Peterson

Changes by Benjamin Peterson :


--
pull_requests: +1566

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-04 Thread Thomas Wouters

Thomas Wouters added the comment:

Well, what should the zipfile module do when you open a broken zipfile for 
appending in the first place? :) There are many ways in which it could be 
broken. I don't think a zipfile with incorrect offsets should be treated any 
differently.

(I don't know *how* the zipfile module should treat broken zipfiles, and I 
don't know how it treats them now -- but I think the most important thing is 
consistency and documenting the choice :P)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thank you for raising this again and for your research Thomas. I now see that 
my understanding of the specification was wrong. I'll revert changes to zipfile.

There is one question. If the zip file was concatenated to other file (and 
looks incorrect from the view of other tools), and we opened it in append mode 
and added new entries, should the central directory count offsets from the 
start of the file or from the start of embedded archive? In the first case 
offsets of first entries (which was in the original concatenated archive) in 
the central directory will be different from offsets in local headers. In the 
second case the zip file will look incorrect from the view of other tools (but 
it already looked incorrect before appending).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-03 Thread Thomas Wouters

Thomas Wouters added the comment:

... and yes, the changes to issue 26293 should be reverted; the changes 
attached to this issue merely revert part of issue 26293's changes.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-03 Thread Thomas Wouters

Thomas Wouters added the comment:

The spec isn't very explicit about it, yes, but it does say this:

4.4.16 relative offset of local header: (4 bytes)

   This is the offset from the start of the first disk on
   which this file appears, to where the local header should
   be found.

"the start of the first disk" could be construed to mean "the start of the ZIP 
archive embedded in this file". However, if you consider the information that's 
available, the only way to make ZIP archives work in the face of ZIP64 and 
other extensions that add information between the end-of-central-directory 
record and the end of the central directory, it's obvious that you can't 
correctly handle ZIP archives that start at an arbitrary point in a file.

ZIP archives have both a 4-byte magic number at the start, and a central 
directory at the end. The end-of-central-directory record is the very last 
thing in the file, and it records both the offset of the start of the central 
directory and the size of the central directory. In absense of any ZIP 
extensions that add records between the end-of-central-directory record and the 
end of the central directory, you can use those to correct all offsets in the 
ZIP archive. But as soon as you add (for example) ZIP64 records, this no longer 
works: ZIP64 has an end-of-zip64-central-directory locator, and variable-sized 
end-of-zip64-central-directory record. The locator is fixed size right before 
the end-of-central-directory record and records the offset (from the start of 
the file) to the end-of-zip64-central-directory record, but *not* the size of 
that record or any other information you can use to determine the offset of the 
start of the archive in the file.

Only by assuming the central directory record comes right before the 
end-of-central-directory record, or assuming fixed sizes for the ZIP64 record, 
can you deal with ZIP archives with offsets not from the start of the file. 
This assumption is not only *not* guaranteed by the ZIP spec, it's explicitly 
invalidated by ZIP64's variable sized records, and possibly other extensions 
(like encryption, compression and digital signatures, although I don't remember 
if those actually affect this).

It's true that many ZIP tools try to deal with these kinds of archives, 
although they *do* realise it's wrong and they usually *do* warn about it. They 
still can't deal with it if it uses variable-sized ZIP64 features (other than 
trawling through the file looking for the 4-byte magic numbers).

Here's an example of code that breaks because of this: 
https://github.com/Yhg1s/zipfile-hacks. I tried to convince zipfile to create 
Zip64 files with extra fields (the variable-sized parts) but unfortunately the 
*cough* "design" of the zipfile module doesn't allow that -- feel free to 
ignore the force_zip64 parts of the script.

(I'm using two python installations I had laying around here; I could've used 
2.7.12 vs 2.7.13 instead, and the results would be the same.)

# Python 2.7.12 -- so old behaviour
% python create_small_zip64.py -v --mode w --preamble '#!/usr/bin/python' 
py2-preamble-w.zip create_small_zip64.py
% python create_small_zip64.py -v --mode a --preamble '#!/usr/bin/python' 
py2-preamble-a.zip create_small_zip64.py

# Python 3.6.0+ -- after this change, so new behaviour
% ~/python/installs/py36-opt/bin/python3 create_small_zip64.py -v --mode w 
--preamble '#!/usr/bin/python' py3-preamble-w.zip create_small_zip64.py
% ~/python/installs/py36-opt/bin/python3 create_small_zip64.py -v --mode a 
--preamble '#!/usr/bin/python' py3-preamble-a.zip create_small_zip64.py

The old zipfiles are fine:
% zip -T py2-preamble-w.zip
test of py2-preamble-w.zip OK
% zip -T py2-preamble-a.zip
test of py2-preamble-a.zip OK

The new one using 'w' is also fine (as expected):
% zip -T py3-preamble-w.zip
test of py3-preamble-w.zip OK

The new one using 'a' is broken:
% zip -T py3-preamble-a.zip
warning [py3-preamble-a.zip]:  17 extra bytes at beginning or within zipfile
  (attempting to process anyway)
test of py3-preamble-a.zip FAILED

zip error: Zip file invalid, could not spawn unzip, or wrong unzip (original 
files unmodified)

The 'unzip' tool does work, but it also prints a warning:
% unzip -l py3-preamble-a.zip
Archive:  py3-preamble-a.zip
warning [py3-preamble-a.zip]:  17 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  Length  DateTimeName
-  -- -   
 4016  2017-05-03 14:23   create_small_zip64.py
- ---
 4016 1 file

Whether other tools try to compensate for the error depends greatly on the 
tool; there's quite a few that don't.

For the record, we had two different bits of code that created zipfiles with 
preambles using mode='a', created by (at least) two different people. I don't 
think it's unreasonable to assume that if you have a file with existing data 
you don't want the ZipFile to 

[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests:  -911

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

There is no explicit line, but offsets always are named relative: "relative 
offset of local header", "offset of start of central directory with respect to 
the starting disk number". This looks not pretty clear, but open source 
utilities (unzip, p7zip) support ZIP archives started not from the start of the 
file.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Anything breaking 2.7 should be reverted. I take it, the change from #26293 
should be as well as this one?

Yhg1s, the commit message for #26293 included the sentence "Offsets in ZIP file 
now are relative to the start of the archive in conforming to the 
specification." Can someone point to the line in the spec that covers this?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Larry Hastings

Larry Hastings added the comment:

What's to decide?  If the new behavior is also broken, we should fix it.  I'd 
like a fix in the next 3.5.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Gregory P. Smith

Gregory P. Smith added the comment:

release managers need to decide.

--
priority: normal -> release blocker

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Gregory P. Smith

Gregory P. Smith added the comment:

re-opening since, though released, this behavior change broke existing code.

--
nosy: +gregory.p.smith
resolution: fixed -> 
stage: resolved -> commit review
status: closed -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-05-02 Thread Thomas Wouters

Thomas Wouters added the comment:

For the record, this new behaviour is wrong. It's not immediately obvious from 
the ZIP spec, but offsets have to be from the start of the file, not the 
archive, or ZIP64 can't work. And yes, that means you can't blindly concatenate 
ZIP files to other files, you have to rewrite them.

The way to create an 'embedded' ZIP archive like requested in issue #26293 is 
to write it to a separate file (or file-like object). Making "a" have this 
magical and wrong-for-almost-anyone meaning is... confusing, to say the least. 
The change certainly doesn't belong in a bugfix release, but I guess it's too 
late for that now. 

It's also not documented; the docs mention this:

  If mode is 'a' and file refers to an existing ZIP file, then
  additional files are added to it. If file does not refer to a ZIP
  file, then a new ZIP archive is appended to the file. This is
  meant for adding a ZIP archive to another file (such as
  python.exe).

which is ambiguous at best and certainly suggests the resulting file would be 
usable as-is, which is not the case if you use mode "a" (but *is* the case if 
you use mode "w", since 342bc734f523).

--
nosy: +twouters

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-03-31 Thread Donald Stufft

Changes by Donald Stufft :


--
pull_requests: +911

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a80c14ace927 by Serhiy Storchaka in branch 'default':
Issue #29094: Offsets in a ZIP file created with extern file object and modes
https://hg.python.org/cpython/rev/a80c14ace927

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
priority: release blocker -> normal
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Larry Hastings

Larry Hastings added the comment:

If this is fixed, can we close this issue?  This release blocker is one of two 
issues blocking 3.5.3 rc1.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Roundup Robot

Roundup Robot added the comment:

New changeset f5aa1c9c2b7e by Serhiy Storchaka in branch '3.5':
Issue #29094: Offsets in a ZIP file created with extern file object and modes
https://hg.python.org/cpython/rev/f5aa1c9c2b7e

New changeset 342bc734f523 by Serhiy Storchaka in branch '2.7':
Issue #29094: Offsets in a ZIP file created with extern file object and modes
https://hg.python.org/cpython/rev/342bc734f523

New changeset f36f9bce997d by Serhiy Storchaka in branch '3.6':
Issue #29094: Offsets in a ZIP file created with extern file object and modes
https://hg.python.org/cpython/rev/f36f9bce997d

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Actually all offsets are relative to some point, just this point not always at 
the start of the file. If concatenate the ZIP file to other file, the unzip 
utility and the zipfile module are able to infer the starting point and correct 
offsets. Thus there is no matter what is the starting point of offsets. But 
Go's standard archive/zip package works only if offsets are relative to the 
start of the file. I don't know how common such interpretation however.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2017-01-01 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Is there a reason to ever not use relative offsets? It seems that's strictly 
more general the absolute.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2016-12-31 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The specification of ZIP files is not pretty clear, but the unzip utility first 
try to interpret offsets relatively to the start of the archive that can be 
different from the start of the file. Go's standard archive/zip package 
interprets offsets as absolute file positions and doesn't support concatenated 
ZIP files.

In any case there is a regression. Proposed patch partially reverts issue26293. 
Offsets in created ZIP file are now absolute when create with modes 'w' and 
'x'. But they are relative when create with mode 'a'. This should fix a 
regression in pex, but keep the case of issue26293.

I'm going to push the patch before tagging 3.5.3 rc1.

--
keywords: +patch
nosy: +benjamin.peterson, larry, ned.deily
priority: normal -> release blocker
stage:  -> patch review
type:  -> behavior
versions: +Python 3.5, Python 3.6, Python 3.7
Added file: 
http://bugs.python.org/file46107/zipfile_write_absolute_offsets.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2016-12-28 Thread Benjamin Peterson

Changes by Benjamin Peterson :


--
assignee:  -> serhiy.storchaka
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29094] Regression in zipfile writing in 2.7.13

2016-12-28 Thread Peter Ebden

New submission from Peter Ebden:

In Python 2.7.13, using zipfile.ZipFile to write into a file with some initial 
preamble produces a zip file that cannot be read again by some zip 
implementations.
Our use case is using pex (https://github.com/pantsbuild/pex) which writes a 
zip that begins with a shebang, and later attempting to manipulate that using 
Go's standard archive/zip package. In 2.7.12 that works OK, but in 2.7.13 the 
.pex file is rejected on reading. Linux's command-line unzip tool will read the 
archive, but issues a warning ("4 extra bytes at beginning or within zipfile") 
which wasn't present previously. 
zipfile.ZipFile does read the files OK.

I assume this is related to https://bugs.python.org/issue26293 since that's the 
most obvious zipfile change in 2.7.13. It's pretty easy to reproduce using the 
example in that issue:

from zipfile import ZipFile
with open('a.zip', 'wb') as base:
base.write(b'old\n')
with ZipFile(base, 'a') as myzip:
myzip.write('eggs.txt')

unzip -t a.zip
Archive:  a.zip
warning [a.zip]:  4 extra bytes at beginning or within zipfile
  (attempting to process anyway)
...

--
components: Library (Lib)
messages: 284172
nosy: Peter Ebden
priority: normal
severity: normal
status: open
title: Regression in zipfile writing in 2.7.13
versions: Python 2.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com