Meador Inge <mead...@gmail.com> added the comment:

On Mon, May 14, 2012 at 12:31 PM, Serhiy Storchaka <rep...@bugs.python.org> wrote:
> Serhiy Storchaka <storch...@gmail.com> added the comment:
>
>> This is definitely *not* a padding issue.
>
> This is definitely a padding issue. All uncompressed files are located
> so that the data starts with a 4-byte boundary (1190+30+15+1=1236,
> 27486+30+17+3=27536, etc). This is, probably, allows the use of mmap
> for the resources.

So? Someone may be using the extra fields to pad things, but for the
purpose of this issue that is completely irrelevant. We only care about
the proper structure of the file. Besides, without a clear reference to
source code or a specification, any hypothesis of padding is hearsay.

Did you look at the decoding I sent? The extra field length is clearly
reported as one, and the contents of the extra field are set to '\x00'.
The extra field of size one is the actual problem, not padding.

>> As Martin pointed out, the standard says that things must be in
>> multiples of 4-bytes.
>
> More precisely, the extra field must have at least 4-bytes length to fit
> a header. The standard is insufficiently defined in terms of what would
> happen if the rest of the field is less than 4 bytes (this is hidden
> behind by ellipsis).

How is it insufficiently defined at all? It says [1]:

   In order to allow different programs and different types
   of information to be stored in the 'extra' field in .ZIP
   files, the following structure should be used for all
   programs storing data in this field:

   header1+data1 + header2+data2 . . .

   Each header should consist of:

     Header ID - 2 bytes
     Data Size - 2 bytes

   Note: all fields stored in Intel low-byte/high-byte order.

The ellipsis is just a standard convention for indicating a repeating
pattern. Extra fields which are not multiples of four bytes are not
properly formed.

>> So the record is non-portable.
>
> De jure the record is non-portable. De facto the record is portable
> (many other tools supports it).
> But even if it does not portable, we are
> dealing with the expansion of the zip format, which is very easy support
> for reading.

Like I said before, I am all for dropping extra fields we cannot
interpret. However, let us be clear that, with respect to the standard
we are implementing, zip files constructed like this are ill-formed.

[1] http://www.pkware.com/documents/casestudies/APPNOTE.TXT

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14315>
_______________________________________
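
For illustration only, the header layout quoted from [1] can be walked
with a few lines of Python. This is a rough sketch, not the code in
zipfile and not a proposed patch: parse_extra and its strict flag are
hypothetical names made up for this example. It assumes each record is
a 4-byte little-endian header (Header ID, Data Size) followed by
Data Size bytes, and shows the two behaviours discussed here: rejecting
a one-byte extra field, or dropping the bytes that cannot be interpreted.

```python
import struct


def parse_extra(extra, strict=True):
    """Split a ZIP 'extra' field into (header_id, data) records.

    Hypothetical helper for illustration. With strict=True, bytes that
    cannot form a complete header+data record raise ValueError; with
    strict=False they are silently dropped.
    """
    records = []
    pos = 0
    while pos < len(extra):
        remaining = len(extra) - pos
        if remaining < 4:
            # Too short to hold Header ID (2 bytes) + Data Size (2 bytes).
            if strict:
                raise ValueError(
                    "%d trailing byte(s) cannot hold a 4-byte header" % remaining)
            break
        # "<HH": two unsigned 16-bit integers in low-byte/high-byte order.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + size > len(extra):
            # Data Size claims more bytes than the field contains.
            if strict:
                raise ValueError("record 0x%04x is truncated" % header_id)
            break
        records.append((header_id, extra[pos:pos + size]))
        pos += size
    return records
```

With the extra field from the decoding above (b'\x00'), the strict call
raises ValueError, while parse_extra(b'\x00', strict=False) returns an
empty list, which corresponds to dropping the field on read.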