New submission from Jacques Frechet <[EMAIL PROTECTED]>:

The gzip header defined in RFC 1952 includes a mandatory "MTIME" field, originally intended to contain the modification time of the original uncompressed file. It is often ignored when decompressing, though gunzip (for example) uses it to set the modification time of the output file if applicable.
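As a rough illustration of where MTIME sits, the sketch below unpacks the fixed 10-byte header that RFC 1952 places at the start of each gzip member. The helper name `read_gzip_mtime` is made up for this example and is not part of the gzip module or of the patch.

```python
import gzip
import struct


def read_gzip_mtime(data: bytes) -> int:
    """Return the MTIME field from the fixed 10-byte gzip member header.

    Hypothetical helper for illustration only.
    """
    if len(data) < 10:
        raise ValueError("gzip header is at least 10 bytes long")
    # RFC 1952 layout: ID1, ID2, CM, FLG (one byte each), MTIME (4 bytes,
    # little-endian, seconds since the Unix epoch), XFL, OS (one byte each).
    id1, id2, cm, flg, mtime, xfl, os_code = struct.unpack("<BBBBIBB", data[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    return mtime


# With default settings the module stamps the header with the current time,
# so this value changes from run to run.
print(read_gzip_mtime(gzip.compress(b"hello")))
```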
The Python gzip module always sets the MTIME field to the current time, and always discards MTIME when decompressing. As a result, compressing the same string using gzip produces different output every time. For certain applications, especially those involving comparisons or cryptographic signing of binary files, these spurious changes can be quite inconvenient. Aside from the MTIME field, the gzip module already produces entirely deterministic output.

I'm attaching a patch which adds an optional "mtime" argument to the GzipFile class, giving the caller the option of providing a timestamp when compressing. Default behavior is unchanged. I've included updated documentation and three new test cases in the patch.

In order to facilitate testing, the patch also includes code to set the "mtime" member of the GzipFile instance when decompressing. The first test case uses the new member to ensure that the timestamp given to the GzipFile constructor is preserved correctly. The second test checks for specific values in the entire gzip header (not just the MTIME field) by reading the compressed file directly, examining individual fields in a (relatively) flexible way. The third compares the entire compressed stream against a predetermined sequence of bytes in a relatively inflexible way.

All tests pass on my AMD64 box, and I expect them all to pass on all supported platforms without any problems. However, if anybody is concerned that any of the tests sound like they might be too brittle, I'm certainly not overly attached to them. If anyone has any further suggestions, I'd be delighted to submit a new patch.

Thanks!
Jacques

----------
components: Library (Lib)
files: gzip-mtime-py3k.patch
keywords: patch
messages: 75580
nosy: jfrechet
severity: normal
status: open
title: set timestamp in gzip stream
type: feature request
Added file: http://bugs.python.org/file11954/gzip-mtime-py3k.patch

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4272>
_______________________________________
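The behavior requested in this submission can be sketched roughly as follows. This is only an illustration, not the attached patch: it assumes a Python release in which GzipFile accepts the "mtime" keyword and exposes an "mtime" member after reading, as the message describes, and the helper name `gzip_bytes` is invented for the example.

```python
import gzip
import io


def gzip_bytes(data: bytes, mtime=None) -> bytes:
    """Compress data into an in-memory gzip stream, optionally pinning MTIME.

    Hypothetical helper; assumes GzipFile accepts an mtime keyword.
    """
    buf = io.BytesIO()
    # A BytesIO object has no name, so no file name is stored in the header.
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


payload = b"the same input every time"

# A caller-supplied timestamp makes repeated runs byte-for-byte identical.
first = gzip_bytes(payload, mtime=0)
second = gzip_bytes(payload, mtime=0)
print(first == second)  # True

# Changing only the timestamp changes the stream.
print(gzip_bytes(payload, mtime=1) == gzip_bytes(payload, mtime=2))  # False

# When decompressing, the timestamp becomes available once the header is read.
with gzip.GzipFile(fileobj=io.BytesIO(gzip_bytes(payload, mtime=1234567890))) as f:
    restored = f.read()
    stored_mtime = f.mtime
print(restored == payload, stored_mtime)
```

Leaving `mtime` as None keeps the default of stamping the current time, so existing callers would see no change.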