[issue24465] Make tarfile have deterministic sorting
Sam Thursfield added the comment:

Here's a patch which does the same thing but only for shutil.make_archive().

Note that the final output will still be non-deterministic if you use format=gztar because time.time() and the base_name argument get added to the gzip header. Might be nice to add an option to make that deterministic too, as a separate thing. This patch is useful to me as-is though.

--
Added file: http://bugs.python.org/file39770/make_archive-stable-ordering.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24465
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
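On the gzip header point: the following is a minimal sketch, not part of the attached patch and not how shutil.make_archive() behaves, of how a caller could compress an already-built tar stream so that the gzip header carries neither a timestamp nor a file name. The function name gzip_deterministic is hypothetical. It relies on the mtime and filename arguments of gzip.GzipFile, and it assumes the tar bytes passed in are themselves deterministic.

```python
import gzip
import io


def gzip_deterministic(data):
    """Compress data so the gzip header does not vary between runs.

    Hypothetical helper: mtime=0 fixes the header timestamp, and
    filename="" keeps any file name out of the header.
    """
    buf = io.BytesIO()
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()
```

Calling gzip_deterministic() twice on the same bytes should give byte-identical output, which is the property that format=gztar lacks today.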
[issue24465] Make tarfile have deterministic sorting
Sam Thursfield added the comment: I've discovered that this patch introduces a nasty failure case! If you have a relative symlink pointing to a directory that's alphabetically sorted after the symlink, and files inside the symlink, 'tar -x' won't be able to create those files because the symlink target won't exist yet. I'll rework this to only affect shutil.make_archive(), and to avoid hitting this bug.
[issue24465] Make tarfile have deterministic sorting
Sam Thursfield added the comment: Having tested, the problem I described above doesn't happen with this patch. It's a mistake in some other code I wrote which is following symlinks when it should not do.
[issue24465] Make tar files created by shutil.make_archive() have deterministic sorting
Changes by Sam Thursfield sam.thursfi...@codethink.co.uk: -- keywords: +patch Added file: http://bugs.python.org/file39728/tarfile-stable-ordering.patch
[issue24465] Make tar files created by shutil.make_archive() have deterministic sorting
New submission from Sam Thursfield:

I want shutil.make_archive() to produce deterministic output when given identical data as inputs. Right now there are two holes in this.

One is that mtimes might not match. This can be fixed by the caller.

The second is that the order in which files in a subdirectory get added to the tarfile is not deterministic. This can't be fixed by the caller.

Attached is a trivial patch to sort the results of os.listdir() to ensure the output tarfile is stable. This only applies to the 'tar' format.

I've attached my testcase for this, which creates 3 tarfiles in /tmp. When this patch is applied, the 3 tarfiles it creates are identical according to `sha1sum`. Without this patch, they are all different.

--
components: Library (Lib)
files: tar-reproducible-testcase.py
messages: 245464
nosy: samthursfield
priority: normal
severity: normal
status: open
title: Make tar files created by shutil.make_archive() have deterministic sorting
type: enhancement
Added file: http://bugs.python.org/file39727/tar-reproducible-testcase.py
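For illustration only, here is a sketch of the two holes being closed by driving tarfile directly rather than going through shutil.make_archive(). This is not the attached patch or the attached testcase. The names make_stable_tar and _normalize are hypothetical, and which metadata fields need resetting (mtime, owner ids and names) is an assumption about what varies between runs.

```python
import os
import tarfile


def _normalize(tarinfo):
    """Reset metadata that may differ between runs (assumed set of fields)."""
    tarinfo.mtime = 0
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = ""
    return tarinfo


def make_stable_tar(source_dir, tar_path):
    """Write source_dir to an uncompressed tar with members in sorted order."""
    with tarfile.open(tar_path, "w") as tar:
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()  # sorting in place fixes the order os.walk descends in
            for name in sorted(dirs + files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, source_dir)
                # recursive=False: each entry is added exactly once, by this loop
                tar.add(full, arcname=arcname, recursive=False, filter=_normalize)
```

Running make_stable_tar() twice over the same unchanged tree should give two files with the same checksum. Directory entries are written before their contents, so extraction order stays valid.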
[issue24465] Make tarfile have deterministic sorting
Sam Thursfield added the comment: Thanks for the comments! Would you be happy for the patch to be merged if it was implemented by modifying shutil.make_archive() instead? I will rework it if so.
[issue16477] tarfile fails to close file handles in case of exception
New submission from Sam Thursfield:

Exceptions such as disk full during extraction cause tarfile to leak file handles. Besides being messy, it causes real problems if, for example, the target file is on a mount that should be unmounted before the program exits - in this case, the unmount will fail as there are still open file handles.

Simplest solution I can see is to change:

    def makefile(self, tarinfo, targetpath):
        """Make a file called targetpath."""
        source = self.extractfile(tarinfo)
        target = bltn_open(targetpath, "wb")
        copyfileobj(source, target)
        source.close()
        target.close()

to this:

    def makefile(self, tarinfo, targetpath):
        """Make a file called targetpath."""
        source = self.extractfile(tarinfo)
        try:
            with open(targetpath, "wb") as target:
                shutil.copyfileobj(source, target)
        finally:
            source.close()

--
components: Library (Lib)
messages: 175616
nosy: ssam
priority: normal
severity: normal
status: open
title: tarfile fails to close file handles in case of exception
type: behavior
versions: Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16477
___
[issue16477] tarfile fails to close file handles in case of exception
Sam Thursfield added the comment: sorry, replace 'open' with 'bltn_open' in the above comment - no need to change it.
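The pattern proposed in this issue can be tried outside tarfile with a small stand-alone sketch. The names copy_member and FailingSource are hypothetical and exist only to simulate a failure part-way through the copy. Inside tarfile itself the builtin open is reached as bltn_open, as noted above; in a stand-alone script plain open is the builtin.

```python
import io
import shutil


def copy_member(source, targetpath):
    """Copy a file object to targetpath, closing both handles even on error."""
    try:
        with open(targetpath, "wb") as target:
            shutil.copyfileobj(source, target)
    finally:
        source.close()


class FailingSource(io.BytesIO):
    """Hypothetical source whose read() fails, standing in for a mid-copy error."""

    def read(self, *args):
        raise OSError("simulated failure during copy")
```

If copy_member() is given a FailingSource, the OSError still propagates to the caller, but the source is closed by the finally clause and the target is closed by the with statement, so no handle is left open on the target's mount.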