Bugs item #1330039, was opened at 2005-10-18 13:27
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1330039&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: Python 2.4
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Martin Pitt (mpitt)
Assigned to: Neal Norwitz (nnorwitz)
Summary: tarfile.add() produces hard links instead of normal files

Initial Comment:
When opening a tarfile for writing and adding several files, some files end up being a hardlink to a previously added tar member instead of being a proper file member.

I attach a demo that demonstrates the problem. It basically does:

    tar = tarfile.open('tarfile-bug.tar', 'w')
    tar.add('tarfile-bug-f1')
    tar.add('tarfile-bug-f2')
    tar.close()

In the resulting tar, "tarfile-bug-f2" is a hard link to tarfile-bug-f1, although both entries should be proper files.

It works when the tarfile is close()d and opened again in append mode between the two add()s, but that slows down the process dramatically and is certainly not the intended way.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2005-10-20 09:29

Message:
Logged In: YES
user_id=33168

It will be fixed in 2.4.3 when released (that's the branch tags below, i.e. the second RCS rev number after each file).

----------------------------------------------------------------------

Comment By: Martin Pitt (mpitt)
Date: 2005-10-20 07:45

Message:
Logged In: YES
user_id=80975

Confirmed, works perfectly now. Thank you very much!

Will this also be fixed in a stable point release? Or just in 2.5?

----------------------------------------------------------------------

Comment By: Martin Pitt (mpitt)
Date: 2005-10-20 07:38

Message:
Logged In: YES
user_id=80975

Thanks for the quick reply!

Unfortunately, not removing the files after adding them to the tarfile is not really an option. I want to create a really huge tar file and put compressed files into it.
For that purpose I create a temporary gzip file, put that into the tarfile, and remove the temporary file again. First, keeping track of all temp files would be cumbersome, and second it could quickly lead to disk space exhaustion.

I'll try your patch now.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2005-10-19 21:59

Message:
Logged In: YES
user_id=33168

Martin, I have checked in Lars' patch. If this does not fix your problem, please re-open this bug report.

Checked in as:
 * Lib/tarfile.py 1.34 and 1.21.2.6
 * Lib/test/test_tarfile.py 1.20 and 1.16.2.2

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2005-10-19 05:41

Message:
Logged In: YES
user_id=642936

I just submitted patch #1331635 which ought to fix your problem. Thank you for your report.

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2005-10-19 02:31

Message:
Logged In: YES
user_id=642936

This is a feature ;-)

tarfile.py records the inode and device number (st_ino, st_dev) for each added file in a list (TarFile.inodes). When a new file is added and its inode and device number is found in this list, it will be added as a hardlink member, otherwise as a regular file.

Because your test script adds and immediately removes each file, both files are assigned the same inode number. If you had another process creating a file in the meantime, the problem would not occur, because it would take over the inode number before the second file has the chance.

Your problem shows that the way tarfile.py handles hardlinks is too sloppy. It must take the stat.st_nlink field into account. I will create a fix for this.

As a workaround you have several options:
- Do not remove the files after adding them, but after the TarFile is closed.
- Set TarFile.dereference to True before adding files, so files with several links would always be added as regular files (see the documentation). Disadvantage: symbolic links would be added as regular files as well.
- Tamper with the source code. Edit TarFile.gettarinfo(). Change the line that says "if inode in self.inodes and not self.dereference:" to "if statres.st_nlink > 1 and inode in self.inodes and not self.dereference:".
- Empty the TarFile.inodes list after each file. Ugh!
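The hard-link rule discussed in this thread can be sketched roughly as below. This is only an illustration, not the code from tarfile.py or from patch #1331635: `LinkTracker`, `classify` and `demo` are hypothetical names, the table is kept as a dict for convenience, and the inode reuse is simulated by pre-seeding the table, since real reuse depends on the filesystem. It also assumes the temporary directory supports `os.link`.

```python
import os
import tempfile


class LinkTracker:
    """Hypothetical helper mirroring the rule proposed in the thread."""

    def __init__(self):
        # Maps (st_ino, st_dev) to the archive name first stored under that key.
        self.inodes = {}

    def classify(self, path, arcname):
        """Return ("link", target) or ("file", None) for a regular file."""
        st = os.lstat(path)
        key = (st.st_ino, st.st_dev)
        # A previously seen inode number alone proves nothing: the number may
        # have been recycled. Require more than one link on the file as well.
        if st.st_nlink > 1 and key in self.inodes:
            return ("link", self.inodes[key])
        self.inodes[key] = arcname
        return ("file", None)


def demo():
    outcomes = []
    with tempfile.TemporaryDirectory() as tmp:
        tracker = LinkTracker()

        f2 = os.path.join(tmp, "f2")
        with open(f2, "w") as fh:
            fh.write("second file")

        # Simulated assumption: an earlier, already deleted member "f1"
        # happened to use the inode number that f2 has now.
        st = os.lstat(f2)
        tracker.inodes[(st.st_ino, st.st_dev)] = "f1"

        # st_nlink is 1, so f2 is stored as a regular file despite the match.
        outcomes.append(tracker.classify(f2, "f2"))

        # A genuine hard link raises st_nlink to 2 and is detected as a link.
        f3 = os.path.join(tmp, "f3")
        os.link(f2, f3)
        outcomes.append(tracker.classify(f3, "f3"))
    return outcomes


if __name__ == "__main__":
    print(demo())
```

Without the `st_nlink > 1` test, the first call would report "f2" as a link to the deleted "f1", which is the behaviour described in the initial comment.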