Bug#763119: [libtar] Bug#763119: misinterprets old-style GNU headers
On Oct 13, 2014, at 10:38 AM, Magnus Holmgren holmg...@debian.org wrote:

> The difference is that ustar is followed by two spaces, whereas in tar files created by libtar it's followed by a null character.

The history behind this may help make it clearer:

There has been a POSIX standard for the tar file format since 1996. It used to be part of the specification for the tar command-line program, but the file format is now part of the specification for the pax command-line program:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06

That standard specifies a 6-byte "magic" field containing "ustar\0" followed by a 2-byte version field containing the ASCII characters "00" (zero zero). These 8 bytes together are the canonical test for POSIX-compliant tar headers.

GNU tar is derived from pdtar, which predated the POSIX standard. Instead of a 6-byte field followed by a 2-byte field, pdtar used a single 8-byte field containing "ustar\x20\x20\0". (I presume the author of pdtar got this from an early draft of the POSIX standard, but I don't know that for sure.) Checking these 8 bytes provides a good test for GNU tar format headers vs. POSIX-standard ustar format headers.

Note that GNU tar can now generate POSIX-compliant ustar archives or pax extended format archives (with suitable options), so it's important to distinguish the formats, not the programs. And yes, there are definitely plenty of tar programs that write tar archives that are not compliant with either of these (which is probably why that option exists, to suppress the format check for non-standard tar archives).

I wrote a tar.5 man page to document my research into this:

http://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5

It documents a lot of different tar variations and includes some discussion of how to distinguish them.

Cheers,

Tim

P.S. strcmp() here is a very bad idea.
Other tar files may not have null bytes where you expect them (indeed, someone might deliberately craft a tar file without null bytes in order to force a crash). You should actually use memcmp() for these tests, since it will check exactly the bytes you expect.

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#659294: [libarchive-discuss] Fwd: Bug#659294: libarchive: FTBFS on various architectures (hurd, mipsel, s390, s390x)
On Feb 21, 2012, at 3:40 AM, Pino Toscano wrote:

> Hi, (greetings from your favourite Hurd porter)
>
> On Monday, 13 February 2012, Tim Kientzle wrote:
>> So on hurd, I see a couple of interesting failures for bsdtar: [...]
>
> Actually, libarchive is pretty fine on Hurd, as it was after I fixed libarchive 3.0.2 (and in 3.0.3 there are no changes leading to issues). The problem is that the test suite run (just like the whole package build) is done within fakeroot (which means fakeroot-tcp), triggering Debian's #534879.

Thanks, Pino.

Libarchive's test suite does a lot of file operations, including a lot of cross-checks of file modes, ownership, and other properties. The races described in #534879 would likely manifest as essentially random failures in libarchive's test suite.

Tim
Bug#659294: [libarchive-discuss] Fwd: Bug#659294: libarchive: FTBFS on various architectures (hurd, mipsel, s390, s390x)
On Feb 11, 2012, at 1:40 PM, Samuel Thibault wrote:

> Hello,
>
> Andres Mejia wrote on Fri, 10 Feb 2012 16:34:40 -0500:
>> Hi. The new version of libarchive uploaded to unstable is failing the test suite (and thus failing to build the deb packages).
>> We're going to need copies of the test directories from the test suites, e.g., "Details for failing tests: /tmp/libarchive_test.2012-02-06T23.02.12-000". Please provide these test directories to libarchive-disc...@googlegroups.com.
>
> Here they are.

Thank you! That helps. So on hurd, I see a couple of interesting failures for bsdtar:

tar/test_copy: bsdtar is getting a nonsense file mode. This test creates a bunch of files and directories with varying permissions and file name lengths. One directory (ending in _194) is getting read as having a file type of 0, which is obviously nonsense. As a result, it's not getting archived. The test is recording two failures: 1) when bsdtar emits an error to stdout when trying to archive this directory and 2) when the directory doesn't appear in the restored copy. This is especially confusing because it's not happening for any other files or directories in this test. An strace or truss of the process might clarify things.

tar/test_option_H_upper and tar/test_option_L_upper: restoring incorrect permissions on a symlink to a directory. The test archives and restores a number of files, directories, and symlinks to those files and directories. It looks like symlinks to directories are getting restored with different permissions than expected (expected 0755, seeing 0700). Does hurd handle symlinks to directories differently than other systems? Is the configuration script not finding lchmod() or lstat() correctly? Again, an strace or truss of the process might clarify things.
You can run a single test by running the bsdtar_test program manually and specifying the name of the test:

$ bsdtar_test -vvv -p /full/path/to/bsdtar -r /full/path/to/tar/test test_copy

If you can strace or truss this (following children so we find out what bsdtar is doing as well), that would be appreciated.

Thanks,

Tim Kientzle
Bug#659294: [libarchive-discuss] Fwd: Bug#659294: libarchive: FTBFS on various architectures (hurd, mipsel, s390, s390x)
Each of these reports includes the name of the test directory, e.g., "Details for failing tests: /tmp/libarchive_test.2012-02-06T23.02.12-000". Can we get the contents of those directories (which include detailed logs for each failure, the files involved, and other details)?

Tim

On Feb 9, 2012, at 4:20 PM, Andres Mejia wrote:

> There are some build failures on various architectures in Debian. Note that they're failures in the test suite.

-- Forwarded message --
From: Julien Cristau jcris...@debian.org
Date: Thu, Feb 9, 2012 at 5:52 PM
Subject: Bug#659294: libarchive: FTBFS on various architectures (hurd, mipsel, s390, s390x)
To: Debian Bug Tracking System sub...@bugs.debian.org

Source: libarchive
Version: 3.0.3-3
Severity: serious
Justification: fails to build from source (but built successfully in the past)

libarchive FTBFS on various buildds, with test failures:
https://buildd.debian.org/status/package.php?p=libarchive

mipsel:
  Totals: Tests run: 172  Tests failed: 1
          Assertions checked: 12407225  Assertions failed: 3
          Skips reported: 73
  Failing tests: 60: test_read_disk_directory_traversals (3 failures)
  Details for failing tests: /tmp/libarchive_test.2012-02-06T23.02.12-000
  FAIL: libarchive_test

s390:
  Totals: Tests run: 172  Tests failed: 1
          Assertions checked: 12407234  Assertions failed: 3
          Skips reported: 73
  Failing tests: 60: test_read_disk_directory_traversals (3 failures)
  Details for failing tests: /tmp/libarchive_test.2012-02-06T22.43.00-000
  FAIL: libarchive_test

s390x:
  Totals: Tests run: 31  Tests failed: 1
          Assertions checked: 7460  Assertions failed: 2
          Skips reported: 1
  Failing tests: 13: test_option_b (2 failures)
  Details for failing tests: /tmp/bsdtar_test.2012-02-06T22.40.24-000
  FAIL: bsdtar_test

hurd-i386:
  Totals: Tests run: 31  Tests failed: 2
          Assertions checked: 7459  Assertions failed: 3
          Skips reported: 1
  Failing tests: 7: test_option_H_upper (1 failure)
                 8: test_option_L_upper (2 failures)
  Details for failing tests: /tmp/bsdtar_test.2012-02-07T00.14.52-000
  FAIL: bsdtar_test

[...]

  Totals: Tests run: 28  Tests failed: 2
          Assertions checked: 923  Assertions failed: 14
          Skips reported: 1
  Failing tests: 1: test_basic (13 failures)
                 26: test_passthrough_reverse (1 failure)
  Details for failing tests: /tmp/bsdcpio_test.2012-02-07T00.22.32-000
  FAIL: bsdcpio_test

Cheers,
Julien

--
~ Andres

--
You received this message because you are subscribed to the Google Groups libarchive-discuss group. To post to this group, send email to libarchive-disc...@googlegroups.com. To unsubscribe from this group, send email to libarchive-discuss+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/libarchive-discuss?hl=en.
Bug#136231: [Bug-tar] Failure with --owner and --group when names cannot be mapped to IDs
On Aug 13, 2011, at 10:29 AM, Paul Eggert wrote:

> On 08/08/2011 03:28 AM, Thayne Harbaugh wrote:
>> Attached is a patch that allows archives to be created with arbitrary owner or group names.
>
> Thanks for the bug report and patch; I was unaware of the problem. This runs into another area that I'd been meaning to enhance for some time: tar doesn't let you specify both user name and number (only one or the other), and similarly for groups. I wrote and installed the following patch into GNU tar, to address both enhancement requests simultaneously.

FYI: bsdtar uses separate options: --uname and --uid for user name/id, --gname and --gid for group name/id.

Tim
Bug#610783: [libarchive-discuss] Re: Bug#610783: bsdtar: Doesn't extract the install* and isolinux* directories of d-i images
Thomas,

It will be a day or two before I can dig deeply into this. Unfortunately, it's been a while since I looked at that part of the code in detail, but I don't recall libarchive requiring all directories to precede all files, and I recall a bunch of test cases over the last couple of years dealing with various kinds of empty content. So I suspect the situation is a little less dire than you think. ;-)

Hmmm... Are you working with libarchive 2.8.4? There have been a number of fixes in trunk specifically to handle symlinks and other empty data files; maybe some of those need to be backported.

Tim

CC: Michihiro, who has been doing a lot of work in this part of libarchive recently.

On Jan 25, 2011, at 7:57 AM, Thomas Schmitt wrote:

> Hi,
>
> the situation now appears a bit better than first perceived in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=610783
>
> The demand of libarchive that all directory entries have to come before any content block eases the task of producing digestible addresses for symbolic links, device files, and empty data files. It suffices to let all these files point to an arbitrary block after the directory tree. In the general situation I would have to make them point to their neighbors. A much more ill situation, and also much more demanding towards the current libisofs architecture.
>
> If libarchive ever gives up the demand of directories-first, then it will have to compute its own suitable keys for the affected file types which have no data content. Another problem would be hard links, which are quite common in ISO images. So my change proposal for libarchive appears more and more cumbersome.
>
> The simpler change in libisofs will probably allow me to make it suitable for unchanged libarchive by default. A dedicated block of 2048 zero bytes should avoid any ambiguity with non-empty data files. I am currently testing an implementation sketch which looks quite trustworthy.
> Another insight: The reason why bsdtar with genisoimage did not create two hard links to vmlinuz is in my Linux. It never shows hardlink siblings in mounted ISO images because it computes the inode number from the byte address of the directory entry. Two entries = two different inode numbers. So ISO images produced from /mnt contain two copies of vmlinuz.
>
> Have a nice day :)
>
> Thomas
Bug#546185: Username lookup failures with bsdtar --chroot
After some discussion, which you can read at:

http://code.google.com/p/libarchive/issues/detail?id=69

we've decided not to do anything about this at this time. Basically, this seems like a glibc limitation; bsdtar asks glibc to do the lookup, and apparently glibc needs more than just /etc/passwd and /etc/group.

We're willing to consider arguments to the contrary; please feel free to add your comments to the bug linked just above.

Cheers,

Tim Kientzle
Bug#546185: bsdtar: doesn't warn when library open after --chroot fails
I've filed a bug on libarchive.googlecode.com to track this issue upstream:

http://code.google.com/p/libarchive/issues/detail?id=69
Bug#530301: libarchive on HURD
I've filed a bug upstream to track this:

http://code.google.com/p/libarchive/issues/detail?id=68

The UF_NODUMP issue is fixed as of libarchive 2.8.0.

I've sent a request to the original poster for clarification on the PATH_MAX issue. It's not clear from the original bug report whether HURD:

a) has no limit on the length of a path argument to system calls such as open(), stat(), etc., or
b) has some other way to determine that limit.

As soon as I get clarification on this issue, it will be quite easy to fix.

Tim Kientzle
libarchive author and maintainer
Bug#565474: [Bug-cpio] Re: Bug#565474: cpio makes device nodes into hard links when copying out of a cramfs image
Carl Miller wrote:

> On Sat, Jan 16, 2010 at 05:09:45AM +, Clint Adams wrote:
>> You mean something like this?
>>
>> -        (d->header.c_dev_min == min) )
>> +        (d->header.c_dev_min == min) &&
>> +        ((d->header.c_mode & CP_IFBLK) != CP_IFBLK) &&
>> +        ((d->header.c_mode & CP_IFCHR) != CP_IFCHR) )

These tests should look like:

    (d->header.c_mode & CP_IFMT) != CP_IFBLK

Note the use of CP_IFMT to mask the file type (which is a four-bit field).

Cheers,

Tim
Bug#565474: [Bug-cpio] Re: Bug#565474: cpio makes device nodes into hard links when copying out of a cramfs image
On Fri, Jan 15, 2010 at 08:06:54PM -0800, Carl Miller wrote:

> cramfs takes a shortcut with device nodes, and assigns them all inode 1.

I presume it also assigns nlinks == 1?

> When using cpio to copy files out of a cramfs image, cpio turns the second and all subsequent copied device nodes into hard links to the first copied-out device node, based on them all having the same st_dev and st_ino.

Another possible solution: When checking for hard links during copy-out, do not generate hardlink entries if nlinks < 2.

Tim
Bug#42158: a FreeBSD reference in disagreement with pax's behavior
>> My documentation for newc is based primarily on studying the implementation of GNU cpio. I've not found any good references for the history of this format.
>
> OK, this is good to know. I'm not saying one or the other program is wrong, but having a piece of documentation describing an implementation is of course not the same as a standard.

POSIX considers cpio to be deprecated, so there's no chance that POSIX will ever formally standardize any cpio format variant other than the odc variant documented under pax.

LSB documents this format since it's used by RPM. That's the only de jure standard I've found that discusses this particular cpio variant. Unfortunately, the LSB documentation for this format is pretty incomplete. It certainly doesn't discuss hardlink handling.

The de facto standard for this format would be the implementation of cpio that originally shipped with SVr4. I don't know if SVr4 includes any documentation for the format apart from the implementation itself. I don't have access to SVr4 source code.

Cheers,

Tim Kientzle
Bug#42158: a FreeBSD reference in disagreement with pax's behavior
> The discussion started with the OpenBSD pax implementation, which also does cpio. OpenBSD pax has the same roots as the FreeBSD one, so I suspect some of the problems are shared.

This would be Keith Muller's old combined implementation of pax/cpio/tar. Here's the situation as I understand it:

NetBSD and OpenBSD both use Keith Muller's old implementation for pax, cpio, and tar. I understand that both projects have done a lot of work on it over the years.

FreeBSD's situation is in transition:
* Uses my libarchive-based bsdtar implementation since FreeBSD 6.0. (Used GNU tar prior to that.)
* Uses GNU cpio today, but might switch to my libarchive-based bsdcpio in FreeBSD 8.0.
* Uses Keith Muller's pax implementation. (A libarchive-based pax is still a year or two out.)

I should test this bug against the FreeBSD pax (another divergent tree based on Keith Muller's work).

> I think it would be good to compare to OpenSolaris cpio, being a third independent implementation of cpio. At the moment I do not have access to one, but I'll try to set up something today.

Let us know what you find.

Cheers,

Tim Kientzle
Bug#42158: a FreeBSD reference in disagreement with pax's behavior
For Tim's reference: we're discussing pax here: http://bugs.debian.org/42158

> I think it would be good to compare to OpenSolaris cpio, being a third independent implementation of cpio. At the moment I do not have access to one, but I'll try to set up something today.

Oh, yeah. Gunnar Ritter's Heirloom toolchest (based on open-sourced AT&T code) is also a good comparison point:

http://heirloom.sourceforge.net/tools.html
Bug#42158: a FreeBSD reference in disagreement with pax's behavior
Tim Kientzle wrote:

> For Tim's reference: we're discussing pax here: http://bugs.debian.org/42158
>
>> I think it would be good to compare to OpenSolaris cpio, being a third independent implementation of cpio. At the moment I do not have access to one, but I'll try to set up something today.
>
> Oh, yeah. Gunnar Ritter's Heirloom toolchest (based on open-sourced AT&T code) is also a good comparison point: http://heirloom.sourceforge.net/tools.html

From Gunnar Ritter's cpio.1 manpage:

    The -c format was introduced with System V Release 4. Except for the file size, it imposes no practical limitations on files archived. The original SVR4 implementation stores the contents of hard linked files only once and with the last archived link. This cpio ensures compatibility with SVR4. With archives created by implementations that employ other methods for storing hard linked files, each file is extracted as a single link, and some of these files may be empty.

I'm not sure what exactly this last sentence is supposed to mean.

Tim
Bug#42158: a FreeBSD reference in disagreement with pax's behavior
> My documentation for newc is based primarily on studying the implementation of GNU cpio. I've not found any good references for the history of this format.

I'm a little unclear what pax implementation you're discussing. Based on the description below, I would suggest you test whether this program duplicates bodies for each hardlink it stores. This is easy to test: make two hardlinks to the same large file, archive them, and see if the resulting archive is twice as big as the file.

The odc (POSIX-1988) format should duplicate bodies for hardlinks. GNU cpio's implementation of newc format does not. Tar formats (including the POSIX-2001 pax extended format) do not as a rule, though the pax extended format does permit it as an option.

My sympathies for the maintainers of the pax you're discussing; it is surprisingly difficult to correctly handle all three common approaches for hardlink management within a single program.

Tim Kientzle

Daniel Kahn Gillmor wrote:

> Tim Kientzle of FreeBSD (author of libarchive, attempting to CC here) describes the cpio format here: http://people.freebsd.org/~kientzle/libarchive/man/cpio.5.txt
>
> This document states about the SVr4 (newc) format (magic 070701, which is what we're dealing with): "In this format, hardlinked files are handled by setting the filesize to zero for each entry except the last one that appears in the archive." So this interpretation is shared by at least GNU and FreeBSD, afaict.
>
> pax appears to be in disagreement with these systems as far as its creation of SVr4/newc archives goes, since it stores a non-zero filesize for each entry of a hardlinked file. It's in dangerous disagreement with GNU and FreeBSD during the unpacking stage, because it re-creates hardlinked files as 0 bytes in length if it encounters archives created by the other utilities.
Hope this is a useful reference,

--dkg

For Tim's reference: we're discussing pax here: http://bugs.debian.org/42158
Bug#494169: [Fwd: FW: Bug#494169: libarchive-dev: Please add a way to precompute (non-compressed) archive size]
Thibaut,

John Goerzen forwarded your idea to me. You can actually implement this on top of the current libarchive code quite efficiently. Use the low-level archive_write_open() call and provide your own callbacks that just count the write requests. Then go through and write the archive as usual, except skip the write_data() part (for tar and cpio formats, libarchive will automatically pad the entry with NUL bytes).

This may sound slow, but it's really not. One of the libarchive unit tests uses this approach to write 1TB archives in just a couple of seconds. (This test checks libarchive's handling of very large archives with very large entries.) Look at test_tar_large.c for the details of how this particular test works. (test_tar_large.c actually does more than just count the data, but it should give you the general idea.)

This will work very well with all of the tar and cpio formats. It won't work well with some other formats where the length does actually depend on the data.

Cheers,

Tim Kientzle

---- Original Message ----
Date: Thu, 7 Aug 2008 21:31:27 -0500
From: John Goerzen [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: FW: Bug#494169: libarchive-dev: Please add a way to precompute (non-compressed) archive size

Hi Tim,

We received the below feature request at Debian. Not sure if it is something you would be interested in implementing, but thought I'd pass it along.
-- John

----- Forwarded message from Thibaut VARENE [EMAIL PROTECTED] -----

From: Thibaut VARENE [EMAIL PROTECTED]
Date: Thu, 07 Aug 2008 17:37:10 +0200
Reply-To: Thibaut VARENE [EMAIL PROTECTED], [EMAIL PROTECTED]
To: Debian Bug Tracking System [EMAIL PROTECTED]
Subject: Bug#494169: libarchive-dev: Please add a way to precompute (non-compressed) archive size

Package: libarchive-dev
Severity: wishlist

Hi,

I thought I already reported this, but apparently I didn't, so here's the idea:

I'm the author of mod_musicindex, in which I use libarchive to send on-the-fly tar archives to remote clients. Right now, the remote client's browser cannot display any ETA / %complete for the current download, since I cannot tell beforehand what will be the exact size of the archive I'm sending them.

It would be very nice if there were some API allowing for the precomputation of the final size of a non-compressed archive, which would allow me to do something like:

archive_size = archive_size_header(a);
for (filename in file list) {
    archive_size += archive_size_addfile(filename);
    /* or using stat() and e.g. archive_size_addstat() */
}
archive_size += archive_size_footer(a);

(brainfart pseudo code, I hope you get the idea) so that in the end archive_size will be exactly the size of the output archive (header/padding included), without having to actually read files or write the archive itself. I could thus send the remote client the actual size of the data they're going to be sent beforehand.

The trick is, this size cannot be approximate: the browser will cut the transfer, even if I'm still sending data, once it has received as many bits as it was told.

I'm under the impression that since this is about non-compressed archives, and considering the structure of a tar archive, my goal should be feasible without even having to read any input file. Am I wrong?
Hope I'm quite clear, thanks for your help

T-Bone

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: hppa (parisc64)
Kernel: Linux 2.6.22.14 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

----- End forwarded message -----
Bug#494169: [Fwd: FW: Bug#494169: libarchive-dev: Please add a way to precompute (non-compressed) archive size]
Thibaut VARENE wrote:

> On Fri, Aug 8, 2008 at 8:42 AM, Tim Kientzle [EMAIL PROTECTED] wrote:
>> Thibaut, John Goerzen forwarded your idea to me. You can actually implement this on top of the current libarchive code quite efficiently. Use the low-level archive_write_open() call and provide your own callbacks that just count the write requests. Then go through and write the archive as usual, except skip the write_data() part (for tar and cpio formats, libarchive will automatically pad the entry with NUL bytes).
>
> Hum, I'm not quite sure I get this right... By "count the write requests" and "skip the write_data() part", you mean count the number of bytes that should have been written, without writing them?

Yes.

>> This may sound slow, but it's really not. One of the libarchive unit tests uses this approach to write 1TB archives in just a couple of seconds. (This test checks libarchive's handling of very large archives with very large entries.) Look at test_tar_large.c for the details of how this particular test works. (test_tar_large.c actually does more than just count the data, but it should give you the general idea.)
>
> I will have to look into that code indeed. If I get this right tho, you're basically suggesting that I read the input files twice: once without writing the data, and the second time writing the data?

No. I'm suggesting you use three passes:

1) Get the information for all of the files, create archive_entry objects.

2) Create a fake archive using the technique above. You don't need to read the file data here! After you call archive_write_close(), you'll know the size of the complete archive. (This is really just your original idea.)

3) Write the real archive as usual, including reading the actual file data and writing it to the archive.

> Arguably the second read would come from the VFS cache, but that's only assuming the server isn't too busy serving hundreds of other files, which is why I'm a bit concerned about optimality...
> My limited understanding of the tar format made me believe that it was possible to know the space taken by a given file in a tar archive just by looking at its size and adding the necessary padding bytes. Was I wrong?

You could make this work. If you're using plain ustar (no tar extensions!), then each file has the data padded to a multiple of 512 bytes and there is a 512-byte header for each file. Then you need to round the total result up to a multiple of the block size. (Default is 10240 bytes; you probably should set the block size to 512 bytes.)

> For reference, here's the (relatively short) code I use: http://www.parisc-linux.org/~varenet/musicindex/doc/html/output-tarball_8c-source.html
>
>> This will work very well with all of the tar and cpio formats. It won't work well with some other formats where the length does actually depend on the data.
>
> Yep, that was quite clear indeed ;)
>
> Thanks for your input!
Bug#474400: Testsuite failure of bsdtar
Bernhard R. Link wrote:

> My guess is that the order readdir returns them implies which file is stored as regular file and which is stored as hard link. Usually the f_ file is stored as regular file and the l_ file as hardlink. But on my filesystem the l_ file is stored as regular file and the f_ file as hardlink.

Of course, you're absolutely right. There are different length limits for the source path and the target, and I'm testing right up to those limits, so for this test it does actually matter which one gets stored in which way.

I think I see an easy way to restructure this test to avoid this problem on all platforms. I'll get that fix into 2.5.4.

Thank you for your patience. I'll let you know as soon as I have a candidate fix.

Cheers,

Tim Kientzle
Bug#474400: Testsuite failure of bsdtar
Good guess, but I don't think that explains anything, since the order in which hardlinked files get stored doesn't matter. (There are a few tests that do detailed format verification; those would be affected by the order in which files get stored, but none of those depend on the ordering of readdir().)

The reference to 92 characters in that test refers to the length of the filename not including the directory portion. (Details: "original/" is 9 characters, the link length limit for ustar is 100 characters, and the test code only verifies the final path element, hence should never see anything over 91 chars; I should probably put some more detailed comments around that part of the test code.)

Have you had a chance to try the libarchive 2.5.3b package? I've fixed a subtle issue with handling almost-too-long filenames in ustar format (which doesn't appear to explain the problems you're having) and also reworked a couple of the tests to give more information. Maybe that would shed additional light on this problem.

I've also just committed a change to the test_format_newc code that allows for a 1-second slop in that test, which should eliminate the occasional failure you mentioned. Thanks for reporting that.

Cheers,

Tim

Bernhard R. Link wrote:

> I think I found the problem:
>
> # tar -tvvf ustar/archive 2>&1 | grep 'original/f_abcdefghijklmnopqrstuvwxyzabcdefghijkl '
>
> says:
>
> hrw-r--r-- brl/brl 0 2008-05-11 11:41 original/f_abcdefghijklmnopqrstuvwxyzabcdefghijkl link to original/l_abcdefghijklmnopqrstuvwxyzabcdefghijkl
>
> This is the file with 92 characters, which is (as far as I understand test_copy.c:111) not expressable as link. I guess the reason for this is that, depending on the filesystem options, readdir returns the prior created files in some random order, so sometimes l_* gets returned before f_* (judging from the Debian buildds, on Linux actually more often than not). And thus the test case fails.
>
> Respectfully,
> Bernhard R. Link
>
> By the way, I think there is also a race condition in cpio/test/test_format_newc.c in test_format_newc. In one run, one of the assertEqualInt(t, from_hex(e + 46, 8)); checks failed for me, with the two numbers differing by one. (I guess the seconds ticked over at just the wrong moment.) Dunno if that is important enough to fix, as seldom as it seems to happen.
Bug#474400: Fwd: Bug#474400: libarchive build failures
John Goerzen wrote:

> Here's some more info on libarchive for you. Hope this helps.
>
> Paul Cannon wrote:
>
>> Running tests on: /home/paul/packages/libarchive-2.4.17/bsdtar
>> 0: test_basic
>> tar/test/test_basic.c:53: Assertion failed: Ints not equal r=256 0=0
>> Description: Error invoking /home/paul/packages/libarchive-2.4.17/bsdtar xf archive

This is the basic copy test, which simply invokes bsdtar to archive a file, a directory, a symlink, and a hardlink, and then invokes bsdtar again to restore the created archive to a different directory. For some reason, bsdtar is returning an error when dearchiving. The first question I have is whether this is because the created archive is corrupt or whether the restore process failed.

These tests are all run in a directory /tmp/bsdtar_test_X/test_basic; the files named below are all relative to this:

* 'filelist' is a list of the files to be archived
* the created archive is in 'copy/archive'
* stdout/stderr from archiving should be in 'copy/pack.out' and 'copy/pack.err'
* stdout/stderr from dearchiving should be in 'copy/unpack.out' and 'copy/unpack.err'
* the restored files should be in the 'copy/' dir

You should be able to manually try unpacking the archive with:

  $ cd /tmp/bsdtar_test_X/test_basic
  $ mkdir mytest
  $ cd mytest
  $ /home/paul/packages/libarchive-2.4.17/bsdtar xf ../copy/archive

If the archive itself seems correct, then an strace of bsdtar while extracting might be very illuminating. If this doesn't shed any light, send me the contents of /tmp/bsdtar_test_X/test_basic for one of the failed tests. Maybe there are some other clues in there.

Cheers,
Tim Kientzle
Bug#419793: libarchive-dev: archive_write_data seems to ignore wrappers in some circumstances
Bernhard R. Link wrote:

> There might be ways to help libraries to still link against libarchive without having to change their off_t to 64 bit, but using off64_t in those cases.

Actually, I believe I've laid the groundwork to make the 64-bit/32-bit issue completely transparent to software using libarchive. It requires someone knowledgeable about this area of Linux to help me fill in the few remaining pieces, but should not require any deep knowledge of libarchive internals.

The most important piece is to create three versions of archive_entry_stat() and archive_entry_copy_stat() (which copy data between a platform-native struct stat and libarchive's internal storage):

* archive_entry_stat32 and archive_entry_copy_stat32, which deal with struct stat32
* similar archive_entry_stat64 and archive_entry_copy_stat64
* archive_entry_stat() and archive_entry_copy_stat() would be defined twice: in code as synonyms for the 64-bit versions (to preserve ABI compatibility for programs using the shared libraries) and as macros that map to the 32-bit or 64-bit versions depending on the off_t size being used by the program

There are a couple of entry points defined in archive.h that use off_t directly, but those are rarely used and so are less critical. Once the above is working, similar techniques should apply to them.

Getting all of the configuration right so this builds correctly on platforms that don't have two different off_t/struct stat definitions is probably the trickiest part.

This is a low priority for me right now, though I do believe that I've factored the internals well enough to make this a feasible project for someone who knows nothing about libarchive. If anyone is interested, let me know.

Tim Kientzle
Bug#419793: libarchive-dev: archive_write_data seems to ignore wrappers in some circumstances
> Note the use of __xstat64/fopen64 for the working code instead of __xstat/fopen for the non-working code. At this point I believed it had something to do with the use of 64-bit offsets (USE_FILE_OFFSET64), but simply adding this define to the compiler command line didn't fix the issue.

I'm not familiar with USE_FILE_OFFSET64; I thought the correct incantation was this:

  gcc -D_FILE_OFFSET_BITS=64

Have you tried this?
Bug#419793: libarchive-dev: archive_write_data seems to ignore wrappers in some circumstances
Thibaut VARENE wrote:

> I'm the author and maintainer of libapache-mod-musicindex. Recently a bug was reported to me in which the tarball download implemented in mod-musicindex wouldn't work properly with Apache 2, while it did work with Apache 1.3.
>
> Note the use of __xstat64/fopen64 for the working code instead of __xstat/fopen for the non-working code.

Linux has two struct stat definitions, and code compiled with one cannot be used with the other. Libarchive always compiles with the 64-bit version so it can correctly handle very large archives. If your code is compiled to use the 32-bit version, it won't work. Apparently, httpd.h is somehow forcing your code to use the 32-bit stat, which is incompatible with libarchive.

Tim Kientzle
Bug#419793: libarchive-dev: archive_write_data seems to ignore wrappers in some circumstances
Thibaut VARENE wrote:

> bip 8192 Do I win?
> bip 8192 Do I win?
> bip 8192
>
> (Note: with libarchive 2, 'bip' (the output of archive_write) is 0, which seems a bit more coherent.)

Yes, libarchive 1 does return wrong values from archive_write. As you observed, this bug is fixed in libarchive 2.

Tim Kientzle