Re: tar is creating corrupt archives when soft links are present

2022-12-03 Thread Paul Eggert

On 2022-12-01 15:53, Dominique Martinet wrote:


> The fstatat64 structs returned from bin/awk and bin/bash are truncated,
> could you provide the same strace with '-v' ?


Yes, I'd also like to see the output with strace -v. Assuming that looks 
good, I'd then like to see what GDB says about tar, when tar calls fstatat.



> This depends on the libc but you need to build with large file support.
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 was it?


./configure should do that sort of thing automatically on a 32-bit build 
(which this one apparently is). On GNU/Linux there's no need for 
_LARGEFILE_SOURCE, but _FILE_OFFSET_BITS and _TIME_BITS should both be 
64 in recent-enough 32-bit x86 GNU/Linux. On older versions of 32-bit 
x86 GNU/Linux, _FILE_OFFSET_BITS will be 64 but _TIME_BITS will not be 
defined (and GNU tar won't work on files with timestamps after the year 
2038).
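
As a rough way to check what a given build actually sees, here is a minimal C sketch (not part of tar; the function names are made up for illustration) that reports the two macros and the resulting widths of off_t and time_t. It only tells you about the translation unit it is compiled in, so it is meaningful only if built with the same compiler and flags as tar.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Value of _FILE_OFFSET_BITS seen by this translation unit, 0 if unset. */
int file_offset_bits_macro(void)
{
#ifdef _FILE_OFFSET_BITS
  return _FILE_OFFSET_BITS;
#else
  return 0;
#endif
}

/* Value of _TIME_BITS seen by this translation unit, 0 if unset. */
int time_bits_macro(void)
{
#ifdef _TIME_BITS
  return _TIME_BITS;
#else
  return 0;
#endif
}

/* Actual widths, in bits, of the types the macros are meant to control. */
int off_t_bits(void)  { return (int) (sizeof (off_t) * 8); }
int time_t_bits(void) { return (int) (sizeof (time_t) * 8); }

/* Hypothetical helper: print one line per macro/type pair. */
void print_abi_widths(FILE *out)
{
  assert(out != NULL);
  fprintf(out, "_FILE_OFFSET_BITS=%d off_t=%d bits\n",
          file_offset_bits_macro(), off_t_bits());
  fprintf(out, "_TIME_BITS=%d time_t=%d bits\n",
          time_bits_macro(), time_t_bits());
}
```

Calling print_abi_widths(stdout) from a small main, built once with tar's flags and once with the system defaults, should show whether the two disagree.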



> (the kernel has fstatat64 so it should be recent enough, but libc might
> be too old, I didn't check since when these exist.)


One possibility is that tar was mis-built with its own _FILE_OFFSET_BITS 
or _TIME_BITS value disagreeing with that of some library that it's 
linked to. This would corrupt the data structure that 'tar' sees in its 
call to fstatat. If so, even if 'strace -v' reports correct results from 
fstatat64, 'tar' is seeing the wrong data.
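
To make the failure mode concrete, here is a small C sketch of what such a disagreement can do. The two struct layouts are HYPOTHETICAL and much simpler than any real glibc or kernel stat structure; the example only shows that reading a buffer through a layout with narrower fields makes the link count come from the wrong bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL layout used by the side that fills in the data. */
struct stat_wide {
  uint64_t st_dev;
  uint64_t st_ino;
  uint32_t st_mode;
  uint32_t st_nlink;
};

/* HYPOTHETICAL layout assumed by the side that reads the data. */
struct stat_narrow {
  uint32_t st_dev;
  uint32_t st_ino;
  uint32_t st_mode;
  uint32_t st_nlink;
};

/* A symlink with link count 1.  The inode value is chosen so that both
   32-bit halves are 2, which keeps the result independent of byte order. */
struct stat_wide example_symlink(void)
{
  struct stat_wide w;
  memset(&w, 0, sizeof w);
  w.st_dev = 0x803;
  w.st_ino = UINT64_C(0x0000000200000002);
  w.st_mode = 0120777;
  w.st_nlink = 1;
  return w;
}

/* Read the link count the way a mismatched caller would: copy the raw
   bytes into the narrow layout and look at its st_nlink field. */
uint32_t nlink_seen_through_narrow(const struct stat_wide *filled)
{
  struct stat_narrow view;
  assert(filled != NULL);
  memcpy(&view, filled, sizeof view);
  return view.st_nlink;
}
```

With these made-up layouts the reader sees a link count of 2 for a file whose real link count is 1, because its st_nlink overlaps half of the writer's st_ino. A link count above 1 is the kind of value that would send a symlink down a hard-link code path.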


On an Ubuntu 22.10 platform where tar is built with "./configure CC='gcc 
-m32'" so it's a 32-bit build, 'strace -v' works (and tar works), but 
that system is running a new-enough kernel and C library that the fstatat 
in tar's source code turns into the new statx system call.


If I do the same build on a RHEL 7.9 system I see fstatat64 syscalls and 
tar works fine.


This bug has the smell of perhaps running afoul of recent glibc changes 
to support 64-bit file timestamps on 32-bit x86. See, for example, this 
October 2020 thread:


https://sourceware.org/pipermail/libc-alpha/2020-October/118623.html

If the headers in /usr/include don't match what's actually in the C 
library, or the C library was misconfigured when it was built (e.g., 
configured for a newer kernel than the actual one), then when tar calls 
fstatat it will get garbage data and will make mistakes based on that 
garbage. This seems the most likely hypothesis here.




Re: tar is creating corrupt archives when soft links are present

2022-12-03 Thread Martin Simmons
> On Thu, 1 Dec 2022 16:14:42 -0500, Timothe Litt said:
> 
> The hard link problem reproduces with this (note the two soft links turning 
> into a soft and a hard(!) - according to tar:
> 
> # ( cd / && ls -li bin/awk bin/bash && tar -cf - bin/awk bin/bash | tar -tvf 
> - )
> 22683669 lrwxrwxrwx 1 root root  4 Nov 28 08:45 bin/awk -> gawk
> 22683657 lrwxrwxrwx 1 root root 21 Nov 28 08:45 bin/bash -> 
> ../usr/local/bin/bash
> lrwxrwxrwx root/root 0 2022-11-29 14:37 bin/awk -> gawk
> hrwxrwxrwx root/root 0 2022-11-29 14:37 bin/bash link to bin/awk
> 
> Clearly, the bin/bash (a) is not a hard link on disk, and (b) does not link 
> to bin/awk.

The timestamps also differ.

Maybe your tar is miscompiled (e.g. mixing different versions of the
stat structure)?

If you can recompile it, then you could try adding the patch below to
make it print more info when creating the tar file.

--
--- src/create.c	2021-02-04 14:00:33.0 +
+++ src/create.c	2022-12-03 16:28:08.469456354 +
@@ -1471,6 +1471,11 @@
 static bool
 dump_hard_link (struct tar_stat_info *st)
 {
+  fprintf (stdlis, "%s %llu %llu %llu\n",
+           st->file_name,
+           (unsigned long long)st->stat.st_nlink,
+           (unsigned long long)st->stat.st_ino,
+           (unsigned long long)st->stat.st_dev);
   if (link_table
       && (trivial_link_count < st->stat.st_nlink || remove_files_option))
     {
--

Then run it with:

( cd / && stat bin/awk bin/bash && tar -cf - bin/awk bin/bash | tar -tvf - )
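
If rebuilding tar is awkward, a standalone check along the same lines may help. This is only a sketch (print_link_info is a made-up name, not a tar function): it calls lstat on a path and prints the same three fields as the patch above, so its output can be compared with what stat(1) reports. It is only informative if compiled with the same compiler, flags and headers that were used for tar.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Print "path nlink ino dev" for PATH without following symlinks.
   Returns 0 on success, -1 if lstat fails (the error goes to stderr). */
int print_link_info(const char *path, FILE *out)
{
  struct stat st;

  assert(path != NULL && out != NULL);
  if (lstat(path, &st) != 0)
    {
      perror(path);
      return -1;
    }
  fprintf(out, "%s %llu %llu %llu\n", path,
          (unsigned long long) st.st_nlink,
          (unsigned long long) st.st_ino,
          (unsigned long long) st.st_dev);
  return 0;
}
```

Calling it on bin/awk and bin/bash from / should print a link count of 1 and two different inode numbers if the stat structure is being read correctly.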

__Martin



Re: tar is creating corrupt archives when soft links are present

2022-12-01 Thread Dominique Martinet
Timothe Litt wrote on Thu, Dec 01, 2022 at 04:14:42PM -0500:
> The attached "hardlink_strace.txt" comes from a simplified command to reduce
> volume, but it should show the same syscalls:
> 
>  ( cd / && strace 2>hardlink_strace.txt tar -cf - bin/awk bin/bash
> >/dev/null )

The fstatat64 structs returned from bin/awk and bin/bash are truncated,
could you provide the same strace with '-v' ?

It's obvious the files are different as st_size is different and
corresponds to the ls, but it'd be better to make sure.

> Finally, an unrelated (except that it hit this incident and prevented an easy
> restore) issue: *tar skips some large files with*
>
> |tar: root/sd/sd.tar.gz: Cannot stat: Value too large for defined data type|
> |-rw-r--r-- 1 root root 32251081571 May 6 2007 /root/sd/sd.tar.gz|

This depends on the libc but you need to build with large file support.
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 was it?
(the kernel has fstatat64 so it should be recent enough, but libc might
be too old, I didn't check since when these exist.)
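
For what it's worth, "Value too large for defined data type" is the text for EOVERFLOW, which stat gives when the file size cannot be represented in off_t. A minimal sketch of the arithmetic, assuming a signed 32-bit off_t (the function name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if SIZE (in bytes) is representable in a signed 32-bit off_t,
   i.e. what a build without large file support would have. */
int fits_in_32bit_off_t(int64_t size)
{
  assert(size >= 0);
  return size <= INT32_MAX;
}
```

The 32251081571-byte sd.tar.gz is far above the 2147483647-byte limit, so a tar built with a 32-bit off_t cannot stat it.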

-- 
Dominique



Re: tar is creating corrupt archives when soft links are present

2022-12-01 Thread Timothe Litt
Thanks for the quick response.   I think I've cut this down even further 
based on your suggestion.  Here are the results, and some additional 
context.  Also, another problem with creating archives - very large 
files are skipped.


I've seen this on Fedora Core 4; the report is on FC 6.  (Yes, they're 
old.  But tar is new, built from source downloaded from ftp.gnu.org.)


The disk volume is a newly created (VirtualBox) vdi; 2 partitions,  
ext3, with the root mounted on hda2.  (boot is on hda1).


The file structure was initialized on a newer Linux machine and the 
archive extracted.  It's been a long few days, I don't remember if it 
was fc34 or debian...both were involved in putting things back together.


The original reproducer is cut down from about 130G (a 34G compressed 
archive).  There are "only" 107 files in /bin.


Here is the information from your suggestions.

The hard link problem reproduces with this (note the two soft links 
turning into a soft and a hard(!) link, according to tar):


# ( cd / && ls -li bin/awk bin/bash && tar -cf - bin/awk bin/bash | tar -tvf - )
22683669 lrwxrwxrwx 1 root root  4 Nov 28 08:45 bin/awk -> gawk
22683657 lrwxrwxrwx 1 root root 21 Nov 28 08:45 bin/bash -> ../usr/local/bin/bash
lrwxrwxrwx root/root 0 2022-11-29 14:37 bin/awk -> gawk
hrwxrwxrwx root/root 0 2022-11-29 14:37 bin/bash link to bin/awk

Clearly, the bin/bash (a) is not a hard link on disk, and (b) does not 
link to bin/awk.


The attached "hardlink_strace.txt" comes from a simplified command to 
reduce volume, but it should show the same syscalls:


 ( cd / && strace 2>hardlink_strace.txt tar -cf - bin/awk bin/bash 
>/dev/null )


A full ls -li is in full_ls.txt

In extract_from_tar_archive_showing_extent.txt are the first ~1900 lines 
of tar -tvf output from an archive that merged all the soft links to "vi" 
when extracted to disk.  Note that the listing (a) shows the links as hard 
links (they were all soft on the original disk), and (b) shows the links 
as to "bin/ex", when in fact they were extracted as "vi".


To me, this all points to soft links being processed as if they were 
hard - mostly.


Going further with the toy example, we see that while tar reports the 
links as hard, they are extracted as soft, but with the wrong target for 
the second link.


foo]# ( cd / && tar -cf - bin/awk bin/bash | tar -C /root/foo -xvf -  )
bin/awk
bin/bash
foo]# ls -li bin                   !! This is bin extracted from the archive
total 0
17418579 lrwxrwxrwx 2 root root 4 Dec  1 15:23 awk -> gawk
17418579 lrwxrwxrwx 2 root root 4 Dec  1 15:23 bash -> gawk
foo]# ls -li /bin/awk /bin/bash    !! This is the bin that was archived
22683669 lrwxrwxrwx 1 root root  4 Nov 28 08:45 /bin/awk -> gawk
22683657 lrwxrwxrwx 1 root root 21 Nov 28 08:45 /bin/bash -> ../usr/local/bin/bash

To close the shell wildcard lead: if we now use (shell) wildcards, which 
pick up a couple of extra files, note that the bash link (to 
../usr/local...) is still extracted as a soft link to gawk.


Here's the modified test case:

foo]# ( cd / && tar -cf - bin/aw* bin/bas* | tar -C /root/foo -xvf -  )
bin/awk
bin/basename
bin/bash
bin/bash.old
:foo]# ls -li bin
total 732
17418579 lrwxrwxrwx 2 root root      4 Dec  1 15:32 awk -> gawk
17418580 -rwxr-xr-x 1 root root  18484 Oct 31  2007 basename
17418579 lrwxrwxrwx 2 root root      4 Dec  1 15:32 bash -> gawk
17418581 -rwxr-xr-x 1 root root 722684 Jul 12  2006 bash.old

An strace of the above in strace_wild.txt was obtained as shown below 
(the inode #s are different)


foo]# ( cd / && ls -li bin/aw* bin/bas* && strace 2>/root/strace_wild.txt tar -cf - bin/aw* bin/bas* >/dev/null  )
22683669 lrwxrwxrwx 1 root root      4 Nov 28 08:45 bin/awk -> gawk
22683748 -rwxr-xr-x 1 root root  18484 Oct 31  2007 bin/basename
22683657 lrwxrwxrwx 1 root root     21 Nov 28 08:45 bin/bash -> ../usr/local/bin/bash
22683691 -rwxr-xr-x 1 root root 722684 Jul 12  2006 bin/bash.old
foo]# ls -li bin/
total 732
17418579 lrwxrwxrwx 2 root root      4 Dec  1 15:32 awk -> gawk
17418580 -rwxr-xr-x 1 root root  18484 Oct 31  2007 basename
17418579 lrwxrwxrwx 2 root root      4 Dec  1 15:32 bash -> gawk
17418581 -rwxr-xr-x 1 root root 722684 Jul 12  2006 bash.old

Also, while I didn't keep the build directory for tar, I did keep the 
configure cache file, which may be helpful.


Not sure if I can recover what's left of the original disk; will try if 
necessary.  But I think this work has cut the problem down.


a) tar is confused about soft links.
b) It reports soft links as hard in -t output, but extracts them 
as soft.
c) The extract uses the wrong target in the soft link: the target of 
the first soft link that it sees.


# uname -a
Linux  2.6.22.14-100 #1 SMP Wed Apr 8 18:07:54 EDT 2015 i686 i686 i386 GNU/Linux

Finally, an unrelated (except that it hit