On 26/06/2021 00:36, Harald van Dijk wrote:
Hi,

tar --exclude results in bad archives when hardlinks are used. Consider the following:

   $ mkdir tartest
   $ echo hello > tartest/a
   $ ln tartest/a tartest/b
   $ busybox tar cf - tartest | tar tvf -
   drwxr-xr-x harald/harald     0 2021-06-26 00:25 tartest/
   -rw-r--r-- harald/harald     6 2021-06-26 00:25 tartest/b
   hrw-r--r-- harald/harald     0 2021-06-26 00:25 tartest/a link to tartest/b

This is okay. tar may either pick up a first and then detect b as a hardlink to a, or pick up b first and then detect a as a hardlink to b. On my system, it picks up b first; if a is picked up first on yours, adjust the example below accordingly. Now, exclude b:

   $ busybox tar cf - --exclude=b tartest | tar tvf -
   drwxr-xr-x harald/harald     0 2021-06-26 00:25 tartest/
   hrw-r--r-- harald/harald     0 2021-06-26 00:25 tartest/a link to tartest/b

This results in an archive in which the contents of tartest/a are missing. Extracting the archive attempts to create tartest/a as a hardlink to tartest/b, which may or may not exist in the target directory. GNU tar does not do this; it stores the contents of the file instead, which seems like a better idea to me. Can busybox be modified to do that as well?

Tested with busybox 1.33.1.

The fix seems trivial; please see the attached patch.
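
As a rough illustration of why the order of the two checks matters, here is a simplified model in C. It is not the busybox code: the names classify, file_entry and stored_as are made up for this sketch, a single int stands in for the device/inode pair, and the exclude check is an exact name comparison rather than a pattern match.

```c
#include <string.h>

/* Simplified, hypothetical model of an archiver's per-file decision.
 * None of these names come from busybox. */
enum stored_as { SKIPPED, REGULAR_FILE, HARD_LINK };

struct file_entry {
    const char *name;
    int inode;              /* stands in for the (st_dev, st_ino) pair */
};

#define MAX_SEEN 16

/*
 * Classify n entries in traversal order, writing the results to out[].
 * exclude_first == 0: record the inode, then test the exclude name.
 * exclude_first == 1: test the exclude name, then record the inode.
 */
static void classify(const struct file_entry *files, int n,
                     const char *excluded_name, int exclude_first,
                     enum stored_as *out)
{
    int seen[MAX_SEEN];
    int nseen = 0;

    for (int i = 0; i < n; i++) {
        int excluded = strcmp(files[i].name, excluded_name) == 0;

        if (exclude_first && excluded) {
            out[i] = SKIPPED;   /* inode is never recorded */
            continue;
        }

        /* Look the inode up in the list of inodes seen so far. */
        int known = 0;
        for (int j = 0; j < nseen; j++) {
            if (seen[j] == files[i].inode)
                known = 1;
        }
        if (!known && nseen < MAX_SEEN)
            seen[nseen++] = files[i].inode;

        if (excluded) {
            out[i] = SKIPPED;   /* too late: inode was already recorded */
            continue;
        }

        out[i] = known ? HARD_LINK : REGULAR_FILE;
    }
}
```

With the entries tartest/b and tartest/a (in that order) sharing one inode and tartest/b excluded, exclude_first == 0 classifies tartest/a as HARD_LINK even though its target was skipped, while exclude_first == 1 classifies it as REGULAR_FILE, which is the behaviour the patch is aiming for.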

Cheers,
Harald van Dijk
From 5d0451656ace0c21454baf1ef65bed51c647df90 Mon Sep 17 00:00:00 2001
From: Harald van Dijk <[email protected]>
Date: Sun, 27 Jun 2021 15:11:57 +0100
Subject: [PATCH] tar: exclude files before updating hardlink info list

When excluding one file, and including another file that is a hardlink
of the excluded file, it should be stored as an ordinary file.
---
 archival/tar.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/archival/tar.c b/archival/tar.c
index 4a540b77a..1f257958f 100644
--- a/archival/tar.c
+++ b/archival/tar.c
@@ -507,6 +507,9 @@ static int FAST_FUNC writeFileToTarball(struct recursive_state *state,
 	if (header_name[0] == '\0')
 		return TRUE;
 
+	if (exclude_file(tbInfo->excludeList, header_name))
+		return SKIP;
+
 	/* It is against the rules to archive a socket */
 	if (S_ISSOCK(statbuf->st_mode)) {
 		bb_error_msg("%s: socket ignored", fileName);
@@ -540,9 +543,6 @@ static int FAST_FUNC writeFileToTarball(struct recursive_state *state,
 		return TRUE;
 	}
 
-	if (exclude_file(tbInfo->excludeList, header_name))
-		return SKIP;
-
 # if !ENABLE_FEATURE_TAR_GNU_EXTENSIONS
 	if (strlen(header_name) >= NAME_SIZE) {
 		bb_simple_error_msg("names longer than "NAME_SIZE_STR" chars not supported");
-- 
2.31.1
