On 8/13/25 14:51, Collin Funk wrote:
> the addition of many stat calls isn't ideal

Yes, we should do better than that.

I drafted something that avoids most stat calls, fixes the obscure security vulnerabilities in my earlier proposal, and prevents the vulnerability even when users don't follow the current instructions in the manual; see attached. However, it's not foolproof enough yet, in that it requires people to use the same options on each extraction. Also, the code is pretty complicated. So although I'm attaching a revised patch, I am not installing it yet, and I plan to think further about how to do something better along those lines.
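The attached patch replaces a suspicious link target with a safer one: a leading '/' becomes '_', and a '..' component becomes '_.'. To make that concrete, here is a simplified sketch of the textual rewrite only. It is not the patch's code: the function name is made up for illustration, and it rewrites every leading '/' and every '..' component unconditionally, whereas the patch sanitizes only links that might escape the extraction directory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: rewrite TARGET in place,
   turning a leading '/' into '_' and every ".." component into "_.".
   Unlike the patch, this does not check whether the link can escape.  */
static void
sanitize_target_sketch (char *target)
{
  assert (target != NULL);

  if (target[0] == '/')
    target[0] = '_';

  for (size_t i = 0; target[i]; i++)
    {
      /* A component starts at the beginning or just after a slash.  */
      int at_start = i == 0 || target[i - 1] == '/';

      /* Match exactly "..", followed by a slash or the end of string.  */
      if (at_start && target[i] == '.' && target[i + 1] == '.'
          && (target[i + 2] == '/' || target[i + 2] == '\0'))
        target[i] = '_';
    }
}
```

With this sketch, '../victimdir' becomes '_./victimdir' and 'dotdot/../victimdir' becomes 'dotdot/_./victimdir', which are the replacement targets that the new test tests/extrac31.at expects in its diagnostics.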

> Here is the diff from the nist.gov page [1].
>
> [1] https://github.com/ip7z/7zip/compare/25.00...25.01

Unfortunately that doesn't help us much, as it doesn't really say what 7-zip's vulnerability was, or why that change fixed it, or whether 7-zip has further vulnerabilities (all too likely in this area).
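The patch that follows skips link sanitization when the extraction directory is a root, since the archive can extract anywhere in that case anyway. It treats a directory as a root when the directory is its own parent. A minimal sketch of that test, assuming a POSIX system; the function name is hypothetical, and the patch's actual chdir_is_root differs in that it caches the result and works relative to tar's chdir_fd:

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not the patch's chdir_is_root: return true if DIR
   is known to be a root directory, i.e., it is its own parent.
   Return false if DIR is not a root or if the test fails.  */
static bool
dir_is_root_sketch (char const *dir)
{
  assert (dir != NULL);

  int fd = open (dir, O_RDONLY | O_DIRECTORY);
  if (fd < 0)
    return false;

  /* Compare the identity (device and inode) of DIR with that of DIR/..  */
  struct stat self, parent;
  bool is_root = (fstat (fd, &self) == 0
                  && fstatat (fd, "..", &parent, 0) == 0
                  && self.st_dev == parent.st_dev
                  && self.st_ino == parent.st_ino);
  close (fd);
  return is_root;
}
```

As in the patch, a failed test counts as "not a root", which is the conservative answer: the link checks then stay enabled.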
From 0254f53936f10286ecee9d75c3b6990f735d2fdd Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Wed, 13 Aug 2025 18:47:45 -0700
Subject: [PATCH] =?UTF-8?q?Don=E2=80=99t=20extract=20suspicious=20links=20?=
 =?UTF-8?q?as=20is?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Also, new option --absolute-links to restore old behavior.
CVE-2025-45582 reported by Lior Kaplan in:
https://lists.gnu.org/r/bug-tar/2025-08/msg00000.html
Also see:
https://nvd.nist.gov/vuln/detail/CVE-2025-45582
https://github.com/i900008/vulndb/blob/main/Gnu_tar_vuln.md
* src/create.c (dump_hard_link, file_count_links):
* src/extract.c (extract_link, extract_symlink):
Use --absolute-links, not --absolute-names, to decide
whether to worry about security of link contents.
* src/extract.c (extract_link): Treat a hard link to an absolute
file name like a hard link to a name containing "..".
* src/extract.c (extract_link, extract_symlink):
If the extraction directory is root, do not worry
about security of link contents, since the archive
can extract anywhere anyway.
(suspicious_dot_dot_parent, sanitize_link): New functions.
(apply_delayed_link): New args BUF and BUFSIZE.  Caller changed.
Sanitize links before using them.
* src/list.c (decode_xform): Do not worry about leading / here.
* src/misc.c: Include same-inode.h.
(struct wd): New member is_root.
(grow_wd, chdir_arg): Initialize it.
(chdir_is_root): New function.
* src/tar.c (absolute_links_option): New static var.
(ABSOLUTE_LINKS_OPTION): New constant.
(options, parse_opt): Support --absolute-links.
* tests/extrac07.at (extracting symlinks to a read-only dir):
Do not rely on symlinks escaping the extraction dir.
* tests/extrac22.at (delay-directory-restore on reversed ordering):
Use --absolute-links, since the test requires escaping the
extraction directory.
* tests/extrac31.at: New file.
* tests/Makefile.am (TESTSUITE_AT):
* tests/testsuite.at: Add it.

chdir_id refactoring
This prepares for future changes that need directory IDs.
* src/common.h (struct chdir_id): New struct.
* src/extract.c (extract_dir): Use chdir_id to avoid duplicate stats.
* src/misc.c (struct wd): New member ID.
(grow_wd): New function, extracted from chdir_arg and that
also initializes id.err.
(chdir_arg): Use it.  Initialize id.err.
(chdir_id): New function.
---
 NEWS               |  18 +++-
 doc/tar.texi       | 133 ++++++++++++++----------
 src/common.h       |   2 +
 src/create.c       |   4 +-
 src/extract.c      | 247 +++++++++++++++++++++++++++++++++++++++++----
 src/list.c         |   4 +-
 src/misc.c         |  26 +++++
 src/tar.c          |   9 +-
 tests/Makefile.am  |   1 +
 tests/extrac07.at  |  16 ++-
 tests/extrac22.at  |   8 +-
 tests/extrac31.at  |  63 ++++++++++++
 tests/testsuite.at |   1 +
 13 files changed, 440 insertions(+), 92 deletions(-)
 create mode 100644 tests/extrac31.at

diff --git a/NEWS b/NEWS
index 9a10b8b8..bdc8e63a 100644
--- a/NEWS
+++ b/NEWS
@@ -1,9 +1,21 @@
-GNU tar NEWS - User visible changes. 2025-07-26
+GNU tar NEWS - User visible changes. 2025-08-11
 Please send GNU tar bug reports to <bug-tar@gnu.org>
 
 version TBD
 
-* New manual section "Reproducibility", for reproducible tarballs.
+* By default, tar no longer extracts suspicious links as-is.
+
+These link to locations outside the extraction directory.
+They can be hard or symbolic links.  Instead of extracting them,
+tar now diagnoses the issue and creates safer replacement symbolic
+links to targets that start with '_' instead of '/', and that have
+'_.'  instead of '..'.
+
+* New option: --absolute-links
+
+This re-enables the traditional, unsafe behavior of extracting
+suspicious links as-is.  This new option is implied by
+--absolute-names (-P).
 
 * New options: --set-mtime-command and --set-mtime-format
 
@@ -33,6 +45,8 @@ empty string that file or member is skipped and a warning is printed.
 The warning can be suppressed using the --warning=empty-transform
 option.
 
+* New manual section "Reproducibility", for reproducible tarballs.
+
 * Bug fixes
 
 ** Fixed O(n^2) time complexity bug for large numbers of directories when
diff --git a/doc/tar.texi b/doc/tar.texi
index 2fe2c45c..b4a8dcae 100644
--- a/doc/tar.texi
+++ b/doc/tar.texi
@@ -1938,8 +1938,9 @@ prior to the execution of the @command{tar} command.
 working directory.  @command{tar} will make all file names relative
 (by removing leading slashes when archiving or restoring files),
 unless you specify otherwise (using the @option{--absolute-names}
-option).  @xref{absolute}, for more information about
-@option{--absolute-names}.
+option).  Also, @command{tar} treats links to absolute names specially
+unless you specify @option{--absolute-links} or
+@option{--absolute-names}.  @xref{absolute}.
 
 If you give the name of a directory as either a file name or a member
 name, then @command{tar} acts recursively on all the files and directories
@@ -2447,6 +2448,14 @@ exist in the archive. @xref{update}.
 
 @table @option
 
+@opsummary{absolute-links}
+@item --absolute-links
+
+Normally when extracting from an archive, @command{tar} treats links
+specially if they link to names that start with @samp{/} or contain @samp{..}.
+This option disables that behavior.  This option is implied by
+@option{--absolute-names}.  @xref{absolute}.
+
 @opsummary{absolute-names}
 @item --absolute-names
 @itemx -P
@@ -2454,7 +2463,8 @@ exist in the archive. @xref{update}.
 Normally when creating an archive, @command{tar} strips an initial
 @samp{/} from member names, and when extracting from an archive @command{tar}
 treats names specially if they have initial @samp{/} or internal
-@samp{..}.  This option disables that behavior.  @xref{absolute}.
+@samp{..}.  This option disables that behavior.  Also, this option
+implies @option{--absolute-links}.  @xref{absolute}.
 
 @opsummary{acls}
 @item --acls
@@ -9527,10 +9537,17 @@ The interpretation of options in file lists is disabled by
 @cindex file names, absolute
 
 By default, @GNUTAR{} drops a leading @samp{/} on
-input or output, and complains about file names containing a @file{..}
-component.  There is an option that turns off this behavior:
+input or output and complains about file names containing a
+@file{..}@: component.  Also, when extracting it treats links
+specially if they start with @samp{/} or contain @file{..}.
+Two options turn off this behavior:
 
 @table @option
+@opindex absolute-links
+@item --absolute-links
+When extracting, do not sanitize links to names that start with
+@samp{/} or contain a @file{..}@: component.
+
 @opindex absolute-names
 @item --absolute-names
 @itemx -P
@@ -9568,49 +9585,50 @@ for the information on how to handle this case.}.
 Symbolic links containing @file{..} or leading @samp{/} can also cause
 problems when extracting, so @command{tar} normally extracts them last;
 it may create empty files as placeholders during extraction.
+Also, @command{tar} ordinarily refuses to extract a hard or symbolic link to a
+file name that starts with @samp{/} or contains a @file{..}@:
+if the link might point outside the extraction directory,
+as these links are suspicious and can be used to attack your system.
+Instead, @command{tar} diagnoses the issue and extracts a replacement
+symbolic link that contains @samp{_} instead of leading @samp{/}
+and @samp{_.}@: instead of @samp{..}.
+This default behavior lets you use @command{tar} to extract multiple
+untrusted archives into the same newly created directory,
+so long as all the extractions are consistent about whether they use
+@option{--keep-newer-files}, @option{--keep-old-files} (@option{-k}),
+@option{--skip-old-files}, @option{--overwrite},
+@option{--overwrite-dir}, @option{--no-overwrite-dir},
+@option{--recursive-unlink}, and @option{--unlink-first} (@option{-U}).
+
+If you use @option{--absolute-links}, @command{tar} extracts
+suspicious links as-is.  This option lets @command{tar} create links
+that might cause a later program (including @command{tar} itself) to
+modify files outside the extraction directory.  It is unsafe to use
+@option{--absolute-links} when extracting from an untrusted archive if
+a later extraction from an untrusted archive will be done to the same
+directory.
 
 If you use the @option{--absolute-names} (@option{-P}) option,
-@command{tar} will do none of these transformations.
-
-To archive or extract files relative to the root directory, specify
-the @option{--absolute-names} (@option{-P}) option.
-
-Normally, @command{tar} acts on files relative to the working
-directory---ignoring superior directory names when archiving, and
-ignoring leading slashes when extracting.
-
-When you specify @option{--absolute-names} (@option{-P}),
-@command{tar} stores file names including all superior directory
-names, and preserves leading slashes.  If you only invoked
-@command{tar} from the root directory you would never need the
+@command{tar} does not transform file names and extracts suspicious links as-is.
+This option implies @option{--absolute-links},
+and is unsafe when extracting from untrusted archives.
+It lets you archive or extract files relative to the root directory.
+If you invoked
+@command{tar} only from the root directory you would never need the
 @option{--absolute-names} option, but using this option
 may be more convenient than switching to root.
 
 @FIXME{Should be an example in the tutorial/wizardry section using this
 to transfer files between systems.}
 
-@table @option
-@item --absolute-names
-Preserves full file names (including superior directory names) when
-archiving and extracting files.
-
-@end table
-
+When you specify @option{--absolute-names} (@option{-P}),
 @command{tar} prints out a message about removing the @samp{/} from
 file names.  This message appears once per @GNUTAR{}
 invocation.  It represents something which ought to be told; ignoring
 what it means can cause very serious surprises, later.
 
-Some people, nevertheless, do not want to see this message.  Wanting to
-play really dangerously, one may of course redirect @command{tar} standard
-error to the sink.  For example, under @command{sh}:
-
-@smallexample
-$ @kbd{tar -c -f archive.tar /home 2> /dev/null}
-@end smallexample
-
-@noindent
-Another solution, both nicer and simpler, would be to change to
+Some people, nevertheless, do not want to see this message.
+To suppress it, change to
 the @file{/} directory first, and then avoid absolute notation.
 For example:
 
@@ -9619,7 +9637,7 @@ $ @kbd{tar -c -f archive.tar -C / home}
 @end smallexample
 
 @xref{Integrity}, for some of the security-related implications
-of using this option.
+of using these options.
 
 @include parse-datetime.texi
 
@@ -13132,17 +13150,21 @@ under the working directory.  If the working directory contains a
 symbolic link to another directory, the untrusted user can also write
 into any file under the referenced directory.  When extracting from an
 untrusted archive, it is therefore good practice to create an empty
-directory and run @command{tar} in that directory.
-
-When extracting from two or more untrusted archives, each one should
-be extracted independently, into different empty directories.
-Otherwise, the first archive could create a symbolic link into an area
-outside the working directory, and the second one could follow the
-link and overwrite data that is not under the working directory.  For
-example, when restoring from a series of incremental dumps, the
-archives should have been created by a trusted process, as otherwise
-the incremental restores might alter data outside the working
-directory.
+directory and run @command{tar} in that directory.  Do not use
+the @option{--absolute-names} (@option{-P}) option, as that would
+let the untrusted archive write anywhere.
+
+When extracting from two or more untrusted archives, you can start
+with an empty directory and extract from them one at a time,
+using the same @command{tar} options each time.
+Later archives will ordinarily override earlier ones.
+Do not use the @option{--absolute-links} or
+@option{--absolute-names} (@option{-P}) options,
+and be consistent about whether you use other options
+like @option{--keep-old-files} and @option{--unlink-first};
+otherwise, one of the archives could create a symbolic link into an area
+outside the working directory, and a later archive could follow the
+link and overwrite data that is not under the working directory.
 
 If you use the @option{--absolute-names} (@option{-P}) option when
 extracting, @command{tar} respects any file names in the archive, even
@@ -13151,16 +13173,15 @@ lets the archive overwrite any file in your system that you can write,
 the @option{--absolute-names} (@option{-P}) option should be used only
 for trusted archives.
 
-Conversely, with the @option{--keep-old-files} (@option{-k}) and
+Some options may help when extracting from untrusted archives.
+With the @option{--keep-old-files} (@option{-k}) and
 @option{--skip-old-files} options, @command{tar} refuses to replace
 existing files when extracting.  The difference between the two
 options is that the former treats existing files as errors whereas the
 latter just silently ignores them.
-
-Finally, with the @option{--no-overwrite-dir} option, @command{tar}
+With the @option{--no-overwrite-dir} option, @command{tar}
 refuses to replace the permissions or ownership of already-existing
-directories.  These options may help when extracting from untrusted
-archives.
+directories.
 
 @node Live untrusted data
 @subsection Dealing with Live Untrusted Data
@@ -13230,7 +13251,10 @@ $ @kbd{tar -xvf /archives/got-it-off-the-net.tar.gz}
 @end group
 @end example
 
-As a corollary, do not do an incremental restore from an untrusted archive.
+@item
+Do not do an incremental restore from an untrusted archive
+unless you use the same options for all @command{tar} invocations,
+starting from an empty directory.
 
 @item
 Do not let untrusted users access files extracted from untrusted
@@ -13250,7 +13274,8 @@ When archiving live file systems, monitor running instances of
 @command{tar} to detect denial-of-service attacks.
 
 @item
-Avoid unusual options such as @option{--absolute-names} (@option{-P}),
+Avoid unusual options such as @option{--absolute-links},
+@option{--absolute-names} (@option{-P}),
 @option{--dereference} (@option{-h}), @option{--overwrite},
 @option{--recursive-unlink}, and @option{--remove-files} unless you
 understand their security implications.
diff --git a/src/common.h b/src/common.h
index d72f3ec8..815472be 100644
--- a/src/common.h
+++ b/src/common.h
@@ -102,6 +102,7 @@ extern enum archive_format archive_format;
 extern idx_t blocking_factor;
 extern idx_t record_size;
 
+extern bool absolute_links_option;
 extern bool absolute_names_option;
 
 /* Display file times in UTC */
@@ -757,6 +758,7 @@ extern int chdir_fd;
 idx_t chdir_arg (char const *dir);
 void chdir_do (idx_t dir);
 struct chdir_id { int err; dev_t st_dev; ino_t st_ino; } chdir_id (void);
+bool chdir_is_root (void);
 idx_t chdir_count (void);
 
 void close_diag (char const *name);
diff --git a/src/create.c b/src/create.c
index 078c77e1..05f594fa 100644
--- a/src/create.c
+++ b/src/create.c
@@ -1438,7 +1438,7 @@ dump_hard_link (struct tar_stat_info *st)
 	{
 	  /* We found a link.  */
 	  char const *link_name = safer_name_suffix (duplicate->name, true,
-	                                             absolute_names_option);
+	                                             absolute_links_option);
 	  if (duplicate->nlink)
 	    duplicate->nlink--;
 
@@ -1478,7 +1478,7 @@ file_count_links (struct tar_stat_info *st)
       struct link *lp;
 
       assign_string (&linkname, safer_name_suffix (st->orig_file_name, true,
-						   absolute_names_option));
+						   absolute_links_option));
       if (!transform_name (&linkname, XFORM_LINK))
 	{
 	  free (linkname);
diff --git a/src/extract.c b/src/extract.c
index 3b2d0d5d..e162f0d5 100644
--- a/src/extract.c
+++ b/src/extract.c
@@ -1530,7 +1530,9 @@ extract_link (char *file_name, MAYBE_UNUSED char typeflag)
 
   link_name = current_stat_info.link_name;
 
-  if ((! absolute_names_option && 0 <= first_dot_dot (link_name))
+  if ((!absolute_links_option && !chdir_is_root ()
+       && (IS_ABSOLUTE_FILE_NAME (link_name)
+	   || 0 <= first_dot_dot (link_name)))
       || find_delayed_link_source (link_name))
     return create_placeholder_file (file_name, false, &interdir_made);
 
@@ -1591,13 +1593,14 @@ static bool
 extract_symlink (char *file_name, MAYBE_UNUSED char typeflag)
 {
   bool interdir_made = false;
+  char const *link_name = current_stat_info.link_name;
 
-  if (! absolute_names_option
-      && (IS_ABSOLUTE_FILE_NAME (current_stat_info.link_name)
-	  || 0 <= first_dot_dot (current_stat_info.link_name)))
+  if (!absolute_links_option && !chdir_is_root ()
+      && (IS_ABSOLUTE_FILE_NAME (link_name)
+	  || 0 <= first_dot_dot (link_name)))
     return create_placeholder_file (file_name, true, &interdir_made);
 
-  while (symlinkat (current_stat_info.link_name, chdir_fd, file_name) < 0)
+  while (symlinkat (link_name, chdir_fd, file_name) < 0)
     switch (maybe_recoverable (file_name, false, &interdir_made))
       {
       case RECOVER_OK:
@@ -1619,7 +1622,7 @@ extract_symlink (char *file_name, MAYBE_UNUSED char typeflag)
 	      }
 	    return extract_link (file_name, typeflag);
 	  }
-	symlink_error (current_stat_info.link_name, file_name);
+	symlink_error (link_name, file_name);
 	return false;
       }
 
@@ -1877,9 +1880,201 @@ extract_archive (void)
     undo_last_backup ();
 }
 
-/* Extract the link DS whose final extraction was delayed.  */
+/* Return true if NAME/.. is suspicious with respect to the extraction
+   directory EXTDIRID, as the ".." might escape from that directory.  */
+
+static bool
+suspicious_dot_dot_parent (char const *name, struct chdir_id extdirid)
+{
+  /* If the extraction directory's ID is unknown, assume the worst.  */
+  if (extdirid.err)
+    return true;
+
+  struct stat st;
+
+  /* If tar does not remove files, NAME is troublesome if its status is unknown
+     (e.g., it does not exist), or it is the extraction directory.  */
+  if (old_files_option == KEEP_OLD_FILES || old_files_option == SKIP_OLD_FILES)
+    return fstatat (chdir_fd, name, &st, 0) < 0 || SAME_INODE (extdirid, st);
+
+  /* Tar removes files, so NAME is troublesome unless it is known to
+     be a nonempty directory that is not the extraction directory.
+     Do not worry about --recursive-unlink or symlinks in NAME's components,
+     as the caller dealt with that.  */
+
+  int fd = openat (chdir_fd, name,
+		   O_RDONLY | O_DIRECTORY | O_BINARY | O_CLOEXEC);
+  if (fd < 0)
+    return true;
+  bool trouble = true;
+  DIR *dirp = (fstat (fd, &st) < 0 || SAME_INODE (extdirid, st)
+	       ? NULL
+	       : fdopendir (fd));
+  if (!dirp)
+    close (fd);
+  else
+    {
+      for (struct dirent *d; trouble && (d = readdir (dirp)); )
+	trouble = d->d_name[0] == '.' && !d->d_name[1 + (d->d_name[1] == '.')];
+      closedir (dirp);
+    }
+  return trouble;
+}
+
+/* Sanitize the proposed link TARGET from SOURCE
+   if the resulting link might escape the extraction directory.
+   Assume the extraction directory contains no escaping symlinks.
+   If sanitization occurs diagnose it, using IS_SYMLINK to tailor wording.
+   Use *ABUF, of size *BUFSIZE, for temporary heap-allocated storage.
+   Perhaps modify TARGET's contents, but restore them before returning.
+   Be conservative: sanitize if it is unknown whether the link escapes.
+   Return the sanitized name, which is either TARGET or in *ABUF.  */
+
+static char const *
+sanitize_link (char *target, char const *source, bool is_symlink,
+	       char **abuf, idx_t *bufsize)
+{
+  /* No link can escape the root directory.  */
+  if (chdir_is_root ())
+    return target;
+
+  struct chdir_id extdirid = chdir_id ();
+  bool suspicious_target = IS_ABSOLUTE_FILE_NAME (target);
+
+  /* If the target is absolute and its prefix names the extraction
+     directory without using symlinks, the length of that prefix;
+     otherwise zero.  */
+  idx_t extdirlen = 0;
+
+  /* An absolute target is safe if its prefix is the extraction directory
+     and the corresponding suffix lacks "..".  */
+  if (suspicious_target && !extdirid.err)
+    for (idx_t i = FILE_SYSTEM_PREFIX_LEN (target); target[i]; i++)
+      if (FILE_SYSTEM_PREFIX_LEN (target) < i
+	  && !ISSLASH (target[i - 1])
+	  && (ISSLASH (target[i]) || !target[i]))
+	{
+	  /* Get the file status of this target prefix.
+	     Do not follow symlinks, as they might go within
+	     the extraction subtree which is vulnerable.
+	     The fstatat call is equivalent to lstat (target, &st)
+	     but avoids the hassle of dynamically linking lstat.  */
+	  char target_i = target[i];
+	  target[i] = '\0';
+	  struct stat st;
+	  bool ok = 0 <= fstatat (AT_FDCWD, target, &st, AT_SYMLINK_NOFOLLOW);
+	  target[i] = target_i;
+
+	  if (!ok)
+	    {
+	      /* A link to a nonexistent target does not escape.  */
+	      if (errno == ENOENT || errno == ENOTDIR)
+		return target;
+
+	      /* A link to an otherwise-erroneous target might be trouble.
+		 For example, ELOOP might occur here but not in a kernel
+		 with a higher tolerance for long symlink chains.  */
+	      break;
+	    }
+
+	  /* A symlink might go within the extraction tree, which
+	     might be trouble.  Don't try to resolve the symlink by hand,
+	     as this might run afoul of ELOOP or ENAMETOOLONG issues.  */
+	  if (S_ISLNK (st.st_mode))
+	    break;
+
+	  /* A non-symlink non-directory cannot act as a directory.  */
+	  if (!S_ISDIR (st.st_mode))
+	    return target;
+
+	  if (SAME_INODE (st, extdirid))
+	    {
+	      /* The first I bytes of the target are an absolute name
+		 of the extraction directory, and this name contains
+		 no symlinks.  */
+	      extdirlen = i;
+	      suspicious_target = false;
+	      break;
+	    }
+	}
+
+  /* Build the name of the target, which is absolute if EXTDIRLEN is nonzero,
+     and is relative to the extraction directory otherwise.  */
+  idx_t slen = extdirlen ? 0 : last_component (source) - source,
+    tlen = strlen (target), size = slen + tlen + 1;
+  if (*bufsize < size)
+    {
+      free (*abuf);
+      *abuf = xpalloc (NULL, bufsize, size - *bufsize, -1, 1);
+    }
+  char *name = *abuf;
+  memcpy (mempcpy (name, source, slen), target, tlen + 1);
+  if (suspicious_target)
+    name[slen] = '_';
+
+  bool ancestor_symlink = false;
+
+  /* If NAME contains a suspicious ".." turn it and any later ".."s into "_.".
+     A ".." is suspicious if its parent is the extraction directory, or
+     is a directory entry that tar might later replace with a symlink
+     to the extraction directory, or if any file name components in
+     the path from the extraction directory to the parent are symlinks.  */
+  for (ptrdiff_t i = extdirlen; name[i]; i++)
+    {
+      bool at_dot2 = (name[i] == '.'
+		      && (ISSLASH (name[i + 1]) || !name[i + 1])
+		      && extdirlen <= i - 1
+		      && name[i - 1] == '.'
+		      && (extdirlen == i - 1 || ISSLASH (name[i - 2])));
+
+      if (!ancestor_symlink && !ISSLASH (name[i]) && ISSLASH (name[i + 1])
+	  && ! (at_dot2 || (name[i] == '.'
+			    && (extdirlen == i || ISSLASH (name[i - 1])))))
+	{
+	  char name_i1 = name[i + 1];
+	  name[i + 1] = '\0';
+	  char linkbuf[1];
+	  ancestor_symlink = 0 <= readlinkat (chdir_fd, name, linkbuf, 1);
+	  name[i + 1] = name_i1;
+	}
+
+      if (at_dot2)
+	{
+	  name[i] = '\0';
+	  bool suspicious_dot_dot
+	    = (suspicious_target
+	       || (old_files_option == UNLINK_FIRST_OLD_FILES
+		   && recursive_unlink_option)
+	       || (old_files_option != KEEP_OLD_FILES
+		   && old_files_option != SKIP_OLD_FILES
+		   && ancestor_symlink)
+	       || suspicious_dot_dot_parent (name, extdirid));
+	  name[i] = '.';
+
+	  if (suspicious_dot_dot)
+	    {
+	      name[i - 1] = '_';
+	      suspicious_target = true;
+	    }
+	}
+    }
+
+  if (!suspicious_target)
+    return target;
+
+  char *sanitized_target = name + slen;
+  paxerror (0, _(is_symlink
+		 ? "%s: symlinking to %s instead of to %s"
+		 : "%s: symlinking to %s instead of hard linking to %s"),
+	    quotearg_colon (source),
+	    quote_n (1, sanitized_target), quote_n (2, target));
+  return sanitized_target;
+}
+
+/* Extract the link DS whose final extraction was delayed.
+   Use *BUF, of size *BUFSIZE, for temporary heap-allocated storage.  */
 static void
-apply_delayed_link (struct delayed_link *ds)
+apply_delayed_link (struct delayed_link *ds, char **buf, idx_t *bufsize)
 {
   struct string_list *sources = ds->sources;
   char const *valid_source = 0;
@@ -1900,19 +2095,29 @@ apply_delayed_link (struct delayed_link *ds)
 	{
 	  /* Unlink the placeholder, then create a hard link if possible,
 	     a symbolic link otherwise.  */
+
 	  if (unlinkat (chdir_fd, source, 0) < 0)
-	    unlink_error (source);
-	  else if (valid_source
-		   && (linkat (chdir_fd, valid_source, chdir_fd, source, 0)
-		       == 0))
-	    ;
-	  else if (!ds->is_symlink)
 	    {
-	      if (linkat (chdir_fd, ds->target, chdir_fd, source, 0) < 0)
-		link_error (ds->target, source);
+	      unlink_error (source);
+	      continue;
+	    }
+
+	  if (valid_source
+	      && 0 <= linkat (chdir_fd, valid_source, chdir_fd, source, 0))
+	    continue;
+
+	  char const *target
+	    = (absolute_links_option
+	       ? ds->target
+	       : sanitize_link (ds->target, source, ds->is_symlink,
+				buf, bufsize));
+	  if (!ds->is_symlink)
+	    {
+	      if (linkat (chdir_fd, target, chdir_fd, source, 0) < 0)
+		link_error (target, source);
 	    }
-	  else if (symlinkat (ds->target, chdir_fd, source) < 0)
-	    symlink_error (ds->target, source);
+	  else if (symlinkat (target, chdir_fd, source) < 0)
+	    symlink_error (target, source);
 	  else
 	    {
 	      struct tar_stat_info st1;
@@ -1954,8 +2159,11 @@ apply_delayed_link (struct delayed_link *ds)
 static void
 apply_delayed_links (void)
 {
+  char *buf = NULL;
+  idx_t bufsize = 0;
+
   for (struct delayed_link *ds = delayed_link_head; ds; ds = ds->next)
-    apply_delayed_link (ds);
+    apply_delayed_link (ds, &buf, &bufsize);
 
   if (false && delayed_link_table)
     {
@@ -1963,6 +2171,7 @@ apply_delayed_links (void)
 	 and freeing is more likely to cause than cure trouble.
 	 Also, the above code has not bothered to free the list
 	 in delayed_link_head.  */
+      free (buf);
       hash_free (delayed_link_table);
       delayed_link_table = NULL;
     }
diff --git a/src/list.c b/src/list.c
index 822c12dd..ce611efd 100644
--- a/src/list.c
+++ b/src/list.c
@@ -89,7 +89,9 @@ decode_xform (char const *file_name, int type)
       return file_name;
 
     case XFORM_LINK:
-      file_name = safer_name_suffix (file_name, true, absolute_names_option);
+      /* Do not worry about leading '/' or internal '..', as
+	 sanitize_link handles that later.  */
+      file_name = safer_name_suffix (file_name, true, true);
       break;
 
     case XFORM_REGFILE:
diff --git a/src/misc.c b/src/misc.c
index 94c4e603..20ceac67 100644
--- a/src/misc.c
+++ b/src/misc.c
@@ -21,6 +21,7 @@
 #include "common.h"
 #include <c-ctype.h>
 #include <quotearg.h>
+#include <same-inode.h>
 #include <xgetcwd.h>
 #include <unlinkdir.h>
 #include <utimens.h>
@@ -914,6 +915,11 @@ struct wd
      to be used.  */
   int fd;
 
+  /* If 1, the directory is a root;
+     if 0, it is not a root or the root test failed;
+     if -1, the root test has not been done yet.  */
+  signed char is_root;
+
   /* If ID.err is zero, the directory's identity;
      if positive, a failure indication with errno = ID.err;
      if negative, no attempt has been made yet to get the identity.  */
@@ -959,6 +965,7 @@ grow_wd (void)
       wd[wd_count].name = ".";
       wd[wd_count].abspath = NULL;
       wd[wd_count].fd = AT_FDCWD;
+      wd[wd_count].is_root = -1;
       wd[wd_count].id.err = -1;
       wd_count++;
     }
@@ -986,6 +993,7 @@ chdir_arg (char const *dir)
   wd[wd_count].name = dir;
   wd[wd_count].abspath = NULL;
   wd[wd_count].fd = 0;
+  wd[wd_count].is_root = -1;
   wd[wd_count].id.err = -1;
   return wd_count++;
 }
@@ -1076,6 +1084,24 @@ chdir_id (void)
     }
   return curr->id;
 }
+
+/* Return true if it is known that the current directory is a root,
+   i.e., that it is its own parent.  */
+bool
+chdir_is_root (void)
+{
+  struct chdir_id id = chdir_id ();
+  struct wd *curr = &wd[chdir_current];
+  if (curr->is_root < 0)
+    {
+      struct stat st;
+      curr->is_root = (!id.err && 0 <= fstatat (chdir_fd, "..", &st, 0)
+		       && SAME_INODE (st, curr->id));
+    }
+
+  assume (0 <= curr->is_root && curr->is_root <= 1);
+  return curr->is_root;
+}
 
 const char *
 tar_dirname (void)
diff --git a/src/tar.c b/src/tar.c
index c58a19fa..452418f2 100644
--- a/src/tar.c
+++ b/src/tar.c
@@ -36,6 +36,7 @@ enum subcommand subcommand_option;
 enum archive_format archive_format;
 idx_t blocking_factor;
 idx_t record_size;
+bool absolute_links_option;
 bool absolute_names_option;
 bool utc_option;
 bool full_time_option;
@@ -352,7 +353,8 @@ tar_set_quoting_style (char *arg)
 
 enum
 {
-  ACLS_OPTION = CHAR_MAX + 1,
+  ABSOLUTE_LINKS_OPTION = CHAR_MAX + 1,
+  ACLS_OPTION,
   ATIME_PRESERVE_OPTION,
   BACKUP_OPTION,
   CHECK_DEVICE_OPTION,
@@ -827,6 +829,8 @@ static struct argp_option options[] = {
    N_("stay in local file system when creating archive"), GRID_FILE },
   {"absolute-names", 'P', 0, 0,
    N_("don't strip leading '/'s from file names"), GRID_FILE },
+  {"absolute-links", ABSOLUTE_LINKS_OPTION, 0, 0,
+   N_("extract absolute links instead of omitting"), GRID_FILE },
   {"dereference", 'h', 0, 0,
    N_("follow symlinks; archive and dump the files they point to"),
    GRID_FILE },
@@ -1737,6 +1741,9 @@ parse_opt (int key, char *arg, struct argp_state *state)
     case 'P':
       optloc_save (OC_ABSOLUTE_NAMES, args->loc);
       absolute_names_option = true;
+      FALLTHROUGH;
+    case ABSOLUTE_LINKS_OPTION:
+      absolute_links_option = true;
       break;
 
     case 'r':
diff --git a/tests/Makefile.am b/tests/Makefile.am
index cd879361..51937955 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -140,6 +140,7 @@ TESTSUITE_AT = \
  extrac28.at\
  extrac29.at\
  extrac30.at\
+ extrac31.at\
  filerem01.at\
  filerem02.at\
  grow.at\
diff --git a/tests/extrac07.at b/tests/extrac07.at
index b5fd3d17..cefc82c4 100644
--- a/tests/extrac07.at
+++ b/tests/extrac07.at
@@ -31,18 +31,14 @@ AT_TAR_CHECK([
 AT_UNPRIVILEGED_PREREQ
 
 echo Prepare the directory
-mkdir dir
-genfile -f foo
-cd dir
-ln -s ../foo .
-cd ..
-chmod a-w dir
+mkdir -p dir/rodir
+genfile -f dir/rodir/foo
+ln -s rodir dir/rodirlink
+chmod a-w dir dir/rodir
 
 echo Create the archive
 tar cf archive dir || exit 1
 
-chmod +w dir
-
 echo Extract
 mkdir out
 tar -C out -xvf archive
@@ -52,7 +48,9 @@ tar -C out -xvf archive
 Create the archive
 Extract
 dir/
-dir/foo
+dir/rodir/
+dir/rodir/foo
+dir/rodirlink
 ],
 [],[],[],[ustar]) # Testing one format is enough
 
diff --git a/tests/extrac22.at b/tests/extrac22.at
index 623470e4..0d194dd8 100644
--- a/tests/extrac22.at
+++ b/tests/extrac22.at
@@ -18,10 +18,10 @@
 
 AT_SETUP([delay-directory-restore on reversed ordering])
 
-# The --delay-directory-resore option worked incorrectly on archives with
+# The --delay-directory-restore option worked incorrectly on archives with
 # reversed member ordering (which was documented, anyway). This is illustrated
 # in
-#   http://lists.gnu.org/archive/html/bug-tar/2019-03/msg00022.html
+#   https://lists.gnu.org/archive/html/bug-tar/2019-03/msg00022.html
 # which was taken as a base for this testcase.
 # The bug affected tar versions <= 1.32.
 
@@ -51,10 +51,10 @@ AT_DATA([filelist],
 tar -C t -c -f a.tar --no-recursion -T filelist
 
 mkdir restore
-tar -x -p --delay-directory-restore -C restore -f a.tar
+tar -x -p --delay-directory-restore --absolute-links -C restore -f a.tar
 # Previous versions of tar would fail here with the following diagnostics:
 # tar: ./dir2/data2: Cannot unlink: Permission denied
 ],
 [0],
 [])
-AT_CLEANUP
\ No newline at end of file
+AT_CLEANUP
diff --git a/tests/extrac31.at b/tests/extrac31.at
new file mode 100644
index 00000000..492aa44f
--- /dev/null
+++ b/tests/extrac31.at
@@ -0,0 +1,63 @@
+# Test suite for GNU tar.                             -*- Autotest -*-
+# Copyright 2025 Free Software Foundation, Inc.
+#
+# This file is part of GNU tar.
+#
+# GNU tar is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# GNU tar is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+AT_SETUP([extracting untrusted incremental])
+AT_KEYWORDS([extract extrac31 --absolute-links])
+
+
+AT_TAR_CHECK([
+
+# Extraction should not escape the extraction directory
+# even when extracting multiple times to the same directory.
+(umask 022 && mkdir -p dira/sub dirb/sym dirb/sub/sym ext victimdir)
+ln -s .. dira/sub/dotdot
+ln -s ../sub dira/sub/dot
+ln -s dotdot/sub dira/sub/anotherdot
+ln -s ../victimdir dira/sym
+ln -s dotdot/../victimdir dira/sub/sym
+echo b1 >dirb/sym/file1
+echo b2 >dirb/sub/sym/file2
+echo v >victimdir/expected
+echo v >victimdir/file1
+echo v >victimdir/file2
+tar -cf a.tar -C dira sub sym
+tar -cf b.tar -C dirb sym/file1 sub/sym/file2
+tar -xf a.tar -C ext
+echo >&2 astatus=$?
+echo >&2 =====
+tar -xf b.tar -C ext
+echo >&2 bstatus=$?
+diff victimdir/expected victimdir/file1
+diff victimdir/expected victimdir/file2
+],
+[0],
+[],
+[tar: sub/sym: symlinking to 'dotdot/_./victimdir' instead of to 'dotdot/../victimdir'
+tar: sym: symlinking to '_./victimdir' instead of to '../victimdir'
+tar: Exiting with failure status due to previous errors
+astatus=2
+=====
+tar: sym: Cannot mkdir: File exists
+tar: sym/file1: Cannot open: No such file or directory
+tar: sub/sym: Cannot mkdir: File exists
+tar: sub/sym/file2: Cannot open: No such file or directory
+tar: Exiting with failure status due to previous errors
+bstatus=2
+],
+[])
+AT_CLEANUP
diff --git a/tests/testsuite.at b/tests/testsuite.at
index e7e54f1e..c4912242 100644
--- a/tests/testsuite.at
+++ b/tests/testsuite.at
@@ -357,6 +357,7 @@ m4_include([extrac27.at])
 m4_include([extrac28.at])
 m4_include([extrac29.at])
 m4_include([extrac30.at])
+m4_include([extrac31.at])
 
 m4_include([backup01.at])
 
-- 
2.34.1
