On 2026-01-29 11:43, Pavel Cahyna wrote:
The recent openat2 changes broke --one-top-level with an absolute path as an argument (AFAICS, there is no test for that case).
Also, using an absolute path would appear to contradict the manual, which says that the directory must be "beneath the extraction directory (or the one passed to '-C')".
How about if we start by disallowing absolute directories as the argument of --one-top-level? That would match the documentation, and would simplify whatever other fixes we do. It is not at all clear how an absolute directory would fit in with multiple -C options, for example, or whether that would even be useful. And it is also not clear how an absolute directory should interact with -P.
When using, for example, --one-top-level=subdir, a tar archive containing target/sensitive can overwrite ./sensitive if there is a preexisting symlink subdir/target -> ../target. I.e., the archive extraction can escape subdir. I think this is unexpected.
To fix this, along with some other problems in the neighborhood that you didn't go into, I suggest we change how we treat --one-top-level, as follows:
1. We append its DIR argument (or the inferred DIR, if there is no option-argument) to any directory specified via '-C'. Without -C we use DIR as-is. I.e., with --one-top-level=DIR -C FOO, we open the directory FOO/DIR and use that for all extraction; we do not use FOO for extraction. And without -C, we open DIR and use that instead of AT_FDCWD.
2. When we open FOO/DIR (or DIR), we do so via openat2 so that we know that DIR does not escape FOO (or escape "." if there is no FOO).
3. If FOO/DIR (or DIR) cannot be opened, we mkdir it, by opening its parent directory with openat2 (so that the parent directory does not escape FOO) and then using mkdirat relative to that parent. I.e., just one level: we should not use the equivalent of mkdir -p.
4. We do (2) and (3) lazily, i.e., only when we need FOO/DIR (or DIR) open to extract a file underneath it.
5. Once we've done either (2) or (3) for FOO/DIR, we close the file descriptor for FOO because we don't need it any more.
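Here is a rough, Linux-only sketch of steps (1) through (5), just to make the idea concrete. It is not a patch. Every name in it (open_beneath, open_one_top_level, struct extract_root, extract_root_fd) is hypothetical, openat2 is invoked through syscall() on the assumption that libc may not provide a wrapper, and I am assuming that "cannot be opened" in (3) means ENOENT and that DIR has no trailing slash.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>   /* struct open_how, RESOLVE_BENEATH */
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open directory NAME relative to BASE_FD, failing with EXDEV if the
   resolution would leave BASE_FD (absolute name, "..", or symlink).  */
static int
open_beneath (int base_fd, const char *name)
{
  struct open_how how;
  memset (&how, 0, sizeof how);
  how.flags = O_RDONLY | O_DIRECTORY | O_CLOEXEC;
  how.resolve = RESOLVE_BENEATH;
  return (int) syscall (SYS_openat2, base_fd, name, &how, sizeof how);
}

/* Steps (2) and (3): open DIR beneath BASE_FD.  If DIR does not exist,
   create only its last component (no "mkdir -p") and open that.  */
static int
open_one_top_level (int base_fd, const char *dir)
{
  assert (dir != NULL);
  int fd = open_beneath (base_fd, dir);
  if (fd >= 0 || errno != ENOENT)
    return fd;

  char buf[4096];            /* arbitrary limit, for the sketch only */
  size_t len = strlen (dir);
  if (len >= sizeof buf)
    {
      errno = ENAMETOOLONG;
      return -1;
    }
  memcpy (buf, dir, len + 1);

  /* Split into parent and last component.  */
  char *slash = strrchr (buf, '/');
  const char *leaf = slash ? slash + 1 : buf;
  int parent_fd = base_fd;
  if (slash)
    {
      *slash = '\0';
      parent_fd = open_beneath (base_fd, buf);
      if (parent_fd < 0)
        return -1;           /* parent missing or escaping: give up */
    }

  fd = -1;
  if (mkdirat (parent_fd, leaf, 0777) == 0)
    fd = open_beneath (parent_fd, leaf);
  int saved_errno = errno;
  if (parent_fd != base_fd)
    close (parent_fd);
  errno = saved_errno;
  return fd;
}

/* Steps (1), (4), (5): extraction state for one -C FOO.  */
struct extract_root
{
  int c_fd;          /* fd for FOO, or AT_FDCWD if there is no -C */
  const char *dir;   /* DIR from --one-top-level */
  int dir_fd;        /* fd for FOO/DIR; -1 until first needed */
};

/* Return the fd to extract under, opening FOO/DIR lazily on first use
   and dropping the fd for FOO once FOO/DIR is open.  */
static int
extract_root_fd (struct extract_root *r)
{
  if (r->dir_fd < 0)
    {
      r->dir_fd = open_one_top_level (r->c_fd, r->dir);
      if (r->dir_fd >= 0 && r->c_fd != AT_FDCWD)
        {
          close (r->c_fd);
          r->c_fd = -1;
        }
    }
  return r->dir_fd;
}
```

With this shape, member extraction would call extract_root_fd and pass the result to the *at functions in place of AT_FDCWD. A DIR that is a symlink pointing outside FOO fails with EXDEV instead of being followed.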
One possible, stricter alternative is to insist on (3) rather than (2). I.e., we do not allow --one-top-level to extract over an existing subdirectory. This would be more in the spirit of --one-top-level as I understand it.
