The following issue has been SUBMITTED.
======================================================================
https://www.austingroupbugs.net/view.php?id=1645
======================================================================
Reported By:                eblake
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1645
Category:                   System Interfaces
Type:                       Clarification Requested
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Eric Blake
Organization:               Red Hat
User Reference:             ebb.execvp
Section:                    XSH exec
Page Number:                784
Line Number:                26548
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2023-03-22 19:47 UTC
Last Modified:              2023-03-22 19:47 UTC
======================================================================
Summary:                    execvp( ) requirements on arg0 are too strict
Description:
The standard is clear that execlp() and execvp() cannot fail with ENOEXEC (except in the extremely unlikely event that attempting to overlay the process with sh also fails with that error), but must instead attempt to re-execute sh with a command line set so that sh will execute the desired filename as a shell script. Furthermore, the standard is explicit that the original:<blockquote>execlp(<i>file</i>, <i>arg0</i>, <i>arg1</i>, ..., NULL)</blockquote> is retried as:<blockquote>execl(<i>shell path</i>, <i>arg0</i>, <i>file</i>, <i>arg1</i>, ..., NULL)</blockquote>
that is, whatever name was passed in argv[0] in the original attempt should continue to be the argv[0] seen by the sh process that will be parsing file. But in practice, this does not actually happen on a number of systems. Here is an email describing bugs found in three separate projects (busybox, musl libc, and glibc) while investigating why attempting to rely on what the standard says about execvp() fallback behavior fails on Alpine Linux:
https://listman.redhat.com/archives/libguestfs/2023-March/031135.html

In particular:

1. busybox installs /bin/sh as a multi-name binary, whose behavior DEPENDS on argv[0] ending in a basename of sh. If execvp() actually calls execl("/bin/sh", arg0, file, ...), the binary installed at /bin/sh will NOT see 'sh' as its basename but instead whatever is in arg0, and fails to behave as sh. (Bug filed at https://bugs.busybox.net/show_bug.cgi?id=15481 asking the busybox team to consider installing a minimal shim for /bin/sh that is NOT dependent on argv[0].)

2. musl currently refuses to do ENOEXEC handling (a knowing violation of POSIX, but the alternative requires coordinating the allocation of memory to provide the space for the larger argv entailed by injecting /bin/sh into the argument list); see https://www.openwall.com/lists/musl/2020/02/12/9 which acknowledges the issue, where Adélie Linux has patched musl for POSIX compliance but upstream musl does not like the patch. A follow-up mail (https://www.openwall.com/lists/musl/2020/02/13/3) surveyed the behavior of various other libc implementations; many use a VLA to handle this, but musl argues that VLAs are themselves prone to bugs.
Arguably, musl's claim that execvp() must be safe to use after vfork(), and therefore cannot use malloc(), is a bit of a stretch (the standard explicitly documents that execlp() and execvp() need not be async-signal-safe; and even though we've deprecated vfork(), the arguments about what is safe after vfork() roughly correspond to the same arguments about what async-signal-safe functions can be used between regular fork() and exec*()).

3. glibc does ENOEXEC handling, but passes "/bin/sh" rather than arg0 as the process name of the subsequent shell invocation, losing any ability to expose the original arg0 to the script. https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/execvpe.c;h=871bb4c4#l51 shows that the fallback executed is equivalent to execl("/bin/sh", "/bin/sh", file, arg1, ...).

Admittedly, Linux in general, and particularly Alpine Linux, will intentionally diverge from POSIX any time they feel it practical; but we should still consider whether the standard is too strict in requiring argv[0] to pass through unchanged to the script when the fallback kicks in. And I think the real intent is less about what sh's argv[0] is, and more about what the script's $0 is. Even historically, FreeBSD used to pass in "sh" rather than preserving arg0, up until 2020 (https://cgit.freebsd.org/src/commit/?id=301cb491ea). And _requiring_ arg0 to be used unchanged falls apart when a user invokes execlp("binary", NULL, NULL) (such behavior is non-conforming, since line 26559 states "The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.", but a fallback to execl("/bin/sh", NULL, "binary", NULL) obviously won't do what is intended, so the library has to stick something there).

Why don't we see complaints about this more frequently? Well, for starters, MOST people install shell scripts (or even scripts designed for other interpreters) with a #! shebang line.
The standard is explicit that the #! line is outside the scope of the standard (because different systems behave differently on how that first line is parsed to determine which interpreter to invoke), but at least on Linux, a script with a #! line NEVER fails with ENOEXEC - that aspect is handled by the kernel. The only time you ever get to a glibc or musl fallback that even has to worry about ENOEXEC is when the script has no leading #! line, which tends not to be common practice (even though the standard seems to imply otherwise).

Additionally, most shells don't directly call execvp() - they instead do their _own_ PATH lookup, and then use execl() or similar - if that fails with ENOEXEC, the shell itself can then immediately parse the file contents with the desired $0 already in place, without having to rely on execvp() to try to spawn yet another instance of sh for the purpose.

In playing with this, I note that the as-if rule might permit:<blockquote> execl("/bin/sh", "sh", "-c", ". <i>quoted_filename</i>", <i>arg0</i>, <i>arg1</i>, ..., NULL)</blockquote> where quoted_filename is created by quoting the original file in such a way that the shell sees the original name after processing quoting rules (so as not to open a security hole when file contains shell metacharacters). This has roughly the same effect as execl("/bin/sh", arg0, file, arg1, ..., NULL), in that it kicks off a shell invocation that executes commands from the given file while $0 is set to the original name. It additionally has the benefit that it will work on a system with busybox as /bin/sh (because busybox still sees "sh" as argv[0], but also has enough knowledge of what to store into $0 for the duration of sourcing the file). So I went ahead and included a mention of that in non-normative RATIONALE - but we may decide to drop that. Why?
Because we took pains in bug:953 to clarify that the dot utility might parse a file as either a <i>program</i> or a <i>compound_list</i>, while the 'sh file arg1' form requires parsing as a program, so it might create an observable difference if this alternative fallback ends up parsing as a <i>compound_list</i> (or we might also decide to tweak the proposed normative text to allow for this difference in parsing). What's more, if musl is already complaining about injecting "/bin/sh" into argv as being hard to do safely given memory constraints after <i>vfork</i>( ), it will be even harder to argue in favor of creating the string ". <i>quoted_filename</i>", which requires even more memory.

In parallel with this, I'm planning to open a bug report against glibc to see if they will consider making the same change as FreeBSD did in 2020 of preserving arg0 to the eventual script. But they may reply that it risks breaking existing clients that have come to depend on the fallback passing $0 as a variant of "sh" rather than the original arg0. Therefore, my proposal here is to relax the requirements of the standard to allow more existing implementations to be rendered compliant as-is, even though it gives up the nice $0 guarantees.

I also wonder if the standard should consider adding support for 'exec -a arg0 cmd arg1...', which is another common implementation extension in many sh versions for setting argv[0] of the subsequent cmd. That belongs in a separate bug report, if at all. But by the as-if rule, an implementation with that extension might use execl("/bin/sh", "sh", "-c", "exec -a \"$0\" <i>quoted_file</i> \"$@\"", arg0, arg1, ..., NULL) as a way to execute the correct file with the desired $0 even if it can't use the proposed <i>dot</i> trick due to difference in parse scope.

Desired Action:
Line numbers are from Issue 7 + TC2 (POSIX 2017), although the same text appears in draft 3 of Issue 8.
At page 784 lines 26552-26557 (XSH exec DESCRIPTION), change:<blockquote>...the executed command shall be as if the process invoked the <i>sh</i> utility using <i>execl</i>( ) as follows: <tt>execl(<shell path>, arg0, file, arg1, ..., (char *)0);</tt> where < <i>shell path</i> > is an unspecified pathname for the <i>sh</i> utility, <i>file</i> is the process image file, and for <i>execvp</i>( ), where <i>arg0</i>, <i>arg1</i>, and so on correspond to the values passed to <i>execvp</i>( ) in <i>argv</i>[0], <i>argv</i>[1], and so on.</blockquote>to:<blockquote>...the executed command shall be as if the process invoked the <i>sh</i> utility using <i>execl</i>( ) as follows: <tt>execl(<shell path>, <name>, file, arg1, ..., (char *)0);</tt> where < <i>shell path</i> > is an unspecified pathname for the <i>sh</i> utility, < <i>name</i> > is an unspecified process name, <i>file</i> is the process image file, and for <i>execvp</i>( ), where <i>arg1</i>, <i>arg2</i>, and so on correspond to the values passed to <i>execvp</i>( ) in <i>argv</i>[1], <i>argv</i>[2], and so on.</blockquote>

After page 794 line 26981 (XSH exec RATIONALE), add a new paragraph:<blockquote>When <i>execlp</i>( ) or <i>execvp</i>( ) falls back to invoking <i>sh</i> because of an ENOEXEC condition, the standard leaves the process name (what becomes argv[0] in the resulting sh process) unspecified. Existing implementations vary on whether they pass a variation of "sh", or preserve the original <i>arg0</i>. There are existing implementations of <i>sh</i> that behave differently depending on the contents of argv[0], such that blindly passing the original <i>arg0</i> on to the fallback execution can fail to invoke a compliant shell environment. An implementation may instead utilize <tt>execl(<shell path>, "sh", "-c", ". <quoted_file>", arg0, arg1, ..., NULL)</tt>, where <i>quoted_file</i> is created by escaping any characters special to the shell, as a way to expose the original $0 to the shell commands contained within <i>file</i> without breaking <i>sh</i> implementations that are sensitive to the contents of argv[0].</blockquote>
======================================================================

Issue History
Date Modified    Username       Field                    Change
======================================================================
2023-03-22 19:47 eblake         New Issue
2023-03-22 19:47 eblake         Name                      => Eric Blake
2023-03-22 19:47 eblake         Organization              => Red Hat
2023-03-22 19:47 eblake         User Reference            => ebb.execvp
2023-03-22 19:47 eblake         Section                   => XSH exec
2023-03-22 19:47 eblake         Page Number               => 784
2023-03-22 19:47 eblake         Line Number               => 26548
2023-03-22 19:47 eblake         Interp Status             => ---
======================================================================
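
The argument vectors compared in the Description can be sketched in C roughly as follows. This is only a minimal sketch under stated assumptions, not code taken from glibc, musl, FreeBSD, or busybox: the helper names sh_quote(), fallback_argv(), and dot_fallback_argv() are hypothetical, the sketch uses malloc() (which is exactly what the musl discussion objects to after vfork()), and it assumes file is an already-resolved pathname containing a slash, so that the dot utility does not perform a PATH search of its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of file wrapped in single
 * quotes, so that sh sees the original name after quote removal.  Each
 * embedded ' is emitted as '\'' .  Returns NULL on allocation failure. */
char *sh_quote(const char *file)
{
    assert(file != NULL);
    size_t len = 2;                          /* opening and closing quote */
    for (const char *p = file; *p; p++)
        len += (*p == '\'') ? 4 : 1;
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    char *q = out;
    *q++ = '\'';
    for (const char *p = file; *p; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);           /* close, escaped quote, reopen */
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}

/* Number of entries in a NULL-terminated vector. */
static size_t vec_len(char *const argv[])
{
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Build { name, file, arg1, ..., NULL }.  The caller chooses name: the
 * original arg0, "sh", or "/bin/sh".  Only the vector itself is allocated;
 * its entries are borrowed from the arguments. */
char **fallback_argv(const char *name, const char *file, char *const argv[])
{
    size_t n = vec_len(argv);
    size_t extra = (n > 0) ? n - 1 : 0;      /* arg1 onward; argv[0] may be NULL */
    char **v = malloc((extra + 3) * sizeof *v);
    if (v == NULL)
        return NULL;
    v[0] = (char *)name;
    v[1] = (char *)file;
    for (size_t i = 0; i < extra; i++)
        v[2 + i] = argv[1 + i];
    v[2 + extra] = NULL;
    return v;
}

/* Build { "sh", "-c", ". <quoted_file>", arg0, arg1, ..., NULL }.
 * The vector and v[2] are allocated; the caller frees both. */
char **dot_fallback_argv(const char *file, char *const argv[])
{
    char *quoted = sh_quote(file);
    if (quoted == NULL)
        return NULL;
    size_t n = vec_len(argv);
    char *cmd = malloc(strlen(quoted) + 3);  /* ". " + quoted + NUL */
    char **v = malloc((n + 4) * sizeof *v);
    if (cmd == NULL || v == NULL) {
        free(quoted);
        free(cmd);
        free(v);
        return NULL;
    }
    strcpy(cmd, ". ");
    strcat(cmd, quoted);
    free(quoted);
    v[0] = (char *)"sh";
    v[1] = (char *)"-c";
    v[2] = cmd;
    for (size_t i = 0; i < n; i++)
        v[3 + i] = argv[i];
    v[3 + n] = NULL;
    return v;
}
```

In this sketch, passing argv[0] as name to fallback_argv() gives the form the standard currently requires, passing "/bin/sh" gives the form described above for glibc, and passing "sh" gives the form FreeBSD used before 2020. The result of either builder would then be handed to execv() together with the shell path.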
