[1003.1(2016/18)/Issue7+TC2 0001645]: execvp( ) requirements on arg0 are too strict

Austin Group Bug Tracker via austin-group-l at The Open Group Wed, 22 Mar 2023 12:49:32 -0700


The following issue has been SUBMITTED. 
====================================================================== 
https://www.austingroupbugs.net/view.php?id=1645 
====================================================================== 
Reported By:                eblake
Assigned To:                
====================================================================== 
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1645
Category:                   System Interfaces
Type:                       Clarification Requested
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Eric Blake 
Organization:               Red Hat 
User Reference:             ebb.execvp 
Section:                    XSH exec 
Page Number:                784 
Line Number:                26548 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2023-03-22 19:47 UTC
Last Modified:              2023-03-22 19:47 UTC
====================================================================== 
Summary:                    execvp( ) requirements on arg0 are too strict
Description: 
The standard is clear that execlp() and execvp() cannot fail with ENOEXEC
(except in the extremely unlikely event that attempting to overlay the
process with sh also fails with that error), but must instead attempt to
re-execute sh with a command line set so that sh will execute the desired
filename as a shell script.  Furthermore, the standard is explicit that the
original:<blockquote>execlp(<i>file</i>, <i>arg0</i>, <i>arg1</i>, ...,
NULL)</blockquote>
is retried as:<blockquote>execl(<i>shell path</i>, <i>arg0</i>,
<i>file</i>, <i>arg1</i>, ..., NULL)</blockquote>


that is, whatever name was passed in argv[0] in the original attempt should
continue to be the argv[0] seen by the sh process that will be parsing
file.
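
As a minimal sketch of the argument rearrangement described above (not
taken from any libc; the function name build_fallback_argv and the use of
malloc() are assumptions made purely for illustration), the retried
argument vector could be assembled like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: given the file and the argv originally passed to
 * execvp(), build { argv[0], file, argv[1], ..., NULL } as the current
 * standard text requires for the sh fallback.
 * Returns a malloc'd array (caller frees it), or NULL if allocation
 * fails or argv[0] is missing.
 */
char **build_fallback_argv(const char *file, char *const argv[])
{
    assert(file != NULL && argv != NULL);

    size_t argc = 0;
    while (argv[argc] != NULL)
        argc++;
    if (argc == 0)
        return NULL;    /* no arg0 to preserve */

    /* arg0 + file + remaining args + terminating NULL */
    char **out = malloc((argc + 2) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = argv[0];           /* sh sees the caller's arg0 */
    out[1] = (char *)file;      /* script for sh to parse */
    for (size_t i = 1; i < argc; i++)
        out[i + 1] = argv[i];
    out[argc + 1] = NULL;
    return out;
}
```

A caller would presumably try execv(file, argv) first and, only if that
fails with errno set to ENOEXEC, pass the result to execv() together with
the shell path.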

But in practice, this does not actually happen on a number of systems. 
Here is an email describing bugs found in three separate projects (busybox,
musl libc, and glibc) while investigating why attempting to rely on what
the standard says about execvp() fallback behavior fails on Alpine Linux:
https://listman.redhat.com/archives/libguestfs/2023-March/031135.html

In particular:
1. busybox installs /bin/sh as a multi-name binary, whose behavior DEPENDS
on argv[0] ending in a basename of sh.  If execvp() actually calls
execl("/bin/sh", arg0, file, ...), the binary installed at /bin/sh will NOT
see 'sh' as its basename but instead whatever is in arg0, and fails to
behave as sh. (Bug filed at https://bugs.busybox.net/show_bug.cgi?id=15481
asking the busybox team to consider installing a minimal shim for /bin/sh
that is NOT dependent on argv[0])
2. musl currently refuses to do ENOEXEC handling (a knowing violation of
POSIX, but the alternative requires coordinating the allocation of memory
to provide the space for the larger argv entailed by injecting /bin/sh into
the argument list); see https://www.openwall.com/lists/musl/2020/02/12/9
which acknowledges the issue, where Adélie Linux has patched musl for
POSIX compliance but upstream musl does not like the patch. A followup
mail surveyed the behavior of various other libcs; many use a VLA to handle
things, but musl argues that VLAs are themselves prone to bugs:
https://www.openwall.com/lists/musl/2020/02/13/3.  Arguably, musl's claim
that execvp() must be safe to use after vfork(), and therefore cannot use
malloc() is a bit of a stretch (the standard explicitly documents that
execlp() and execvp() need not be async-signal-safe; and even though we've
deprecated vfork(), the arguments about what is safe after vfork() roughly
correspond to the same arguments about what async-signal-safe functions can
be used between regular fork() and exec*()).
3. glibc does ENOEXEC handling, but passes "/bin/sh" rather than arg0 as
the process name of the subsequent shell invocation, losing any ability to
expose the original arg0 to the script. 
https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/execvpe.c;h=871bb4c4#l51
shows that the fallback is equivalent to execl("/bin/sh",
"/bin/sh", file, arg1, ...).

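To illustrate why item 1 breaks, here is a simplified sketch of how a
multi-name binary might select its personality from argv[0].  This is not
busybox's actual code; the function names applet_name and would_run_as_sh
are hypothetical, and a real implementation may apply further rules.

```c
#include <assert.h>
#include <string.h>

/* Return the final path component of argv0 (no trailing-slash handling). */
const char *applet_name(const char *argv0)
{
    assert(argv0 != NULL);
    const char *slash = strrchr(argv0, '/');
    return slash != NULL ? slash + 1 : argv0;
}

/* Nonzero if a binary keyed on the basename of argv[0] would act as sh. */
int would_run_as_sh(const char *argv0)
{
    return strcmp(applet_name(argv0), "sh") == 0;
}
```

Under this model, a fallback that passes "sh" or "/bin/sh" as argv[0]
selects the shell, while one that passes the caller's original arg0 (for
example "myscript") does not.
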
Admittedly, Linux in general, and particularly Alpine Linux, will
intentionally diverge from POSIX any time they feel it practical; but we
should still consider whether the standard is too strict in requiring
argv[0] to pass through unchanged to the script when the fallback kicks in.
And I think the real intent is less about what sh's argv[0] is, and more
about what the script's $0 is.

Even historically, FreeBSD used to pass in "sh" rather than preserving
arg0, up until 2020: https://cgit.freebsd.org/src/commit/?id=301cb491ea. 
And _requiring_ arg0 to be used unchanged falls apart when a user invokes
execlp("binary", NULL, NULL) (such behavior is non-conforming, since line
26559 states "The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec functions.",
but a fallback to execl("/bin/sh", NULL, "binary", NULL) obviously won't do
what is intended, so the library has to stick something there).
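
A rough sketch of a fallback that does not depend on arg0 at all, and so
tolerates a NULL arg0, might look like the following.  The function name
build_named_fallback_argv is hypothetical; the name argument stands for
whatever the library chooses to supply (for instance "sh", as FreeBSD did
before 2020, or "/bin/sh", as glibc does).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build { name, file, argv[1], ..., NULL }, ignoring
 * the caller's argv[0] entirely, so an empty argv is still usable.
 * Returns a malloc'd array (caller frees it), or NULL on allocation failure.
 */
char **build_named_fallback_argv(const char *name, const char *file,
                                 char *const argv[])
{
    assert(name != NULL && file != NULL && argv != NULL);

    /* Count only the arguments that follow argv[0], if there is one. */
    size_t extra = 0;
    if (argv[0] != NULL)
        while (argv[1 + extra] != NULL)
            extra++;

    /* name + file + extra args + terminating NULL */
    char **out = malloc((extra + 3) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = (char *)name;
    out[1] = (char *)file;
    for (size_t i = 0; i < extra; i++)
        out[2 + i] = argv[1 + i];
    out[2 + extra] = NULL;
    return out;
}
```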

Why don't we see complaints about this more frequently?  Well, for
starters, MOST people install shell scripts (or even scripts designed for
other interpreters) with a #! shebang line.  The standard is explicit that
this is outside the realm of the standards (because different systems
behave differently on how that first line is parsed to determine which
interpreter to invoke), but at least on Linux, a script with a #! line
NEVER fails with ENOEXEC - that aspect is handled by the kernel.  The only
time you ever get to a glibc or musl fallback that even has to worry about
ENOEXEC is when the script has no leading #! line, which tends to not be
common practice (even though the standard seems to imply otherwise). 
Additionally, most shells don't directly call execvp() - they instead do
their _own_ PATH lookup, and then use execl() or similar - if that fails
with ENOEXEC, the shell itself can then immediately parse the file contents
with the desired $0 already in place, without having to rely on execvp() to
try to spawn yet another instance of sh for the purpose.

In playing with this, I note that the as-if rule might permit:<blockquote>
execl("/bin/sh", "sh", "-c", ". <i>quoted_filename</i>", <i>arg0</i>,
<i>arg1</i>, ..., NULL)</blockquote>
where quoted_filename is created by quoting the original file in such a way
that the shell sees the original name after processing quoting rules (so as
not to open a security hole when file contains shell metacharacters).  This
has roughly the same effect as execl("/bin/sh", arg0, file, arg1, ..., NULL),
in that it kicks off a shell invocation that executes commands from the
given file while $0 is set to the original name.  It additionally has the
benefit that it will work on a system with busybox as /bin/sh (because
busybox still sees "sh" as argv[0], but also has enough knowledge of what
to store into $0 for the duration of sourcing the file).  So I went ahead
and included a mention of that in non-normative RATIONALE - but we may
decide to drop that.  Why?  Because we took pains in bug:953 to clarify
that the dot utility might parse a file as either a <i>program</i> or a
<i>compound_list</i>, while the 'sh file arg1' form requires parsing as a
program, so it might create an observable difference if this alternative
fallback ends up parsing as a <i>compound_list</i>  (or we might also
decide to tweak the proposed normative text to allow for this difference in
parsing).  What's more, if musl is already complaining about injecting
"/bin/sh" into argv as being hard to do safely given memory constraints
after <i>vfork</i>( ), it will be even harder to argue in favor of creating
the string ". <i>quoted_filename</i>", which requires even more memory.
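
One possible way to build that string is sketched below, using
single-quote escaping.  The function name make_dot_command is
hypothetical, and the sketch assumes that file contains a slash, since
the dot utility searches PATH for names without one.  It also relies on
malloc(), which is exactly the kind of allocation musl objects to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd string of the form ". 'file'",
 * with each embedded single quote rewritten as '\'' so that the shell
 * recovers the original name. Returns NULL on allocation failure.
 */
char *make_dot_command(const char *file)
{
    assert(file != NULL);

    /* Each ' expands to four characters; everything else is copied. */
    size_t len = 0;
    for (const char *p = file; *p != '\0'; p++)
        len += (*p == '\'') ? 4 : 1;

    /* ". " + opening quote + body + closing quote + NUL */
    char *cmd = malloc(2 + 1 + len + 1 + 1);
    if (cmd == NULL)
        return NULL;

    char *out = cmd;
    *out++ = '.';
    *out++ = ' ';
    *out++ = '\'';
    for (const char *p = file; *p != '\0'; p++) {
        if (*p == '\'') {
            memcpy(out, "'\\''", 4);    /* close, escaped quote, reopen */
            out += 4;
        } else {
            *out++ = *p;
        }
    }
    *out++ = '\'';
    *out = '\0';
    return cmd;
}
```

The result would be passed as the argument following "-c", with arg0,
arg1, and so on after it.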

In parallel with this, I'm planning to open a bug report against glibc to
see if they will consider making the same change as FreeBSD did in 2020 of
preserving arg0 to the eventual script.  But they may reply that it risks
breaking existing clients that have come to depend on the fallback passing
$0 as a variant of "sh" rather than the original arg0.  Therefore, my
proposal here is to relax the requirements of the standard to allow more
existing implementations to be rendered compliant as-is, even though it
gives up the nice $0 guarantees.

I also wonder if the standard should consider adding support for 'exec -a
arg0 cmd arg1...', which is another common implementation extension in many
sh versions for setting argv[0] of the subsequent cmd.  That belongs in a
separate bug report, if at all.  But by the as-if rule, an implementation
with that extension might use execl("/bin/sh", "sh", "-c", "exec -a \"$0\"
<i>quoted_file</i> \"$@\"", arg0, arg1, ..., NULL) as a way to execute the
correct file with the desired $0 even if it can't use the proposed
<i>dot</i> trick due to difference in parse scope.

Desired Action: 
Line numbers are from Issue 7 + TC2 (POSIX 2017), although the same text
appears in draft 3 of Issue 8.

At page 784 lines 26552-26557 (XSH exec DESCRIPTION),
change:<blockquote>...the executed command shall be as if the process
invoked the <i>sh</i> utility using <i>execl</i>( ) as follows:
<tt>execl(<shell path>, arg0, file, arg1, ..., (char *)0);</tt>
where < <i>shell path</i> > is an unspecified pathname for the <i>sh</i>
utility, <i>file</i> is the process image file, and for <i>execvp</i>( ),
where <i>arg0</i>, <i>arg1</i>, and so on correspond to the values passed
to <i>execvp</i>( ) in <i>argv</i>[0], <i>argv</i>[1], and so
on.</blockquote>to:<blockquote>...the executed command shall be as if the
process invoked the <i>sh</i> utility using <i>execl</i>( ) as follows:
<tt>execl(<shell path>, <name>, file, arg1, ..., (char *)0);</tt>
where < <i>shell path</i> > is an unspecified pathname for the <i>sh</i>
utility, < <i>name</i> > is an unspecified process name, <i>file</i> is the
process image file, and for <i>execvp</i>( ), where <i>arg1</i>,
<i>arg2</i>, and so on correspond to the values passed to <i>execvp</i>( )
in <i>argv</i>[1], <i>argv</i>[2], and so on.</blockquote>

After page 794 line 26981 (XSH exec RATIONALE), add a new
paragraph:<blockquote>When <i>execlp</i>( ) or <i>execvp</i>( ) falls back
to invoking <i>sh</i> because of an ENOEXEC condition, the standard leaves
the process name (what becomes argv[0] in the resulting sh process)
unspecified.  Existing implementations vary on whether they pass a
variation of "sh", or preserve the original <i>arg0</i>.  There are
existing implementations of <i>sh</i> that behave differently depending on
the contents of argv[0], such that blindly passing the original <i>arg0</i>
on to the fallback execution can fail to invoke a compliant shell
environment.  An implementation may instead utilize <tt>execl(<shell path>,
"sh", "-c", ". <quoted_file>", arg0, arg1, ..., NULL)</tt>, where
<i>quoted_file</i> is created by escaping any characters special to the
shell, as a way to expose the original $0 to the shell commands contained
within <i>file</i> without breaking <i>sh</i> sensitive to the contents of
argv[0].</blockquote>
====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2023-03-22 19:47 eblake         New Issue                                    
2023-03-22 19:47 eblake         Name                      => Eric Blake      
2023-03-22 19:47 eblake         Organization              => Red Hat         
2023-03-22 19:47 eblake         User Reference            => ebb.execvp      
2023-03-22 19:47 eblake         Section                   => XSH exec        
2023-03-22 19:47 eblake         Page Number               => 784             
2023-03-22 19:47 eblake         Line Number               => 26548           
2023-03-22 19:47 eblake         Interp Status             => ---             
======================================================================
