Re: env: add -S option (split string for shebang lines in scripts)

Eric Blake Fri, 27 Apr 2018 12:43:29 -0700

On 04/26/2018 02:17 AM, Assaf Gordon wrote:
> Hello,
> 
> Attached an updated patch, hopefully addressing all the issues below.
> Initial documentation also included (separated into several commits to ease 
> review).


Question - do we want to use 'S::' instead of 'S:' in the optstring?
Right now, your patches made -S take a mandatory argument, making:

#!/usr/bin/env -S

try to parse the script name as a string to be split (and unless the
script name has unusual characters, this results in an infloop of
treating the script name as the interpreter, for another round of trying
to exec the same command line).  Although this is probably not what was
intended, and the shebang line is already suspicious for not providing
more text after -S, making the argument optional and then issuing an
error message if optarg is NULL would at least make this not an easy
infloop.

I'm also trying to think what happens if we want to support platforms
where the OS splits strings passed to shebang.  (The BSD implementation
didn't have to worry quite as much about their code being run on a
different OS, like we do).  Consider:

#!/usr/bin/env -S interpreter 'arg with space'

where env already sees "-S", "interpreter", "'arg", "with", "space'",
"script", "args..." as separate arguments.  If we use 'S:' to
getopt_long, then "interpreter" will be subject to -S handling but
nothing else will; if we use 'S::', then none of the subsequent
arguments will be subject to -S handling (but then we have to revisit
whether a NULL optarg would be treated as an error on a shebang line
that ends in -S).  But either way, it would be nice if we could
reconstruct "arg with space" as a single argument to hand to
"interpreter", rather than three separate arguments where two of them
include a lone "'".

I'm wondering if we need yet another magic environment variable for
portably marking the demarcation between the arguments to -S and the
script name, whether the script is run on a platform that hands -S a
single string, or run on a platform that splits arguments, as in:

#!/usr/bin/env -S interpreter 'arg with  spaces' ${_ENV_END}

which on Linux calls "/usr/bin/env" "-S interpreter 'arg with  spaces'
${_ENV_END}" "script", but elsewhere calls "/usr/bin/env" "-S"
"interpreter" "'arg" "with" "spaces'" "${_ENV_END}" "script".  With the
magic marker in place, -S can be used as a toggle mode that says to look
if ANY later argument is the magic marker ${_ENV_END}, or maybe do this
look forward only if optarg for -S contains no spaces, because if a
space is present at all, we know the kernel did not split the shebang
line.  If no spaces are present in optarg, but ${_ENV_END} is present as
a later argument, then attempt to reconstruct the same command line
as if all arguments in between had been a single string (so that quote
and escape processing is performed on the remaining arguments of the
shebang line, but not on the script name or arguments).  If ${_ENV_END}
is not present, we can't make any assumptions, so we only perform string
splitting on optarg, rather than trying to reconstruct a string from a
subset of the remaining arguments.
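
The reconstruction step described above could be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
${_ENV_END} marker is the proposal from this mail, not an existing
feature, and rejoin_until_marker is a made-up helper name.  A caller
would presumably invoke it only when optarg contains no space.

```c
#include <stddef.h>
#include <string.h>

/* Marker name as proposed in this mail; hypothetical, not implemented. */
#define ENV_END_MARKER "${_ENV_END}"

/* Hypothetical helper: join argv[start] .. argv[marker - 1] into BUF,
   separated by single spaces, as if the kernel had passed one string.
   Returns the index of the first argument after the marker (the script
   name), or -1 if the marker is absent or BUF is too small.  */
static int rejoin_until_marker(int argc, char **argv, int start,
                               char *buf, size_t size)
{
    int end = -1;
    size_t used = 0;

    for (int i = start; i < argc; i++) {
        if (strcmp(argv[i], ENV_END_MARKER) == 0) {
            end = i;
            break;
        }
    }
    if (end < 0 || size == 0)
        return -1;   /* no marker: make no assumptions, do not rejoin */

    buf[0] = '\0';
    for (int i = start; i < end; i++) {
        size_t len = strlen(argv[i]);
        size_t sep = (i > start) ? 1 : 0;

        if (used + sep + len + 1 > size)
            return -1;
        if (sep)
            buf[used++] = ' ';   /* original run of blanks is unknown */
        memcpy(buf + used, argv[i], len);
        used += len;
        buf[used] = '\0';
    }
    return end + 1;
}
```

Given "env" "-S" "interpreter" "'arg" "with" "spaces'" "${_ENV_END}"
"script" and start pointing at "interpreter", the buffer would hold
"interpreter 'arg with spaces'" and the return value would index
"script"; quote processing then runs on the buffer only.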

Of course, reconstructing a single string can't tell what whitespace the
kernel ate in providing multiple arguments, so it will corrupt multiple
spaces and/or tabs down to a single space; perhaps the existing \_
escape sequence can be used to overcome the worst effects of that.  We'd
probably want to document that the expansion of ${_ENV_END} is always
empty, even if someone defines that variable in the environment?

Another question: Does the BSD implementation have any way to pass empty
strings as explicit arguments?  The code you posted turns:

#!/usr/bin/env -S sh -c '' echo

into "sh" "-c" "echo" "script", which did NOT preserve the empty string.
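
To make the concern concrete, here is a minimal sketch of a splitter
that does keep '' as an empty argument.  It is neither the posted code
nor the BSD implementation: split_keep_empty is a made-up name, and it
handles only blanks and single quotes (no escapes, double quotes, or
variable expansion).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: split S on spaces and tabs, honoring single
   quotes only.  Stores up to MAX malloc'd tokens in OUT and returns
   the count, or -1 on an unterminated quote, too many tokens, or
   allocation failure (already-stored tokens are freed on error).  */
static int split_keep_empty(const char *s, char **out, int max)
{
    int n = 0;

    while (*s) {
        while (*s == ' ' || *s == '\t')
            s++;
        if (!*s)
            break;

        /* Any non-blank character starts a token, including a quote,
           so a bare '' still yields an (empty) argument.  */
        char *tok = (n < max) ? malloc(strlen(s) + 1) : NULL;
        size_t len = 0;
        if (!tok)
            goto fail;

        while (*s && *s != ' ' && *s != '\t') {
            if (*s == '\'') {
                s++;
                while (*s && *s != '\'')
                    tok[len++] = *s++;
                if (!*s) {          /* unterminated quote */
                    free(tok);
                    goto fail;
                }
                s++;                /* skip closing quote */
            } else {
                tok[len++] = *s++;
            }
        }
        tok[len] = '\0';
        out[n++] = tok;
    }
    return n;

fail:
    while (n > 0)
        free(out[--n]);
    return -1;
}
```

With this approach the string "sh -c '' echo" splits into four
arguments, the third being the empty string.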

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
