Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)

Harald van Dijk via austin-group-l at The Open Group Sun, 12 Mar 2023 19:20:07 -0700

On 12/03/2023 19:10, Robert Elz wrote:

     Date:        Fri, 10 Mar 2023 23:40:18 +0000
     From:        "Harald van Dijk via austin-group-l at The Open Group" 
<austin-group-l@opengroup.org>
     Message-ID:  <ec9488fb-943d-dcaf-0885-c7d76611d...@gigawatt.nl>


   | Based on past experiences, I am assuming the e-mail this is a reply to
   | was meant to be sent to the list and I am quoting it in full and
   | replying on the list for that reason.

Thanks, and yes, it was - my MUA absolutely believes in the one true
meaning of Reply-To (where the author of the message to which the reply
is being sent requests that replies be sent -- to addresses in that field,
and no others).   I need to manually override it when I choose to ignore
that request and send to different addresses (which is allowed, but in
general, done only with proper consideration of why).   This list always
directs that all replies go only to the author of the message, and never
to the list itself.   Irritating...

I know. Even more frustrating is the reasoning we have been given for it ("to better handle DMARC email authentication for messages" -- no, DMARC absolutely does not require that).

   | Sourcing arbitrary script fragments and having assurance that they do
   | not exit the shell is not reasonable, as the arbitrary script fragment
   | could contain an 'exit' command.

Of course, deliberate exits aren't the issue, only accidental ones.

   | Beyond shell options and variable assignments not persisting in the
   | parent shell, are there any other issues you see with running them in a
   | subshell?

The whole point of many . scripts is to alter the shell's environment,
if they were just arbitrary commands, not intended to affect the current
shell, they'd just be sh scripts, and run the normal way.   The very act
of using the '.' command more or less means "must run in the current shell".

"(. file)" is silly, "file" would accomplish the same thing (if executable,
otherwise "sh < file" after finding the path to file) in a more obvious way.

This does not result in the same behaviour. (. file) can be used to inherit unexported shell variables, shell functions, aliases, etc. from the parent shell that would not be available to an explicitly launched sh subprocess.
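A minimal sketch of that difference, assuming a POSIX-like sh and a working mktemp -d (widely available, but not required by POSIX). The names demo_dir, frag, unexported and greet are made up for illustration; aliases are left out because their handling in non-interactive shells varies.

```shell
# Illustrative sketch only; demo_dir, frag, unexported and greet are made-up names.
demo_dir=$(mktemp -d)        # assumption: mktemp -d exists on this system
frag=$demo_dir/frag.sh

# The fragment only reports what it can see from its caller.
cat > "$frag" <<'EOF'
echo "var=${unexported:-unset}"
if command -v greet >/dev/null 2>&1; then greet; else echo "greet: not defined"; fi
EOF

unexported=hello                     # set in the parent, never exported
greet() { echo "greet: defined"; }   # shell function defined in the parent

in_subshell=$( (. "$frag") )   # subshell of this shell: sees both
in_child=$(sh "$frag")         # separately launched sh: sees neither

printf '%s\n' '(. file):' "$in_subshell" 'sh file:' "$in_child"
rm -rf "$demo_dir"
```

The subshell form reports the variable and the function; the separately launched sh reports neither, since neither was exported.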

I agree that it is likely that a . script would be sourced in order to change the current shell environment, but I do not agree that it is the only legitimate use of it, and I had assumed based on your "without risking the shell exiting" that you were considering one of those other uses.

Apart from options and variables, . files often define functions, change
the umask and perhaps ulimit, and may alter the current directory, set exit
(or other) traps, ...  anything in fact.


Sure, they are in the same category as options and variables.
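To make "the same category" concrete, here is a rough sketch (again assuming a POSIX-like sh with mktemp -d) of a fragment that changes a variable, the umask and the current directory and defines a function. Run as (. file), none of that reaches the caller; run as . file, all of it does. The names demo_dir, settings.sh, setting and demo_helper are hypothetical.

```shell
# Illustrative sketch only; demo_dir, settings.sh, setting and demo_helper are made-up names.
demo_dir=$(mktemp -d)        # assumption: mktemp -d exists on this system
cat > "$demo_dir/settings.sh" <<'EOF'
setting=changed
umask 077
cd /
demo_helper() { :; }
EOF

setting=original
start_dir=$PWD
start_umask=$(umask)

# Subshell: the fragment runs, but none of its changes reach the caller.
(. "$demo_dir/settings.sh")
subshell_result=$setting
[ "$PWD" = "$start_dir" ] && [ "$(umask)" = "$start_umask" ] &&
    subshell_result="$subshell_result, cwd and umask unchanged"
command -v demo_helper >/dev/null 2>&1 ||
    subshell_result="$subshell_result, no helper"

# Current shell: every change persists.
. "$demo_dir/settings.sh"
dot_result="$setting, cwd=$PWD"
command -v demo_helper >/dev/null 2>&1 &&
    dot_result="$dot_result, helper defined"

# Put the caller's directory and umask back, then clean up.
cd "$start_dir" && umask "$start_umask"
rm -rf "$demo_dir"

printf '%s\n' "(. file): $subshell_result" ". file:   $dot_result"
```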

As an example, consider what you might put in your .profile or $ENV
file - those are run in more or less the same way as a '.' file (just
without the PATH search to locate the file).   XRAT C.2.5.3 says almost
exactly that about ENV.

Indeed. Worth explicitly pointing out here is that they are only used by interactive shells, so in this comparison, only the behaviour of the '.' command in interactive shells is relevant.

                          (Strangely though, even though .profile is
mentioned several times as a place where things can be set, it doesn't
appear in the standard (as something that shells process) at all - which
is kind of odd really, since it is considerably older than ENV, and as best
I can tell, supported by everything.   The closest that we get is a mention
in XRAT that "some shells" run it at startup of a login shell.   Which are
the other shells?  That is, the ones that don't run .profile?   And I don't
mean in situations like bash, which prefers .bash_profile if it exists.


bash appears to disable the reading of .profile in POSIX mode entirely.

yash only ever appears to source .yash_profile, it does not fall back to .profile if no .yash_profile exists.

I doubt that you'd want those scripts run in a subshell environment,
I also doubt that you want the shell to exit if there's an error in
one of them.   How would you ever be able to log in (and start a shell)
if it exited before you ever had a chance to run a command?   If you can't
log in, because your shell won't start, how would you ever fix the problem?

As best I can tell (I have done very limited testing of this) shells tend
to simply abort processing one of those scripts upon encountering an error
(like a syntax error, etc - not executing "exit" - that should exit) and
just go on to the next step of initialising the shell.   They don't just
exit because there's a syntax error - most shells report the error (not all),
but I couldn't find one which exits.

This is indeed a problem. My own testing supports your conclusion, shells are in agreement that a syntax error here does not result in the shell terminating, and does result in the remainder of the envfile being ignored. An invalid option in an envfile, however, is different. That does cause bash to terminate (tested with 'set -$') all the way from bash 2 to bash 5.2. Other shells do not terminate, but vary in whether the remainder of the envfile is executed. In mine, in ksh, and in yash, it is. In other shells, it isn't.

I do not know what the most appropriate behaviour is, but I know the frustration of shells not starting. My own home system's .profile has, on occasion, triggered a bug in libedit that resulted in my shell segfaulting. This was difficult to recover from. And to some extent, this does actually also apply to 'exit' commands put in such files.

My main point here is that, despite the rationale, these files are already handled differently from the '.' command, and I agree with your concerns, it is important that their handling is not made more strict without very careful consideration of the impact of such a change.

   | You have left out bash 4 here.

For the same reason I didn't include ancient versions of all the
other shells either.   That's obsolete, not going to change in the
future, and has been replaced.   [And because I happen not to have
a binary of it at the minute - I could make one, I do have sources,
just don't really see the need.]

I suspect version 4 of bash is still widely used: it is still the version provided by several LTS GNU/Linux distributions, for instance Ubuntu 18.04 which will remain supported and continue to receive updates until April this year. Its behaviour will be what its users are accustomed to.

Likewise, Geoff has previously commented on this list based on bash 3 behaviour, on the basis that that is the version provided on macOS.

The reason that read errors are different in this regard (at least in
the main script, not in . files -- not sure it is possible to have an
equivalent to a read error in "eval" - perhaps an EILSEQ (bad char encoding)
in the string might count?

When I added multibyte support to my shell, I encountered too many problems when treating bytes that do not form valid multibyte characters as an error, and I did not find any other shell that did so. (A notable almost-exception there is yash, which *would* error if it allowed such strings through in the first place, but it doesn't.) So I matched my shell to all the other shells -- other than yash -- that I could find, and treat bytes that do not form valid multibyte characters as separate characters in their own right.

                              and that the shell needs to exit, is that
there is nothing else that it can do - if reading a command just gives
an error, deciding to go back and read another command is folly.   That's
different than syntax errors, they're not really the same thing, and don't
necessarily need to be treated the exact same way.

Actually, this is not the case. I hinted that when actually implementing this, other shell authors would notice the same. There are several categories of read errors that the shell *can* handle, and that dash already handled before I made changes to it.

If a read() call fails because of EINTR, because the shell received a signal, it should obviously just process the signal and then continue reading.

If a read() call fails because of EAGAIN or EWOULDBLOCK, and the file descriptor has O_NONBLOCK set, it is possible for the shell to clear the O_NONBLOCK flag and try again. dash does so for EWOULDBLOCK, my shell has not yet made changes to this, and I have no opinion at this time on whether that is the right thing to do.

There are potentially more read errors that shells can handle without exiting.


Cheers,
Harald van Dijk

Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)
