Re: shell: swapping var values in one command, plus some here doc stuff

Geoff Clare via austin-group-l at The Open Group Tue, 07 Sep 2021 02:29:57 -0700

Robert Elz wrote, on 06 Sep 2021:
>
>   | Which ones do it the other way?  All the shells I tried (bash, dash,
>   | ksh88, ksh93, mksh) do it "right".
> 
> FreeBSD (or at least the slightly old version I have to test) and NetBSD:
> 
> sh $ a=foo; b=bar; a=$b b=$a; echo $a$b
> barfoo
> sh $ a=foo; b=bar; unset t; t=$a a=$b b=$t; echo $a$b
> bar
> 
> That's from the NetBSD sh, FreeBSD's is (or was) just the same.
> That means, I think, that once upon a time, dash would have been
> like that too.
> 
> Which is "right" doesn't seem entirely clear to me.  ash, from which
> those 3 are derived generally tried to copy the original Bourne sh
> (7th edition, not some Solaris or Sys V version Joerg) (and not any ksh
> version) for things that existed in it, so it may be that it did it this
> way as well.   The text in POSIX doesn't really say one way or the other.


The fact that POSIX explicitly says it's unspecified for assignments
before a command that is not a special built-in or function implies
that (the authors believed) it is specified for other cases.

When the POSIX shell requirements were originally developed for
POSIX.2-1992 the intention was to describe the behaviour of ksh88
with a few deliberate changes (as documented in the rationale, where
nothing is said about a change in this area).

Therefore the intention was for POSIX to require the ksh88 behaviour
here, so I believe that's the "right" behaviour even if the standard
does not make it entirely clear that it is required.
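
As a rough illustration (a sketch added for clarity, not taken from the
standard or from ksh88 documentation), a script can report which order the
running shell uses.  The variable name swap_result and its two labels are
made up for this example; the two values they classify are the ones
reported in this thread.

```shell
# Probe the order in which the running shell handles several assignments
# given with no command name.  swap_result is a name made up for this sketch.
a=foo; b=bar
a=$b b=$a
case "$a$b" in
    barbar) swap_result="sequential: b=\$a saw the new value of a" ;;
    barfoo) swap_result="expand-first: b=\$a saw the old value of a" ;;
    *)      swap_result="unexpected: $a$b" ;;
esac
echo "$swap_result"
```

The NetBSD and FreeBSD sh output quoted above (barfoo) corresponds to the
second branch; the first branch is presumably what the shells listed as
doing it "right" report.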

>   | The one that's in effect when the <<-EOF redirection is evaluated,
>   | i.e. /some/file.  Again, all the shells I tried do it that way.
> 
> In that situation the FreeBSD shell writes to whatever fd 3 was before
> the command...
> 
> fbsh $ exec 3>/tmp/bar
> fbsh $ cat 3>/tmp/foo <<EOF
> text
> $(echo foo >&3)
> EOF
> text
> 
> fbsh $ cat /tmp/foo
> fbsh $ cat /tmp/bar
> foo

And this must be another case where dash changed to match other shells.
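
The same probe can be scripted without fixed paths.  This is only a
sketch: the temporary directory (created with mktemp -d, which is widely
available but not required by POSIX), the file names inner and outer, and
the variable fd_result are all assumptions of the example.

```shell
# Probe: which fd 3 does a command substitution inside a here-doc write to?
dir=$(mktemp -d) || exit 1
exec 3>"$dir/outer"              # fd 3 as it was before the command
cat 3>"$dir/inner" <<EOF >/dev/null
$(echo foo >&3)
EOF
exec 3>&-                        # close the outer fd 3 again
if [ -s "$dir/inner" ]; then
    fd_result="inner: fd 3 as redirected on the cat command"
elif [ -s "$dir/outer" ]; then
    fd_result="outer: fd 3 as it was before the command"
else
    fd_result="neither file was written"
fi
echo "$fd_result"
rm -rf "$dir"
```

The FreeBSD transcript quoted above corresponds to the "outer" branch.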

>   | The entire script is being split into tokens.  2.3 says "When an
>   | io_here token has been recognized by the grammar (see Section 2.10),
>   | one or more of the subsequent lines immediately following the next
>   | NEWLINE token form the body of one or more here-documents and shall be
>   | parsed according to the rules of Section 2.7.4."
>   | 2.7.4 says "The here-document shall be treated as a single word that
>   | begins after the next <newline> and continues until there is a line
>   | containing only the delimiter and a <newline>, with no <blank>
>   | characters in between."
> 
> Sure, but that still doesn't jump up and say to me that the \newlines
> must be removed before the end token is recognised.   Doing so means that
> there can't be one simple common routine for collecting here doc data,
> as whether that happens depends on whether the end word was quoted or
> not.
> 
> And like I said, this isn't a case where just a couple of perhaps outlier
> shells do it what you believe is the "wrong" way.  Ignoring ksh93 (which
> would probably be on your side of the fence), there's about a 50-50 split
> between shells.
> 
> Of course, in real life, no-one puts \newlines in the middle of the end
> delimiter line, so this makes not one jot of difference to anything that
> matters.

Agreed, but it matters to avoid prematurely finding the delimiter when
it occurs on a continuation line and so should be part of the here-doc
(see below).  Getting that right means that <backslash><newline> within
the delimiter would naturally also be handled correctly.

>   | So the requirement from 2.2.1 is that <backslash><newline> is removed
>   | before the here-doc is tokenised as a single word as per 2.3 and 2.7.4.
> 
> And it seems that lots of people don't jump to that conclusion (one that
> doesn't is yash, which is supposed to be an implementation written using
> the standard as its rulebook - unlike most of the others which mostly just
> copy some other implementation's behaviour, in many cases of course because
> their origins are from before any standard existed).
> 
> yash $ cat <<EOF
> > foo
> > E\
> > O\
> > F
> > EOF
> foo
> EOF
> yash $
> 
> (Sorry, I haven't turned off PS2 in my yash shell setup).   All 3 ash
> derived shells are the same, so probably the original Bourne sh was too.
> 
> Note that things aren't as clear as you make out, as the end delimiter
> is not part of the here doc word, and so isn't subject to its content
> or parsing rules (it is not part of the double quoted, more or less, word).

But, as I mentioned above, if the shell looks for the delimiter before
it removes <backslash><newline> instead of after, it will get this
case wrong:

$ cat <<EOF
> foo\
> EOF
> EOF
fooEOF
$ 

The foo\ and the first EOF line are required to be parsed as part of the
here-doc.  The second EOF line is then the delimiter.
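
A non-interactive version of that test, as a sketch: emit_body,
heredoc_out and heredoc_result are names invented here.  A shell that
looks for the delimiter too early would also try to run the second EOF
line as a command and report an error.

```shell
# Wrap the here-document in a function so its body can be captured.
emit_body() {
cat <<EOF
foo\
EOF
EOF
}
heredoc_out=$(emit_body)
if [ "$heredoc_out" = "fooEOF" ]; then
    heredoc_result="continuation joined before the delimiter check"
else
    heredoc_result="delimiter found too early (got: $heredoc_out)"
fi
echo "$heredoc_result"
```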

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
