Robert Elz wrote, on 06 Sep 2021:
>
> | Which ones do it the other way?  All the shells I tried (bash, dash,
> | ksh88, ksh93, mksh) do it "right".
>
> FreeBSD (or at least the slightly old version I have to test) and NetBSD:
>
> sh $ a=foo; b=bar; a=$b b=$a; echo $a$b
> barfoo
> sh $ a=foo; b=bar; unset t; t=$a a=$b b=$t; echo $a$b
> bar
>
> That's from the NetBSD sh, FreeBSD's is (or was) just the same.
> That means, I think, that once upon a time, dash would have been
> like that too.
>
> Which is "right" doesn't seem entirely clear to me.  ash, from which
> those 3 are derived generally tried to copy the original Bourne sh
> (7th edition, not some Solaris of Sys V version Joerg) (and not any ksh
> version) for things that existed in it, so it may be that it did it this
> way as well.  The text in POSIX doesn't really say one way or the other.

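For anyone who wants to check their own shell, the first test case
quoted above can be wrapped in a small probe.  This is only a sketch:
the function name and the two labels it prints are made up for
illustration and are not taken from the standard or from any shell's
documentation.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): reports how this shell
# handles several assignments on one line with no command name.
probe_assign_order() {
    a=foo; b=bar
    a=$b b=$a                 # the test case quoted above
    case "$a$b" in
        barbar) echo "sequential" ;;      # b=$a saw the new value of a
        barfoo) echo "expanded-first" ;;  # b=$a saw the old value of a
        *)      echo "unexpected: $a$b" ;;
    esac
}

probe_assign_order
```

A shell that behaves like the ones listed as doing it "right" should
report "sequential"; one that behaves like the NetBSD sh output quoted
above should report "expanded-first".
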
The fact that POSIX explicitly says it's unspecified for assignments
before a command that is not a special built-in or function implies
that (the authors believed) it is specified for other cases.

When the POSIX shell requirements were originally developed for
POSIX.2-1992 the intention was to describe the behaviour of ksh88
with a few deliberate changes (as documented in the rationale, where
nothing is said about a change in this area).  Therefore the
intention was for POSIX to require the ksh88 behaviour here, so I
believe that's the "right" behaviour even if the standard is not
entirely clear it is required.

> | The one that's in effect when the <<-EOF redirection is evaluated,
> | i.e. /some/file.  Again, all the shells I tried do it that way.
>
> In that situation the FreeBSD shell writes to whatever fd 3 was before
> the command...
>
> fbsh $ exec 3>/tmp/bar
> fbsh $ cat 3>/tmp/foo <<EOF
> text
> $(echo foo >&3)
> EOF
> text
>
> fbsh $ cat /tmp/foo
> fbsh $ cat /tmp/bar
> foo

And this must be another case where dash changed to match other shells.

> | The entire script is being split into tokens.  2.3 says "When an
> | io_here token has been recognized by the grammar (see Section 2.10),
> | one or more of the subsequent lines immediately following the next
> | NEWLINE token form the body of one or more here-documents and shall be
> | parsed according to the rules of Section 2.7.4."
> | 2.7.4 says "The here-document shall be treated as a single word that
> | begins after the next <newline> and continues until there is a line
> | containing only the delimiter and a <newline>, with no <blank>
> | characters in between."
>
> Sure, but that still doesn't jump up and say to me that the \newlines
> must be removed before the end token is recognised.  Doing so means that
> there can't be one simple common routine for collecting here doc data,
> as whether that happens depends on whether the end word was quoted or
> not.
>
> And like I said, this is a case where there aren't just a couple of perhaps
> outlier shells doing it what you believe is the "wrong" way, ignoring ksh93
> (which would probably be on your side of the fence) this is a case where
> there's about a 50-50 split between shells.
>
> Of course, in real life, no-one puts \newlines in the middle of the end
> delimiter line, so this makes not one jot of difference to anything that
> matters.

Agreed, but it matters to avoid prematurely finding the delimiter when
it occurs on a continuation line and so should be part of the here-doc
(see below).  Getting that right means that <backslash><newline> within
the delimiter would naturally also be handled correctly.

> | So the requirement from 2.2.1 is that <backslash><newline> is removed
> | before the here-doc is tokenised as a single word as per 2.3 and 2.7.4.
>
> And it seems that lots of people don't jump to that conclusion (one that
> doesn't is yash, which is supposed to be an implementation written using
> the standard as its rulebook - unlike most of the others which mostly just
> copy some other implementations behaviour, in many cases of course because
> their origins are from before any standard existed).
>
> yash $ cat <<EOF
> > foo
> > E\
> > O\
> > F
> > EOF
> foo
> EOF
> yash $
>
> (Sorry, I haven't turned off PS2 in my yash shell setup).  All 3 ash
> derived shells are the same, so probably the original Bourne sh was too.
>
> Note that things aren't as clear as you make out, as the end delimiter
> is not part of the here doc word, and so isn't subject to its content
> or parsing rules (it is not part of the double quoted, more or less, word).

But, as I mentioned above, if the shell looks for the delimiter before
it removes <backslash><newline> instead of after, it will get this case
wrong:

    $ cat <<EOF
    > foo\
    > EOF
    > EOF
    fooEOF
    $

The foo\ and the first EOF line are required to be parsed as part of
the here-doc.  The second EOF line is then the delimiter.
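
The same case can be put in a script so the result is checked
rather than read off the terminal.  This is a minimal sketch; the
function name and the messages are hypothetical, chosen only for
illustration.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): feeds the here-doc from
# the example above to cat so its output can be captured and compared.
heredoc_continuation() {
    cat <<EOF
foo\
EOF
EOF
}

result=$(heredoc_continuation)
if [ "$result" = "fooEOF" ]; then
    # <backslash><newline> was removed before the delimiter search
    echo "continuation joined before the delimiter check"
else
    # the first EOF line was taken as the delimiter
    echo "delimiter found early; got: $result"
fi
```

In a shell that finds the delimiter too early, the second EOF line is
left over as a command, so an error message about EOF on stderr is a
likely side effect as well.
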

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
