Geoff Clare via austin-group-l at The Open Group wrote in
<Z_5vwwvojBKpOJl7@localhost>:
|Chet Ramey wrote, on 11 Apr 2025:
|> The question is whether shells will agree on *which* empty fields are
|> discarded.
|[...]
|> For everyone who isn't on the bash mailing list, this is the message
|> from kre that Steffen is referring to that explains the bash and NetBSD
|> sh behavior.
|>
|> https://lists.gnu.org/archive/html/bug-bash/2025-03/msg00030.html
|
|Okay, with Chet's clue and kre's detailed description, I now see that

Disclaimer: i will agree with your conclusion below.
*But*.  No, his description is not right.
That is, not compliant with the standard as it stands.
To what I first said

  | I find this logical since before the
  | resplit we have ":a:" + "a" + "" + "a", and the trailing ":" in
  | the first only delimits the field of "a",

he then says

  The ':' really has little to do with it, the space is a data
  char, it isn't in IFS, so isn't going to be touched by field
  splitting, and by this time, shell parsing is long done, so its
  tokenisation (which drops unquoted whitespace normally) is no
  longer relevant either.

but the standard says

  *
    Expands to the positional parameters, starting from one,
    initially producing one field for each positional parameter
    that is set.  When the expansion occurs in a context where
    field splitting will be performed, any empty fields may be
    discarded and each of the non-empty fields shall be further
    split as described in Section 2.6.5.

and 2.6.5 describes that, of course, IFS is the thing we have to go
for.  And : *is* $IFS.  So of course it *is* where the split occurs,
and that means that ":a:" is split into ''+a, and not into ''+a+''.
There simply is no such field.
*That* field *only* exists in the quoted variant, i.e.

  echo $#,'*'="$*"/$*,
  -4,*=:a::a::a/ a a a,$
   ^bash, NetBSD*
  +4,*=:a::a::a/ a a a,$
   ^my one without compat hack.

Quoted, because the standard says, adjacently to the above

    When the expansion occurs in a context where field splitting
    will not be performed, the initial fields shall be joined to
    form a single field with the value of each parameter separated
    by the first character of the IFS variable if IFS contains at
    least one character, or separated by a <space> if IFS is unset,
    or with no separation if IFS is set to a null string.

*This*, and only this, results in ":a::a::a".
And only *that* produces the result of bash when the standard
algorithm for field splitting is then applied.

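
The two readings discussed above can be sketched in POSIX shell.  This
is only an illustration, not code from any of the shells mentioned: the
helper names count_args, split_each and split_joined are hypothetical,
IFS is assumed to be ":" and the positional parameters are assumed to
be ":a:", "a", "" and "a" as in the example above.  split_each splits
every parameter on its own and takes the "may be discarded" option for
the empty parameter; split_joined field splits the joined "$*" form
instead.

```sh
#!/bin/sh
# Hypothetical helper: record how many arguments (fields) it received.
count_args() { nf=$#; }

# Reading 1: split every positional parameter separately.  A parameter
# that is empty to begin with yields no field here (it is discarded).
split_each() (
	IFS=:; set -f           # subshell body: settings do not leak
	total=0
	for p in "$@"; do
		count_args $p   # unquoted on purpose: field splitting
		total=$((total + nf))
	done
	printf '%s\n' "$total"
)

# Reading 2: join with the first IFS character (the "$*" form) first,
# then field split the joined string.
split_joined() (
	IFS=:; set -f
	joined="$*"
	count_args $joined      # unquoted on purpose: field splitting
	printf '%s -> %s\n' "$joined" "$nf"
)

split_each ':a:' a '' a
split_joined ':a:' a '' a
```

With bash or dash this should print 4 for the first call and
":a::a::a -> 6" for the second: the two extra fields are the empty
ones that only exist once the parameters have been joined.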
|my expectation of "empty fields may be discarded" leading to exactly
|two conforming behaviours was wrong. I had naïvely assumed that each
|shell would either discard all empty fields or retain all empty fields.
|
|According to kre, bash only discards one trailing empty field and
|retains all others, and mksh discards leading and trailing empty
|fields but retains intermediate ones.

Where do you see any field that can be retained?
I really would like to become enlightened.

|These subtleties can easily account for the four different behaviours
|of ksh88, ksh93/dash, bash, and mksh.
|
|So I'm now of the opinion that no change is needed to the normative
|text, but it might be worth adding some informative text pointing out
|that "empty fields may be discarded" doesn't mean that shells can
|only either discard all empty fields or retain all empty fields.

'Thing is that there is no shell in existence that i know that
honours the standard at all!  At least not if the first character
of $IFS is not whitespace.  Everything is just fine until then,
bash and NetBSD sh and NetBSD ksh act just like the standard says,
except if the first character of $IFS is not whitespace.

Therefore my code that complies exactly with POSIX words gets the
exact results of the mentioned three shells with the hack

  /* In order to be compatible with bash, NetBSD sh and NetBSD ksh, at minimum, we need to
   * deviate from POSIX standardized behavior, and field split the quoted variant instead!
   * This applies to $@ as well as $*, and their result-set variants */
  if(/**spcp->spc_ifs != '\0' &&*/ !su_cs_is_space(*spcp->spc_ifs)){
     cp = n_var_vlook((!rset ? n_star : "^*"), TRU1);
     ^aka cp = get-quoted-value-of("$*");
     goto jfs_split;
  }

|--
|Geoff Clare <g.cl...@opengroup.org>
|The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
--End of <Z_5vwwvojBKpOJl7@localhost>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)