Geoff Clare via austin-group-l at The Open Group wrote in
 <Z_5vwwvojBKpOJl7@localhost>:
 |Chet Ramey wrote, on 11 Apr 2025:
 |> The question is whether shells will agree on *which* empty fields are
 |> discarded. 
 |[...]
 |> For everyone who isn't on the bash mailing list, this is the message
 |> from kre that Steffen is referring to that explains the bash and NetBSD
 |> sh behavior.
 |> 
 |> https://lists.gnu.org/archive/html/bug-bash/2025-03/msg00030.html
 |
 |Okay, with Chet's clue and kre's detailed description, I now see that

Disclaimer: I will agree with your conclusion below.

*But*: no, his description is not right, that is, not compliant with
the standard as it stands.  To begin with, I say

    | I find this logical since before the
    | resplit we have ":a:" + "a" + "" + "a", and the trailing ":" in
    | the first only delimits the field of "a",

then he says

  The ':' really has little to do with it, the space is a data
  char, it isn't in IFS, so isn't going to be touched by field
  splitting, and by this time, shell parsing is long done, so its
  tokenisation (which drops unquoted whitespace normally) is no
  longer relevant either.

but the standard says

    * Expands to the positional parameters, starting from one,
      initially producing one field for each positional parameter that
      is set. When the expansion occurs in a context where field
      splitting will be performed, any empty fields may be discarded
      and each of the non-empty fields shall be further split as
      described in Section 2.6.5.

and 2.6.5 makes clear that, of course, IFS is what drives the
split.  And ":" *is* $IFS here.  So of course that *is* where the split
occurs, and that means that ":a:" is split into ''+a, and not into
''+a+''.  There simply is no such field.
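
A minimal sketch of that claim, assuming a POSIX-style shell such as bash or dash and an ordinary variable rather than $* (the names `v` and `old_ifs` are only for illustration): with IFS set to ":", the trailing ":" terminates the field "a" instead of opening a new empty one.

```shell
# Split the value ":a:" on ":" via an ordinary unquoted variable expansion.
v=':a:'
old_ifs=$IFS
IFS=:
set -- $v        # unquoted on purpose: field splitting is wanted here
IFS=$old_ifs
# The leading ":" delimits an empty first field; the trailing ":" only
# terminates "a", so no third, empty field is created.
printf '%s\n' "$#"
printf '<%s>\n' "$@"
```

In bash and dash this should print 2, then <> and <a>, that is ''+a.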

*That* field *only* exists in the quoted variant, i.e.

              echo $#,'*'="$*"/$*,

    -4,*=:a::a::a/ a  a  a,$

^bash, NetBSD*

    +4,*=:a::a::a/ a a  a,$

^my one without compat hack.
Quoted, because the standard says, right next to the above:

      When the expansion occurs in a context where field splitting
      will not be performed, the initial fields shall be joined to
      form a single field with the value of each parameter
      separated by the first character of the IFS variable if IFS
      contains at least one character, or separated by a <space>
      if IFS is unset, or with no separation if IFS is set to
      a null string.

*This*, and only this, results in ":a::a::a".  And only *that*
yields bash's result when the standard field splitting algorithm
is then applied.
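
The joining rule and the subsequent resplit can be sketched as follows, assuming the four positional parameters ":a:", "a", "" and "a" and IFS set to ":" (the variable names `joined`, `count` and `nfields` are hypothetical). The sketch only shows the quoted join and what 2.6.5 splitting makes of the joined string; it makes no claim about what a given shell does for an unquoted $* directly.

```shell
set -- ':a:' 'a' '' 'a'
old_ifs=$IFS
IFS=:
count=$#
joined="$*"          # quoted: parameters joined by the first character of IFS
set -- $joined       # unquoted variable: ordinary 2.6.5 field splitting
nfields=$#
IFS=$old_ifs
printf '%s,*=%s\n' "$count" "$joined"
printf '%s fields:' "$nfields"; printf ' <%s>' "$@"; printf '\n'
```

In bash and dash this should print "4,*=:a::a::a" followed by "6 fields: <> <a> <> <a> <> <a>", which, echoed with spaces in between, is the " a  a  a" shown for bash above.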

 |my expectation of "empty fields may be discarded" leading to exactly
 |two conforming behaviours was wrong.  I had naïvely assumed that each
 |shell would either discard all empty fields or retain all empty fields.
 |
 |According to kre, bash only discards one trailing empty field and
 |retains all others, and mksh discards leading and trailing empty
 |fields but retains intermediate ones.

Where do you see any field that can be retained?
I really would like to become enlightened.

 |These subtleties can easily account for the four different behaviours
 |of ksh88, ksh93/dash, bash, and mksh.
 |
 |So I'm now of the opinion that no change is needed to the normative
 |text, but it might be worth adding some informative text pointing out
 |that "empty fields may be discarded" doesn't mean that shells can
 |only either discard all empty fields or retain all empty fields.

Thing is that no shell I know of honours the standard here, at
least not if the first character of $IFS is not whitespace.  As long
as it is whitespace everything is just fine: bash, NetBSD sh and
NetBSD ksh act just like the standard says.
Therefore my code, which follows the POSIX wording exactly, only gets
the exact results of the mentioned three shells with this hack:

    /* In order to be compatible with bash, NetBSD sh and NetBSD ksh,
     * at minimum, we need to deviate from POSIX standardized behavior,
     * and field split the quoted variant instead!
     * This applies to $@ as well as $*, and their result-set variants */
    if(/**spcp->spc_ifs != '\0' &&*/ !su_cs_is_space(*spcp->spc_ifs)){
        cp = n_var_vlook((!rset ? n_star : "^*"), TRU1);
        /* ^aka cp = get-quoted-value-of("$*"); */
        goto jfs_split;
    }

 |-- 
 |Geoff Clare <g.cl...@opengroup.org>
 |The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
 --End of <Z_5vwwvojBKpOJl7@localhost>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
