On 18/01/2019 11:48, Austin Group Bug Tracker wrote:
----------------------------------------------------------------------
  (0004214) geoffclare (manager) - 2019-01-18 11:48
  http://austingroupbugs.net/view.php?id=953#c4214
----------------------------------------------------------------------
Alternative resolution, based on kre's suggestion on the mailing list.
[...]
If the value of the alias replacing the word ends in a <blank>, the shell
shall check the next command word for alias substitution; this process
shall continue until a word is found that is not a valid alias or an alias
value does not end in a <blank>.

to:

After a token has been categorized as type TOKEN (see [xref to 2.10.1]),
including (recursively) any token resulting from an alias substitution, the
TOKEN shall be subject to alias substitution if:

  - the TOKEN does not contain any quoting characters,
  - the TOKEN is a valid alias name (see XBD Section 3.10),
  - an alias with that name is in effect, and
  - the TOKEN did not result from an alias substitution of the same alias
    name at any earlier recursion level,

except that if the TOKEN meets the above conditions and would be recognized
as a reserved word (see [xref to 2.4 Reserved Words]) if it occurred in an
appropriate place in the input, it is unspecified whether the TOKEN is
subject to alias substitution.

I believe this wording also solves a problem not covered by the earlier proposed resolution:

  alias echo=echo
  echo hello

Alias recognition was specified to happen "[a]fter a token has been delimited" and "if the shell is not currently processing an alias of the same name". After the first alias substitution, however, the resulting token is only delimited when the space is seen again, and at that point the shell is no longer processing an alias of the same name, so the old wording would have the token substituted again. The proposed wording avoids this by making the check depend on whether the token resulted from an alias substitution.
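
To illustrate the "did not result from an alias substitution of the same alias name at any earlier recursion level" condition, here is a minimal sketch in shell. It is only a model, not how any shell implements aliases: it handles single-word alias values only, the function names alias_value and expand_alias are made up for this example, and the ls/ll aliases are hypothetical additions to the echo=echo case above.

```shell
# Hypothetical alias table: echo=echo (from the example above), ls=ll, ll=ls.
# Prints the value of alias $1 and returns 0, or returns 1 if no such alias.
alias_value() {
    case $1 in
        echo) printf '%s\n' echo ;;
        ls)   printf '%s\n' ll ;;
        ll)   printf '%s\n' ls ;;
        *)    return 1 ;;
    esac
}

# Repeatedly substitute aliases for the word $1, but never substitute a name
# that was already substituted at an earlier recursion level.
expand_alias() {
    word=$1
    seen=' '    # space-separated list of alias names substituted so far
    while value=$(alias_value "$word"); do
        case $seen in
            *" $word "*) break ;;    # word resulted from this same alias: stop
        esac
        seen="$seen$word "
        word=$value
    done
    printf '%s\n' "$word"
}

expand_alias echo    # substituted once, then left alone
expand_alias ls      # ls -> ll -> ls, then left alone
```

The point of tracking the names in seen, rather than checking what is "currently being processed", is that the result no longer depends on when the resulting token happens to be delimited.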

A corner case which should probably remain unspecified is when the resulting token partially results from an alias substitution of the same alias name at any earlier recursion level:

  alias echo='ec\'
  echo
  ho hello

Shells vary in how they treat this and I cannot see any value in requiring any particular behaviour here: no sensible script is going to rely on this. It could be left implicitly unspecified by the current proposed wording, but could alternatively be made explicitly unspecified by something along the lines of

  [...]
the TOKEN did not fully result from an alias substitution of the same alias name at any earlier recursion level, and optionally, the TOKEN did not partially result from an alias substitution of the same alias name at any earlier recursion level, and
  [...]

If it does not add this <space>, and the last character of the alias value
could be part of an operator token, it is unspecified whether the current
token is delimited before token recognition is applied to the character
(if any) that followed the TOKEN in the input.

Limiting this to operator tokens will mean, I believe, that in other cases (unterminated words, heredocs, comments) shells are required to treat the subsequent characters as part of whatever the alias substitution resulted in. Is that intentional? Many shells have some corner cases where the end of an alias substitution delimits a word:

  alias foo='echo $'
  foo((123))

When this is combined with heredocs, shells again behave differently:

  alias foo='cat <<EOF
  $'
  foo((123))
  EOF

I think the text can be reduced to

  If it does not add this <space>, it is unspecified whether the current
  token is delimited [...]

to let that be unspecified. This covers heredocs as well: they are specified to be treated as a single word, so I would say they implicitly have unspecified or undefined behaviour if the end of the alias substitution causes that word to be delimited. This allows shells to continue treating them in whatever way they do now. It does not cover comments, but I am not aware of (released versions of) shells that do not handle those.

This proposed resolution does not leave empty aliases (aliases not resulting in any token) unspecified. I mentioned them before, because they are mishandled by at least one shell:

  $ dash -c 'alias empty=
  empty'
  dash: 2: Syntax error: end of file unexpected

I'd be perfectly okay with considering that a bug in dash (I personally consider it exactly that, and it's easy to fix), but I do not know whether there are different situations in other shells that also fail for other reasons and require major changes to their implementations of aliases to solve.

Cheers,
Harald van Dijk
