On 18/01/2019 11:48, Austin Group Bug Tracker wrote:
----------------------------------------------------------------------
(0004214) geoffclare (manager) - 2019-01-18 11:48
http://austingroupbugs.net/view.php?id=953#c4214
----------------------------------------------------------------------
Alternative resolution, based on kre's suggestion on the mailing list.
[...]
If the value of the alias replacing the word ends in a <blank>, the shell
shall check the next command word for alias substitution; this process
shall continue until a word is found that is not a valid alias or an alias
value does not end in a <blank>.</blockquote>to:<blockquote>After a token
has been categorized as type <b>TOKEN</b> (see [xref to 2.10.1]), including
(recursively) any token resulting from an alias substitution, the
<b>TOKEN</b> shall be subject to alias substitution if: <ul> <li>the
<b>TOKEN</b> does not contain any quoting characters,</li> <li>the
<b>TOKEN</b> is a valid alias name (see XBD Section 3.10),</li> <li>an
alias with that name is in effect, and</li> <li>the <b>TOKEN</b> did not
result from an alias substitution of the same alias name at any earlier
recursion level,</li> </ul> except that if the <b>TOKEN</b> meets the above
conditions and would be recognized as a reserved word (see [xref to 2.4
Reserved Words]) if it occurred in an appropriate place in the input, it is
unspecified whether the <b>TOKEN</b> is subject to alias substitution.
I believe this wording also solves a problem not covered by the earlier
proposed resolution:
alias echo=echo
echo hello
Alias recognition was specified to happen "[a]fter a token has been
delimited" and "if the shell is not currently processing an alias of the
same name", but after the first alias substitution, the resulting token
is only delimited when the space is seen again, and at that point, the
shell is no longer currently processing an alias of the same name. This
proposed wording covers it by instead letting it depend on whether the
token resulted from alias substitution.
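For anyone who wants to try this, here is a minimal sketch that runs the example above in a child shell. The function name self_alias_demo is made up for illustration, and the sketch assumes that the sh found on PATH is a POSIX shell that expands aliases when running non-interactively. It only shows what an implementation does; it says nothing about what the old or new wording requires.

```shell
# Hypothetical helper, not from the bug report: run the self-referencing
# alias example in a child shell so the current shell's aliases are untouched.
# Assumption: `sh` expands aliases in non-interactive mode.
self_alias_demo() {
    # The alias definition and its use are on separate lines, so the
    # definition has been executed before the second line is parsed.
    sh -c 'alias echo=echo
echo hello'
}

self_alias_demo
```

A shell that stops after one substitution of the same alias name prints hello here instead of looping.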
A corner case which should probably remain unspecified is when the
resulting token partially results from an alias substitution of the same
alias name at any earlier recursion level:
alias echo='ec\'
echo
ho hello
Shells vary in how they treat this and I cannot see any value in
requiring any particular behaviour here: no sensible script is going to
rely on this. It could be left implicitly unspecified by the current
proposed wording, but could alternatively be made explicitly unspecified
by something along the lines of:
[...]
the TOKEN did not fully result from an alias substitution of the same
alias name at any earlier recursion level, and
optionally, the TOKEN did not partially result from an alias
substitution of the same alias name at any earlier recursion level, and
[...]
If it does not add this <space>, and the last character of
the alias value could be part of an operator token, it is unspecified
whether the current token is delimited before token recognition is applied
to the character (if any) that followed the <b>TOKEN</b> in the input.
Limiting this to operator tokens will mean, I believe, that in other
cases (unterminated words, heredocs, comments) shells are required to
treat the subsequent characters as part of whatever the alias
substitution resulted in. Is that intentional? Many shells have some
corner cases where the end of an alias substitution delimits a word:
alias foo='echo $'
foo((123))
When this is combined with heredocs, shells again behave differently:
alias foo='cat <<EOF
$'
foo((123))
EOF
I think the text can be reduced to
If it does not add this <space>, it is unspecified whether the current
token is delimited [...]
to let that be unspecified. This covers heredocs as well: they are
specified to be treated as a single word, so I would say they implicitly
have unspecified or undefined behaviour if the end of the alias
substitution causes that word to be delimited. This allows shells to
continue treating them in whatever way they do now. It does not cover
comments, but I am not aware of any shells (in released versions) that
fail to handle those.
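To compare implementations on corner cases like these, something along the following lines can help. This is only a sketch: the function name compare_shells and the list of shell names are my own choices, not anything from the bug report, and shells that are not installed are skipped. I make no claim here about what any given shell prints.

```shell
# Hypothetical helper: run one script under every shell from a fixed list
# that happens to be installed, showing combined output and exit status.
compare_shells() {
    script=$1
    for shell in dash bash ksh mksh yash zsh; do
        command -v "$shell" >/dev/null 2>&1 || continue
        printf '== %s ==\n' "$shell"
        # POSIXLY_CORRECT in the environment puts bash into POSIX mode,
        # where aliases are expanded in non-interactive shells. It is
        # assumed to be harmless for the other shells.
        POSIXLY_CORRECT=1 "$shell" -c "$script" 2>&1
        printf '(exit status %s)\n' "$?"
    done
}

# The first corner case from above, passed through a quoted here-document
# so that the original quoting is preserved exactly.
script=$(cat <<'EOF'
alias foo='echo $'
foo((123))
EOF
)
compare_shells "$script"
```

The other examples in this message can be passed to the same helper in the same way.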
This proposed resolution does not leave empty aliases (aliases not
resulting in any token) unspecified. I mentioned them before, because
they are mishandled by at least one shell:
$ dash -c 'alias empty=
empty'
dash: 2: Syntax error: end of file unexpected
I'd be perfectly okay with considering that a bug in dash (I personally
consider it exactly that, and it's easy to fix), but I do not know
whether other shells fail in different situations, for other reasons,
in ways that would require major changes to their implementations of
aliases.
Cheers,
Harald van Dijk