Harald van Dijk <a...@gigawatt.nl> wrote, on 27 Jan 2019:
>
> On 18/01/2019 11:48, Austin Group Bug Tracker wrote:
> >----------------------------------------------------------------------
> >  (0004214) geoffclare (manager) - 2019-01-18 11:48
> >  http://austingroupbugs.net/view.php?id=953#c4214
> >----------------------------------------------------------------------
> >Alternative resolution, based on kre's suggestion on the mailing list.
[...]
> A corner case which should probably remain unspecified is when the resulting
> token partially results from an alias substitution of the same alias name at
> any earlier recursion level:
> 
>   alias echo='ec\'
>   echo
>   ho hello
> 
> Shells vary in how they treat this and I cannot see any value in requiring
> any particular behaviour here: no sensible script is going to rely on this.
> It could be left implicitly unspecified by the current proposed wording, but
> could alternatively be made explicitly unspecified by something along the
> lines of
> 
>   [...]
>   the TOKEN did not fully result from an alias substitution of the same
> alias name at any earlier recursion level, and
>   optionally, the TOKEN did not partially result from an alias substitution
> of the same alias name at any earlier recursion level, and
>   [...]

I think there needs to be something explicit here.  I'd prefer to try
and word it without so much repetition.  Perhaps:

    the TOKEN did not either fully or, optionally, partially result from
    an alias substitution of the same alias name at any earlier recursion
    level, and

> >               If it does not add this <space>, and the last character of
> >the alias value could be part of an operator token, it is unspecified
> >whether the current token is delimited before token recognition is applied
> >to the character (if any) that followed the <b>TOKEN</b> in the input.
> 
> Limiting this to operator tokens will mean, I believe, that in other cases
> (unterminated words, heredocs, comments) shells are required to treat the
> subsequent characters as part of whatever the alias substitution resulted
> in. Is that intentional? Many shells have some corner cases where the end of
> an alias substitution delimits a word:
> 
>   alias foo='echo $'
>   foo((123))

According to the current wording of note 4214 the foo here is not
subject to alias substitution (because of the last condition in the
bullet list, "the TOKEN will be parsed as the command name word of a
simple command ...")

Since foo((123)) is a syntax error as far as the standard grammar is
concerned, shells can either report a syntax error or accept the syntax
as an extension (with unspecified results).

> When this is combined with heredocs, shells again behave differently:
> 
>   alias foo='cat <<EOF
>   $'
>   foo((123))
>   EOF

Again, foo((123)) is a syntax error as far as the standard grammar is
concerned.

> I think the text can be reduced to
> 
>   If it does not add this <space>, it is unspecified whether the current
>   token is delimited [...]
> 
> to let that be unspecified. This covers heredocs as well: they are specified
> to be treated as a single word, so I would say they implicitly have
> unspecified or undefined behaviour if the end of the alias substitution
> causes that word to be delimited. This allows shells to continue treating
> them in whatever way they do now. It does not cover comments, but I am not
> aware of (released versions of) shells that do not handle those.

I believe that within the standard grammar, unless the alias value ends
with quoting or commenting "on", the only way a command word of a simple
command can be delimited by something that can combine with the end of
the alias value to form a token is if the command word is delimited
by &, ;, |, <, or >.  In these cases the resulting token is an
operator (&&, ;;, ||, and various redirections).
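
As a rough illustration of the combinations described above (not part of
the proposed wording), the check can be sketched as a small shell
function.  The name combines_into_operator is hypothetical, and the list
is simply the two-character operator tokens of the standard grammar; it
assumes the alias value does not end with quoting or commenting "on".

```shell
# Hypothetical helper, for illustration only.
# $1: last character of the alias value
# $2: character that followed the TOKEN in the input
# Succeeds if the two characters together spell a two-character
# operator token from the standard shell grammar.
combines_into_operator() {
    case "$1$2" in
        '&&' | ';;' | '||')         return 0 ;;  # AND_IF, DSEMI, OR_IF
        '<<' | '>>' | '<&' | '>&')  return 0 ;;  # DLESS, DGREAT, LESSAND, GREATAND
        '<>' | '>|')                return 0 ;;  # LESSGREAT, CLOBBER
        *)                          return 1 ;;
    esac
}
```

For example, combines_into_operator '&' '&' succeeds, which is the case
of an alias value ending in & where the alias name is immediately
followed by & in the input.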

In the cases where the value ends with quoting or commenting "on", I
thought that all shells that don't add a space behave the same (or if
they don't, they are considered buggy), so by making the behaviour
unspecified only for operator tokens these cases would remain specified.

> This proposed resolution does not leave empty aliases (aliases not resulting
> in any token) unspecified.

It does if the alias is truly empty.  It says that token recognition
resumes "at the first character of the alias value".  Thus it is silent
about the behaviour when there is no first character.  However, you're
right that it should say something about other no-token values.

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
