Harald van Dijk <a...@gigawatt.nl> wrote, on 27 Jan 2019:
>
> On 18/01/2019 11:48, Austin Group Bug Tracker wrote:
> >----------------------------------------------------------------------
> > (0004214) geoffclare (manager) - 2019-01-18 11:48
> > http://austingroupbugs.net/view.php?id=953#c4214
> >----------------------------------------------------------------------
> >Alternative resolution, based on kre's suggestion on the mailing list.

[...]

> A corner case which should probably remain unspecified is when the resulting
> token partially results from an alias substitution of the same alias name at
> any earlier recursion level:
>
>     alias echo='ec\'
>     echo
>     ho hello
>
> Shells vary in how they treat this and I cannot see any value in requiring
> any particular behaviour here: no sensible script is going to rely on this.
> It could be left implicitly unspecified by the current proposed wording, but
> could alternatively be made explicitly unspecified by something along the
> lines of
>
>     [...]
>     the TOKEN did not fully result from an alias substitution of the same
>     alias name at any earlier recursion level, and
>     optionally, the TOKEN did not partially result from an alias substitution
>     of the same alias name at any earlier recursion level, and
>     [...]

I think there needs to be something explicit here. I'd prefer to try
and word it without so much repetition. Perhaps:

    the TOKEN did not either fully or, optionally, partially result
    from an alias substitution of the same alias name at any earlier
    recursion level, and

> > If it does not add this <space>, and the last character of
> >the alias value could be part of an operator token, it is unspecified
> >whether the current token is delimited before token recognition is applied
> >to the character (if any) that followed the <b>TOKEN</b> in the input.
>
> Limiting this to operator tokens will mean, I believe, that in other cases
> (unterminated words, heredocs, comments) shells are required to treat the
> subsequent characters as part of whatever the alias substitution resulted
> in. Is that intentional? Many shells have some corner cases where the end of
> an alias substitution delimits a word:
>
>     alias foo='echo $'
>     foo((123))

According to the current wording of note 4214 the foo here is not
subject to alias substitution (because of the last condition in the
bullet list, "the TOKEN will be parsed as the command name word of a
simple command ...")

Since foo((123)) is a syntax error as far as the standard grammar is
concerned, shells can either report a syntax error or accept the
syntax as an extension (with unspecified results).

> When this is combined with heredocs, shells again behave differently:
>
>     alias foo='cat <<EOF
>     $'
>     foo((123))
>     EOF

Again, foo((123)) is a syntax error as far as the standard grammar is
concerned.

> I think the text can be reduced to
>
>     If it does not add this <space>, it is unspecified whether the current
>     token is delimited [...]
>
> to let that be unspecified. This covers heredocs as well: they are specified
> to be treated as a single word, so I would say they implicitly have
> unspecified or undefined behaviour if the end of the alias substitution
> causes that word to be delimited.
> This allows shells to continue treating
> them in whatever way they do now. It does not cover comments, but I am not
> aware of (released versions of) shells that do not handle those.

I believe that within the standard grammar, unless the alias value ends
with quoting or commenting "on", the only way a command word of a
simple command can be delimited by something that can combine with the
end of the alias value to form a token, is if the command word is
delimited by &, ;, |, <, or >. In these cases the resulting token is
an operator (&&, ;;, ||, and various redirections).

In the cases where the value ends with quoting or commenting "on",
I thought that all shells that don't add a space behave the same
(or if they don't, they are considered buggy), so by making the
behaviour unspecified only for operator tokens these cases would
remain specified.

> This proposed resolution does not leave empty aliases (aliases not resulting
> in any token) unspecified.

It does if the alias is truly empty. It says that token recognition
resumes "at the first character of the alias value". Thus it is
silent about the behaviour when there is no first character.

However, you're right that it should say something about other
no-token values.

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England