Re: [1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
On 09/01/2019 12:29, Austin Group Bug Tracker wrote: [...] > -- (0004201) geoffclare (manager) - 2019-01-09 12:29 http://austingroupbugs.net/view.php?id=953#c4201 -- This is a proposed new resolution which addresses comments made since http://austingroupbugs.net/view.php?id=953#c3113 both here and on the mailing list. There have been a lot of comments, so if I missed anything please reply on the mailing list and (if I agree) I will edit this note. [...] On page 2351 line 74901-74904 (XCU 2.5.3 Shell Variables) change:This variable, when and only when an interactive shell is invoked, shall be subjected to parameter expansion (see Section 2.6.2) by the shell and the resulting value shall be used as a pathname of a file containing shell commands to execute in the current environment.to:This variable, when and only when an interactive shell is invoked, shall be subjected to parameter expansion (see Section 2.6.2) by the shell and the resulting value shall be used as a pathname of a file. Before any interactive commands are read, the shell shall tokenize (see [xref to XCU 2.3 Token Recognition]) the contents of the file, parse the tokens as a program (see [xref to XCU 2.10 Shell Grammar]), and execute the resulting commands in the current environment. (In other words, the contents of the ENV file are not parsed as a single compound_list, unlike the contents of a dot script. This distinction matters because it influences when aliases take effect.) This last bit was part of an earlier version, but it no longer fits now that the contents of a dot script are no longer required to be parsed as a compound_list. The rest of the comment still makes perfect sense if you take out the ", unlike the contents of a dot script" bit, I think. Cheers, Harald van Dijk
Re: [1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
Date:Wed, 9 Jan 2019 17:35:10 + From:Stephane Chazelas Message-ID: <20190109173510.xn4hdeqphbffb...@chaz.gmail.com> | I'd rather POSIX forbade applications to use "while", "until", | "do", "select", "time", etc in alias names, or leave it | unspecified whether aliases for those are expanded. A lot of what you say I think, which I believe to be mostly correct, comes down to the issue of for whom the standard is intended. As long as it is expected that the audience is script writers, then the doc should tell them what they can expect will work, and what will not (or may not) so that portable applications can be created. I think the current wording does that, as shells do actuall allow aliases to be created for keywords (in all shells for the ones that are also English words - or similar, like fi etc, and in some shells even for the others (! { ...). I don't think we are in a position to forbid anything, even if we wanted, but I assume you mean "would result in unspecified behaviour" if an application makes an alias for a keyword - I'd have no real problem with that. | but more about not requiring limitations of the original implementation when | they're not justified. If that were the objective, then the audience of the standard would need to be the shell implementors, rather than the script writers. And in that case I assume the objective would be to allow exactly what you wanted to forbid just above (though in your message, the two quoted occurred in the alternate sequence) and instead allow the shell to expand aliases that are keywords, everywhere. I'd have no problem with that either. As long as we're not explicitly covering both audiences (with different text for each when required) we cannot really do both however. Many (most, perhaps even all) shells do not allow an alias to replace an actual keyword (as distinct to a word with the same spelling used elsewhere) so we cannot suggest that it even might be OK. 
Nor can we tell the shells not to expand words that would be keywords when used elsewhere, as users currently have the ability to do that, and we cannot break existing conforming applications. So, rock, meet the hard place... kre ps: the one incorrect (but irrelevant for your points) part of your message was the "alias 'while=until'" ... since "until" is a keyword, that's not an alias that expands to what could be a simple command, and any use of it (in any context) would be unspecified. However you made no use anywhere of that "until", it could have just as easily been "foo", so this is an insignificant issue.
[OT] builtin to eval code with arguments (Was: Alias implementations being invalidated by proposed new wording?)
2019-01-09 18:24:47 +0100, Joerg Schilling:
[...]
> They also had the idea of implementing a shell builtin that behaves like:
>
> sh -c cmd args
>
> and thus could support parameterized macros.
[...]

That can only really be used for "parameterized macros" that could be done as functions.

In POSIX shells, you can write that builtin as a function:

    eval_with_args() {
      eval "shift; $1"
    }
    eval_with_args 'shell code' args

Though more practical would be the lambdas of "es" (based on the Unix variant of "rc"):

    $ es -c '@ {echo $1, $2} a b c'
    a, b

Or:

    $ es -c '@ x y {echo $x, $y(2)} a b c'
    a, c

Or the anonymous functions of zsh:

    $ zsh -c '(){echo $1, $2} a b c'
    a, b

But there would be little point in declaring aliases for that. You'd define normal "named" functions instead.

-- 
Stephane
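As a rough check of the function form mentioned above, the following sketch runs eval_with_args with three arguments. The variable name out is only there for illustration, and nothing beyond plain POSIX sh features is assumed.

```shell
# Runnable form of the eval_with_args function from the message above.
eval_with_args() {
    # "$1" (the code) is expanded into the eval string first; the leading
    # "shift" then drops it, so the code sees the remaining arguments
    # as $1, $2, ...
    eval "shift; $1"
}

# The code string is single-quoted so that $1 and $2 are expanded by
# eval inside the function, not by the caller.
out=$(eval_with_args 'echo "$1, $2"' a b c)
echo "$out"
```

This prints "a, b", matching the es and zsh one-liners, and shows that anything expressible this way could equally be written as an ordinary named function.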
Re: [1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
One concern I have is that if I understand correctly, it *allows* applications to do:

    alias 'while=until'

(though doesn't for other keywords like "{", "!") and then *requires* implementations to expand "while" in

    alias 'echo_expand=echo '
    echo_expand while

and *requires* implementations *not* to expand "while" in

    while true; do ...; done

Which prevents implementations from doing the kind of alias expansion done by csh or zsh (more useful IMO, as it is then similar to what the C preprocessor macro expansion does and was, I believe, the original intention for aliases; that can be useful for all sorts of code instrumentation, though quite limited without parameterized aliases).

Also, again, ksh88 (and so the POSIX sh of most commercial Unices) does allow "select" (a keyword, as allowed by POSIX) to be aliased.

Currently, in POSIX mode, zsh doesn't do alias expansion for keywords, including in the echo_expand case. It's not about zsh, I'm sure zsh will align to whatever POSIX requires for its POSIX mode, but more about not requiring limitations of the original implementation when they're not justified.

I'd rather POSIX forbade applications to use "while", "until", "do", "select", "time", etc in alias names, or leave it unspecified whether aliases for those are expanded.

-- 
Stephane
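For readers who want to try the trailing-blank rule behind echo_expand without involving a reserved word, here is a minimal sketch. The aliases greeting and plain_echo are hypothetical names introduced only for the comparison, and the bash-specific shopt line assumes the code is run as a script rather than typed interactively. It deliberately avoids aliasing "while", since that is the case whose behaviour is in dispute.

```shell
# bash only expands aliases in non-interactive shells when asked to;
# other sh implementations generally do so by default.
if [ -n "${BASH_VERSION-}" ]; then shopt -s expand_aliases; fi

alias echo_expand='echo '   # value ends in an unquoted blank
alias plain_echo='echo'     # hypothetical: same command, no trailing blank
alias greeting='hello'      # hypothetical alias used as the "next word"

# eval defers parsing until after the alias definitions have executed,
# so the result does not depend on how much input the shell reads ahead.
with_blank=$(eval 'echo_expand greeting')
without_blank=$(eval 'plain_echo greeting')

echo "with trailing blank:    $with_blank"
echo "without trailing blank: $without_blank"
```

In shells that follow the trailing-blank rule, the first line should show "hello" (the next word was checked for alias substitution) and the second "greeting" (it was not).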
Re: Alias implementations being invalidated by proposed new wording?
Robert Elz wrote: > | Then in 1980, former AT people that created the company "Charles River > Data > | Systems" and the first UNIX clone "UNOS" created an alias implementation > | concept that sits in the lexer and expands text. This is the most powerful > | alias concept that has been implemented for expansion in the lexer. > > That's interesting - I had the misfortune to use unos for a (short) > while, and don't recall ever knowing of that. They also had the idea of implementing a shell builtin that behaves like: sh -c cmd args and thus could support parameterized macros. > | For this reason, it it natural not to implement a special meaning for "\ > ". > | ksh88 and ksh93 seem to be the only shells that implement a special > meaning for > | "\ " here. > > Which special meaning do you refer to there? Not to expand further aliases. Jörg -- EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'
[1003.1(2016)/Issue7+TC2 0001224]: Conflict between 2.9.1 and 2.10.2 re simple command terminator
The following issue has been SUBMITTED. == http://austingroupbugs.net/view.php?id=1224 == Reported By:geoffclare Assigned To: == Project:1003.1(2016)/Issue7+TC2 Issue ID: 1224 Category: Shell and Utilities Type: Error Severity: Editorial Priority: normal Status: New Name: Geoff Clare Organization: The Open Group User Reference: Section:2.9.1 Page Number:2365 Line Number:75483 Interp Status: --- Final Accepted Text: == Date Submitted: 2019-01-09 15:37 UTC Last Modified: 2019-01-09 15:37 UTC == Summary:Conflict between 2.9.1 and 2.10.2 re simple command terminator Description: Section 2.9.1 says: A ``simple command'' is a sequence of optional variable assignments and redirections, in any sequence, optionally followed by words and redirections, terminated by a control operator. This suggests that a simple command includes the terminating control operator (in the same way that a line includes the terminating ), but this conflicts with the grammar in 2.10.2 where the simple_command production does not include the terminator. Since the grammar has precedence over the text syntax description, the erroneous text "terminated by a control operator" can be removed from 2.9.1 as an editorial change without affecting the requirements of the standard. Desired Action: Delete ", terminated by a control operator". == Issue History Date ModifiedUsername FieldChange == 2019-01-09 15:37 geoffclare New Issue 2019-01-09 15:37 geoffclare Name => Geoff Clare 2019-01-09 15:37 geoffclare Organization => The Open Group 2019-01-09 15:37 geoffclare Section => 2.9.1 2019-01-09 15:37 geoffclare Page Number => 2365 2019-01-09 15:37 geoffclare Line Number => 75483 2019-01-09 15:37 geoffclare Interp Status => --- ==
Re: Alias implementations being invalidated by proposed new wording?
On 1/7/19 6:55 AM, Joerg Schilling wrote: > The way I have the teleconference in mind where we set up the new text, the > above commands causes undefined results because the shell is _allowed_ but > not > required to parse scripts as a whole under some conditions. I think Geoff's proposed resolution from today does that. The original and revised proposals for bug 953 did not. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
Robert Elz wrote, on 09 Jan 2019: > > There are just a couple of minor points that I have with your > wording, one where I think a little more clarity is needed, and > one where your wording isn't quite correct. > > > | ... change to: > > | After a TOKEN has been delimited, > > This is where I think a little extra clarity would help, and > I'd change that to be > > After a token of type TOKEN [xref XCU 2.10.1] has > been delimited, > > just to make it clear that TOKEN is a specific type of token, > and not just a weird typographical convention (it helps readers > interpret the meaning more easily). The sentence before this (at the end of 2.3) is "Once a token is delimited, it is categorized as required by the grammar in [xref to 2.10]", so I'd like to go with: After a token has been categorized as type TOKEN (see [xref to 2.10.1]), > | If the value of the alias replacing the TOKEN ends in a that would > | be unquoted after substitution, and optionally if it ends in a > that > | would be quoted after substitution, the shell shall check the next TOKEN > in > | the input for alias substitution; > > This is where the wording is incorrect, it is not the next TOKEN, which > would imply simply skipping intermediate operators, etc, but the next > token, if and only if, it is a TOKEN, that it is considered for alias > substitution. [...] > > So I would change > shall check the next TOKEN in the input > into > shall check the next token in the input, if it is a TOKEN, Good catch - I'll make that change. I've also just noticed that 2.10.1 and 2.10.2 have TOKEN in bold everywhere, so I suppose I should do the same. -- Geoff Clare The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: Alias implementations being invalidated by proposed new wording?
Date:Wed, 9 Jan 2019 13:27:58 + From:"Schwarz, Konrad" Message-ID: | I think it would reduce confusion if it were explicitly mandated. That is not what this group does, and not what any standards group should do - the objective is to work out what is the accepted standard, and document it, so that others can rely upon that (both to use, and to produce new implementations that will satisfy users' expectations). kre
Re: [1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
Date:Wed, 9 Jan 2019 12:29:45 + From:Austin Group Bug Tracker Message-ID: <95df9c99cbc201dbbf9de3d53079d...@austingroupbugs.net> | please reply on the | mailing list and (if I agree) I will edit this note. I wish the part of all of this that really belongs in the resolution of issue 1055 had been left for that one rather than all included here, and then, one assumes also all included there - as that issue covers more than aliases, yet has the exact same issues. That said, I don't disagree with the proposed resolution of any of that issue in your new wording as it affects aliases, it just ought all be worded more generally so it applies to everything. There are just a couple of minor points that I have with your wording, one where I think a little more clarity is needed, and one where your wording isn't quite correct. | ... change to: | After a TOKEN has been delimited, This is where I think a little extra clarity would help, and I'd change that to be After a token of type TOKEN [xref XCU 2.10.1] has been delimited, just to make it clear that TOKEN is a specific type of token, and not just a weird typographical convention (it helps readers interpret the meaning more easily). | If the value of the alias replacing the TOKEN ends in a that would | be unquoted after substitution, and optionally if it ends in a that | would be quoted after substitution, the shell shall check the next TOKEN in | the input for alias substitution; This is where the wording is incorrect, it is not the next TOKEN, which would imply simply skipping intermediate operators, etc, but the next token, if and only if, it is a TOKEN, that it is considered for alias substitution. 
There is exactly one chance for this particular alias lookup, blink and you miss it - the very next token needs to be a valid alias, otherwise the whole thing stops. Only whitespace that produces no tokens can intervene (in the original input) between the TOKEN that was the alias with a value ending with a blank, and the prospective new alias TOKEN. Even a comment appearing between them ends it, not because of the comment, which the lexer just drops, but because a comment only ends when a newline is seen, and that's an operator token, and so cannot be aliased. (Of course, usually then the next word would be looked up as an alias anyway, as the word in a command word position, but that was not because of the trailing blank.) So I would change shall check the next TOKEN in the input into shall check the next token in the input, if it is a TOKEN, kre
Re: Alias implementations being invalidated by proposed new wording?
On 1/9/19 8:27 AM, Schwarz, Konrad wrote: >> -Original Message- >> Expressly making it defined that >> alias foo='whatever \ ' >> which does end in a space (but otherwise is the exact same thing as the >> previous one) also does not expand aliases in the following >> word >> seems redundant to me. Since several shells (but not all) do expand >> aliases in this case, it seems to me the best thing to do is to leave this >> as unspecified, such that no-one sane will ever use it (if >> something is needed, just use the previous form -- but better is not to use >> aliases at all.) > > Coming from ksh, I've always understood the alias mechanism to work at the > lexical level (macro expansion with rescanning); the quoting behavior above > is the most natural in that context. > > I think it would reduce confusion if it were explicitly mandated. Regardless, the fact that existing shells do it differently is reason enough to not mandate a particular behavior. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Alias implementations being invalidated by proposed new wording?
Date:Wed, 9 Jan 2019 13:55:16 +0100 From:Joerg Schilling Message-ID: <5c35ef34.clu1godeocvhzivr%joerg.schill...@fokus.fraunhofer.de> | Well, the original Bourne Shell did not have aliases. Yes, I know that... | I believe that csh introduced an alias concept in 1979 that works completely | different from what ksh implemented much later. Yes, I think the ashell version was rather simpler, but certainly the csh aliases were quite different. | The difference, I believe is that alias expansion happens at a different | location in csh. Yes, it has the whole command line of the command containing the alias available, and can use parts of that command (or of previous commands, using the history mechanisms) as part of the generated expansion. | Then in 1980, former AT people that created the company "Charles River Data | Systems" and the first UNIX clone "UNOS" created an alias implementation | concept that sits in the lexer and expands text. This is the most powerful | alias concept that has been implemented for expansion in the lexer. That's interesting - I had the misfortune to use unos for a (short) while, and don't recall ever knowing of that. | For this reason, it it natural not to implement a special meaning for "\ ". | ksh88 and ksh93 seem to be the only shells that implement a special meaning for | "\ " here. Which special meaning do you refer to there? | > the replacement test would be, effectively | > | > "ls -cF" . | > | > (the quotes would not be there, but the ls -CF | > part would be a single 6 character word) and that | > would be the command word of the generated command. | | This does not happen as the lexer is called again and creates two word tokens | from the alias replacement. Yes, I know that, but that isn't what the current published standard says (which is why bug report 953 was filed I assume, what the standard said about aliases was nothing like reality.) 
| > In the example in question here, the original text is
| >
| > 3>&1 command
| >
| > and we have "alias 3=4"

| "3" is not a word that is in a position of a potential command name.

As far as the lexer is concerned it is.

| If the lexer did parse the input in a way that does not connect "3" to the IO
| redirection,

It is connected, in the sense that it is an IO_NUMBER, but that is a word.

| it would be alias expanded,

Did you mean "not" there?

| since the knowledge about "a word at a
| position suitable for a command name" is not in the lexer.

Yes, it is, it has to be to implement aliasing the way the standard requires (assuming that aliasing is done in the lexer, which it almost always is). It might not be available in your shell, but it generally is in others. The grammar (well, the parser, using the grammar) makes it known when it is fetching a token which could be the command name of a simple command (then the lexer uses that info to decide whether to do an alias lookup - usually the lexer also does keyword lookups as well, and returns different token names for the different keywords, but that's just an implementation choice).

None of this matters as long as Geoff's recent new proposed wording as the resolution of 953 is (mostly) accepted, as all of these issues are cleaned up - it is no longer "word" that is expanded, but TOKEN, which is a subset of word that does not include the IO_NUMBER.

kre
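The earlier point in this exchange, that the lexer is called again on the alias value and produces separate word tokens rather than one multi-character word, can be observed with a small sketch. The alias name show_args is hypothetical, and the shopt line is an assumption about running under bash as a script.

```shell
# bash only expands aliases in non-interactive shells when asked to.
if [ -n "${BASH_VERSION-}" ]; then shopt -s expand_aliases; fi

# Hypothetical alias whose value contains several words.
alias show_args="printf '[%s]' first second"

# eval defers parsing until the alias definition has been executed.
out=$(eval 'show_args third')
echo "$out"
```

If the value were substituted as a single word, the shell would look for a command literally named "printf '[%s]' first second". Because the value is tokenized again, printf receives three separate operands and the output is "[first][second][third]".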
RE: Alias implementations being invalidated by proposed new wording?
> -Original Message- > Expressly making it defined that > alias foo='whatever \ ' > which does end in a space (but otherwise is the exact same thing as the > previous one) also does not expand aliases in the following > word > seems redundant to me. Since several shells (but not all) do expand > aliases in this case, it seems to me the best thing to do is to leave this as > unspecified, such that no-one sane will ever use it (if > something is needed, just use the previous form -- but better is not to use > aliases at all.) Coming from ksh, I've always understood the alias mechanism to work at the lexical level (macro expansion with rescanning); the quoting behavior above is the most natural in that context. I think it would reduce confusion if it were explicitly mandated.
Re: Alias implementations being invalidated by proposed new wording?
Robert Elz wrote: > Date:Tue, 8 Jan 2019 23:01:05 + > From:Stephane Chazelas > Message-ID: <20190108230105.43xiupnfx4qwy...@chaz.gmail.com> > > | aliases come from csh which did not do that expansion after > | trailing blank thing. > > Actually from ashell, the precursor of csh - that I knew, though csh > aliases were quite a different thing, closer in some respects to sh > functions than sh aliases (but like much of csh, it really all was a > bit of a mess.) Thank you for this hint, so aliases are from late 1977 already. Jörg -- EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/
Re: Alias implementations being invalidated by proposed new wording?
Robert Elz wrote: > Date:Tue, 8 Jan 2019 16:51:04 + > From:Geoff Clare > Message-ID: <20190108165104.GA31969@lt2.masqnet> > > > | Given Chet's reply, it looks like there may be more shells that do expand > | than don't. In which case I wonder why that "unquoted" text got added > | in 2016. > > I don't know the history of aliases in sh (Joerg?) it may be that they Well, the original Bourne Shell did not have aliases. I believe that csh introduced an alias concept in 1979 that works completely different from what ksh implemented much later. The difference, I believe is that alias expansion happens at a different location in csh. Then in 1980, former AT people that created the company "Charles River Data Systems" and the first UNIX clone "UNOS" created an alias implementation concept that sits in the lexer and expands text. This is the most powerful alias concept that has been implemented for expansion in the lexer. I reimplemented that concept in 1984 and added it to my (non Bourne based) shell from that time. ksh88 came up with a similar concept in the lexer. What in use at that time, but what ksh88 invented in this context is: - automated termination of alias expansion for a specific alias if this alias already has been expanded from the same source "word". - Aliases that end in a space do not switch off alias expansion for further words in the command. In Summer 2012, I added my alias implementation from 1984 to the Bourne Shell and I did this without first looking at the ksh source. Then I discovered that ksh has the two additional features mentioned above and implemented them. After that, I did some testing and did see that my alias implementation is more powerful than the one in ksh but compatible to what ksh does - except for the "\ " problem that I have not been aware of. 
For a descrtiption see: http://schillix.sourceforge.net/man/man1/bosh.1.html Section "Aliases" that currently start at page 7, the "alias" command currently starting at page 36 and the "unalias" command currently starting at page 68. Given that my implementation has been done without looking at ksh, it seems that what bosh and ksh do very similar is the "natural behavor" of such an implementation. > originally appeared back when any quoted character "looked different" > internally to the same character, unquoted, and that the test was just > ch == ' ' > and not > (ch & ~SQUOTE) == ' ' > > so only unquoted spaces worked.. And then that behaviour was retained > in derived shells, even after the quoting encoding method was altered, > and that those shells were the ones mostly considered when the text was > written. In former times, this was the case. Now the lexer uses a mix of wide characters (where a quoted char is still a char with the top bit on) and multi byte chars (where "\ " is propagated the way it was read). Since the location in the lexer that deals with aliases does not deal with single characters (except for the peek()ed char that was seen after the word that is going to be alias expanded), but rather uses words in strings, this is a place that uses "multi byte chars". For this reason, it it natural not to implement a special meaning for "\ ". ksh88 and ksh93 seem to be the only shells that implement a special meaning for "\ " here. > Treated literally the quoted words above would mean that if we have > > alias foo=bar > > and the input is > > foo 1 2 3 > > then the "foo" being in a command word position, and also a defined alias > would simply be replaced by the word "bar" and we'd be done. > > But that would mean that in the example in the alias page in XCU 4, > where > alias lf='ls -CF' > if the input is > > lf . > > the replacement test would be, effectively > > "ls -cF" . 
> > (the quotes would not be there, but the ls -CF > part would be a single 6 character word) and that > would be the command word of the generated command. This does not happen as the lexer is called again and creates two word tokens from the alias replacement. > In the example in question here, the original text is > > 3>&1 command > > and we have "alias 3=4" > > "3" is a valid alias name (alias names are not required to start > with an alpha) and when the tokeniser is run there, we are starting > in the state where we are at the command word position (you have > to assume this here from the context, but take it as a n axiom > for this example). There the first token produced by the lexer > is the IO_NUMBER "3", which is a word according to XBD 3.446, > and thus, according to the (current and proposed) spec for alias > processing, should be subject to alias replacement. "3" is not a word that is in a position of a potential command name. If the lexer did parse the input in a way that does not connect "3" to the IO redirection, it would be alias expanded, since the knowledge about
[1003.1(2013)/Issue7+TC1 0000953]: Alias expansion is under-specified
A NOTE has been added to this issue. == http://austingroupbugs.net/view.php?id=953 == Reported By:wpollock Assigned To:ajosey == Project:1003.1(2013)/Issue7+TC1 Issue ID: 953 Category: Shell and Utilities Type: Clarification Requested Severity: Objection Priority: normal Status: Interpretation Required Name: Wayne Pollock Organization: User Reference: Section:2.3.1 Alias Substitution Page Number:2322 Line Number:73690-73705 Interp Status: Pending Final Accepted Text:See http://austingroupbugs.net/view.php?id=953#c3113 == Date Submitted: 2015-06-04 00:22 UTC Last Modified: 2019-01-09 12:29 UTC == Summary:Alias expansion is under-specified == Relationships ID Summary -- related to 736 grammatically accept zero or more Shell... related to 0001048 deprecate alias and unalias related to 0001055 unspecified how much is parsed before e... == -- (0004201) geoffclare (manager) - 2019-01-09 12:29 http://austingroupbugs.net/view.php?id=953#c4201 -- This is a proposed new resolution which addresses comments made since http://austingroupbugs.net/view.php?id=953#c3113 both here and on the mailing list. There have been a lot of comments, so if I missed anything please reply on the mailing list and (if I agree) I will edit this note. All page and line numbers are for the 2016 and 2018 editions. On page 2348 line 74794-74805 (XCU 2.3.1 Alias Substitution), change:After a token has been delimited, but before applying the grammatical rules in Section 2.10, a resulting word that is identified to be the command name word of a simple command shall be examined to determine whether it is an unquoted, valid alias name. However, reserved words in correct grammatical context shall not be candidates for alias substitution. A valid alias name (see XBD Section 3.10) shall be one that has been defined by the alias utility and not subsequently undefined using unalias. Implementations also may provide predefined valid aliases that are in effect when the shell is invoked. 
To prevent infinite loops in recursive aliasing, if the shell is not currently processing an alias of the same name, the word shall be replaced by the value of the alias; otherwise, it shall not be replaced. If the value of the alias replacing the word ends in a , the shell shall check the next command word for alias substitution; this process shall continue until a word is found that is not a valid alias or an alias value does not end in a .to:After a TOKEN has been delimited, including (recursively) any token resulting from an alias substitution, the TOKEN shall be subject to alias substitution if: the TOKEN does not contain any quoting characters, the TOKEN is a valid alias name (see XBD Section 3.10), an alias with that name is in effect, the TOKEN did not result from an alias substitition of the same alias name at any earlier recursion level, the TOKEN is not recognized as a reserved word (see [xref to 2.4 Reserved Words] and the examples in [xref to XRAT C.2.3.1]), and the TOKEN will be parsed as the command name word of a simple command when the grammatical rules in Section 2.10 are applied. An implementation may defer the effect of a change to an alias but the change shall take effect no later than the completion of the currently executing complete_command (see [xref to XCU 2.10 Shell Grammar]). Changes to aliases shall not take effect out of order. Implementations may provide predefined aliases that are in effect when the shell is invoked. If the value of the alias is not a simple command (see [xref to 2.9.1]), or contains any of: a comment a variable assignment a redirection unbalanced single-quotes or double-quotes (except within a command substitution), the behavior is unspecified. When a TOKEN is subject to alias substitution, the value of the alias shall be processed to form tokens (see [xref to 2.3]) and the resulting tokens shall replace the TOKEN. If the value of the alias
Re: Alias implementations being invalidated by proposed new wording?
Date:Wed, 9 Jan 2019 10:44:00 + (UTC) From:Shware Systems Message-ID: <886936614.8618146.1547030640...@mail.yahoo.com> | Alias bodies may include entire or partial compound statements, expansions, | redirections, and unclosed strings of the <">, <'>, or <$'> sort, that depend on | or can be modified by context after the alias when that is appended to the alias body; They can (they can contain anything) - though I find it hard to imagine how what comes after the alias can modify anything that the alias generated (unless you mean token combination - the turning of '&' that ends an alias into '&&' when the delimiter character of the alias had been another '&'. But... | the addition covers the latter cases too, not <\ > only. no, it doesn't, as all those other cases have been made explicitly unspecified, and it would be bizarre for the standard to specify what should happen in a case where the results are already unspecified. This change was made in the proposed resolution of issue 953 almost 3 years ago now, and is one part of that proposed resolution which I do not believe that anyone disputes. | Your example requires a second <'> follow the alias name in all scripts | using it in that following context, because it introduces an unclosed string. No it didn't, you did not look carefully enough. I am only giving examples of uses which will remain specified in the new text - to do otherwise would be foolish (even for me). | There is no closing quote presumed at the end of an alias body, No, but there was one explicitly there. | and no implementation precludes a <;> that effectively terminates the | command the alias represents. Sorry, I have no idea what that means, or how it is relevant to the current discussion - perhaps you could give an example? | It isn't a question of whether it's sane or not to use aliases this way, it's | what is actually permitted by implementations No, what matters (here) is what is specified to work by the standard. 
Anything that results in unspecified behaviour we can leave for the implementations to work out for themselves.

| when the text of the recursively expanded alias body has the
| following text appended to it, whether the following text comes from
| lookahead tokens or the original source line.

Again, I am lost trying to determine the relevance of that part.

And this is from your included copy of my message (with most of it removed)

| On Wednesday, January 9, 2019 Robert Elz wrote:
| next word be subject to alias expansion, all that is needed is to define it
| like
|     alias foo="whatever ' '"
| and then the last character is not a space (it is a single quote)

I know it is hard to read, but the following text actually said what was there, the alias value is the word "whatever" followed by an unquoted space, followed by the single quoted string ' ' (with both opening and closing quotes present). That's what causes "the last character is not a space" as stated ... it is a single quote.

If that terminating quote had not been there, and the alias value had been just

    whatever '

(there is a space after that single quote, but nothing else) then it would not be possible to correctly parse that as a simple command, as it contains an unterminated string (ie: syntax error), and consequently, the behaviour would be unspecified. This allows implementations to handle it however they like - which is what users who attempt this kind of thing need to deal with, as different shells process this kind of thing differently.

If you haven't done so recently, you should go and review the proposed resolution of issue 953, so you understand the constraints.

kre
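The trailing-character rule described in this message can be tried directly. What follows is only a minimal sketch, assuming that `sh` on the PATH is a POSIX-style shell that expands aliases when reading commands from standard input (bash does this in POSIX mode, for example when invoked as `sh`). The alias names `word`, `plain` and `quoted` are made up for illustration and do not come from the thread.

```sh
# Illustrative only: alias names are hypothetical, and `sh` is assumed
# to be a POSIX-style shell that expands aliases non-interactively.
out=$(sh <<'EOF'
alias word='EXPANDED'
# Value ends in an unquoted blank: the next word is also checked for aliases.
alias plain='echo '
# Value ends in the closing quote of a complete quoted string,
# so its last character is not a blank.
alias quoted="echo ' '"
plain word
quoted word
EOF
)
printf '%s\n' "$out"
```

In a shell that follows the rule as described above, the first command should print `EXPANDED`, and the second should print `word` unexpanded, preceded by two spaces (the quoted space argument plus the separator that `echo` inserts).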
Re: Alias implementations being invalidated by proposed new wording?
Date:Wed, 9 Jan 2019 10:05:21 + From:Geoff Clare Message-ID: <20190109100521.GC690@lt2.masqnet>

| It's not obvious to me. Alias lookup has already been done for the
| word in that position in the input and there is nothing to suggest
| the shell has to go back and repeat it for the replacement word.

There is also nothing to suggest that it should not - the lexer has very little world view, all it has is its input stream of characters, and some idea whether or not it is at a command word position - that it has previously looked up an alias for the current position is not something it will necessarily know, after all, that word has now been deleted.

When it tokenises the value of the alias, nothing has changed in the state of the world, other than that we are "processing an alias" for the word that was there before (in the old way of expressing it). The only difference to what was done the previous time, is that we no longer expand that particular alias (or any others we also happen to be currently processing.)

If the wording in the standard isn't making this clear, then it needs fixing so that it does, as we are after all, documenting what shells actually do, and this is something that they all do.

kre
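The two behaviours described here (the first word of an alias value is looked up again, but an alias that is currently being processed is not expanded a second time) can be observed with a short script. This is a sketch under the same assumption as any such test: `sh` must be a POSIX-style shell that expands aliases when reading from standard input. The alias names `greet` and `hi` are hypothetical.

```sh
# Illustrative only: `greet` and `hi` are made-up names; `sh` is assumed
# to be a POSIX-style shell that expands aliases non-interactively.
out=$(sh <<'EOF'
alias greet='echo hello'
# The first word of this value is itself an alias name in command
# position, so it is looked up in turn.
alias hi='greet'
hi world
# An alias that is already being processed is not expanded again,
# so a value that starts with its own name does not loop.
alias echo='echo prefix:'
echo again
EOF
)
printf '%s\n' "$out"
```

If the shell behaves as the message says all shells do, this should print `hello world` followed by `prefix: again`.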
Re: Alias implementations being invalidated by proposed new wording?
Robert Elz wrote, on 09 Jan 2019:
>
> | This surprised me. I was previously unaware that the first word in
> | the alias value is subject to recursive alias expansion. There is
> | nothing in the standard to suggest this happens!
>
> There certainly isn't in the current (published) text, which is part of
> what is wrong with it (but really that's just a part..)
[...]
> The proposed new wording from 953 does not have this problem,
> as it is clear that the alias value is subject to tokenisation, and
> when that happens, the first token (which because of the restrictions
> we're placing on the value of the alias) must in a conformant script
> be a word, will still be in the "command name position" (we are
> yet to return anything to the grammar which could change that) and
> so is "obviously" subject to alias lookup.

It's not obvious to me. Alias lookup has already been done for the word in that position in the input and there is nothing to suggest the shell has to go back and repeat it for the replacement word.

> | If this is because IO_NUMBER is not expanded, this change would make no
> | difference to the behaviour required by the standard (because we're saying
> | the behaviour is unspecified if the alias value contains a redirection).
>
> No, the IO_NUMBER of concern is not in the alias value (string), it is in the
> original text.

Okay, thanks for clarifying.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: Alias implementations being invalidated by proposed new wording?
Chet Ramey wrote, on 08 Jan 2019:
>
> On 1/8/19 11:51 AM, Geoff Clare wrote:
> > Robert Elz wrote, on 08 Jan 2019:
> >>
> >> | I would prefer that we not leave it unspecified when an alias ends with "\ ".
> >> | If there is a shell which does recursive alias expansion in this case, we
> >> | should ask the authors/maintainers whether they are willing to change it
> >> | to behave like other shells.
> >>
> >> I don't know of any, but I haven't tested them either ...
> >
> > Given Chet's reply, it looks like there may be more shells that do expand
> > than don't. In which case I wonder why that "unquoted" text got added
> > in 2016.
>
> Based on the comments in issue 953, it happened on a phone conference,
> so there's likely no record unless the etherpad still happens to exist.

The relevant etherpad does still exist, but as far as I can see doesn't provide an answer to why this was added. It has a list of issues to address, which includes:

* DONE should we add "unquoted" to "ends in a <blank>" so that it becomes "ends in an unquoted <blank> (after substitution)"?

The "DONE" shows that it was discussed and a decision made, but the etherpad doesn't record the reasons for the decision.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
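Since the thread indicates that shells differ on whether a quoted trailing blank ("\ ") triggers an alias check on the next word, the simplest way to find out what a given shell does is to probe it. The sketch below deliberately does not claim either outcome as correct. It assumes `sh` is a POSIX-style shell that expands aliases when reading from standard input, and the names `word`, `probe` and `verdict` are hypothetical.

```sh
# Illustrative probe: names are made up, and the result depends on the shell.
out=$(sh <<'EOF'
alias word='EXPANDED'
# The value ends in a blank that is quoted by the preceding backslash.
alias probe='echo \ '
probe word
EOF
)
# Classify what the shell under test did with the word after the alias.
case $out in
  *EXPANDED) verdict='next word was alias-expanded' ;;
  *word)     verdict='next word was left alone' ;;
  *)         verdict='unexpected' ;;
esac
printf '%s\n' "$verdict"
```

Running the same probe with different shells substituted for `sh` gives a quick survey of current practice, which is the information the discussion above says was missing.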
Re: Alias implementations being invalidated by proposed new wording?
Shware Systems wrote, on 09 Jan 2019:
>
> On Tuesday, January 8, 2019 Robert Elz wrote:
>
>> ps: (and this bit might be relevant to the discussions) - it is really
>> hard to imagine a use for an alias with a definition that ends "\ "
>> (the only way to get a quoted space as the final char in what is
>> to be the specified cases) so in practice I don't think it matters
>> at all what decision is made about that one.
>
> Yes, the uses that were discussed are corner cases, but the consensus
> was, and pretty strongly, not having it would lead to data loss
> with some operators so the change was added. I don't remember which
> operators were problematic at this point. This affects built-in
> aliases more than ones defined in a script, as an end user may invoke
> these without realizing it if the alias name is the same as a common
> utility.

This makes no sense. The meeting decided to make the behaviour unspecified if an alias value contains an operator, so there would be no reason to require a feature that is only useful when that is the case.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England