[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
The following issue has been set as DUPLICATE OF issue 0001085.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: Resolved
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
Resolution: Duplicate
Duplicate: 0
Fixed in Version:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2018-05-10 15:52 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
Relationships  ID       Summary
--
duplicate of   0001085  "token shall be from the current p...
related to     0001100  Rewrite of Section 2.10 Shell Grammar, ...
==
Issue History
Date Modified     Username        Field               Change
==
2016-10-12 08:56  Mark_Galeck     New Issue
2016-10-12 08:56  Mark_Galeck     Name                => Mark Galeck
2016-10-12 08:56  Mark_Galeck     Section             => 2.3 Token Recognition
2016-10-12 08:56  Mark_Galeck     Page Number         => 2347-2348
2016-10-12 08:56  Mark_Galeck     Line Number         => 74761-74780
2016-10-12 23:29  shware_systems  Note Added: 0003408
2016-10-13 01:44  Mark_Galeck     Note Added: 0003409
2016-10-14 22:16  shware_systems  Note Added: 0003416
2016-10-15 01:31  Mark_Galeck     Note Added: 0003417
2016-10-25 13:04  shware_systems  Note Added: 0003456
2016-10-28 08:20  geoffclare      Relationship added  related to 0001100
2018-05-10 15:52  geoffclare      Interp Status       => ---
2018-05-10 15:52  geoffclare      Note Added: 0004028
2018-05-10 15:52  geoffclare      Status              New => Resolved
2018-05-10 15:52  geoffclare      Resolution          Open => Duplicate
2018-05-10 15:52  geoffclare      Relationship added  duplicate of 0001085
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
The following issue has been RESOLVED.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: Resolved
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
Resolution: Duplicate
Duplicate: 0
Fixed in Version:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2018-05-10 15:52 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
Relationships  ID       Summary
--
related to     0001100  Rewrite of Section 2.10 Shell Grammar, ...
==
--
(0004028) geoffclare (manager) - 2018-05-10 15:52
http://austingroupbugs.net/view.php?id=1084#c4028
--
This is being resolved as a duplicate of
http://austingroupbugs.net/view.php?id=1085 because we believe the
resolution of 1085 addresses this problem as well.
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
The following issue has been set as RELATED TO issue 0001100.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-28 08:20 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
Relationships  ID       Summary
--
related to     0001100  Rewrite of Section 2.10 Shell Grammar, ...
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
A NOTE has been added to this issue.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-25 13:04 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
--
(0003456) shware_systems (reporter) - 2016-10-25 13:04
http://austingroupbugs.net/view.php?id=1084#c3456
--
I am explaining how the standard can be interpreted so that what's there does match existing behaviors, not arguing, per your point 1, about something that looks questionable when just looking at the normative text. I'm more right than not here, whether that's easy to believe or not, as someone who has been involved in the more recent changes to that text. I leave it as my opinion because it's easy enough to overlook nuances intended by the original authors of those sections, some of whom (if not all) are still involved with the list, so I do not speak for them.

Note the 'same current char' and 'may terminate preceding tokens' clauses I used. The basic loop isn't:

    while (*curchar++!=EOL) {apply a single rule};

it's more like:

    if (not empty input) do {apply rules, maybe recursively, and at EOL apply the grammar to see whether io_here bodies need to be processed, and then possibly reapply rules due to detected alias expansions, again potentially with recursion} until (*curchar==EOI);

It is the individual rules that say when curchar++ may be executed, possibly as part of a sub-loop specific to that rule or to do look-ahead checking, such as for on-the-fly line joining when *curchar=='\'. The standard also leaves open that *curchar may be referencing an input buffer where line joining has been preprocessed as much as practical.

Rule 10 does not say to use the current char as the first character of a new token and access a new char as the current char, just to use it as the first char. Rule 3 does apply to terminate, per above, the '>' token of '>foobar', but it does not necessarily start a new token. Rule 3 applies for '>#foobar' also, as terminating '>', but Rule 9 is what determines how the rest of that line is classified, with '#' as the current char, as beginning a comment. The delimiting newline that Rule 9 says to look for and move past can also be overridden by Rule 1 if the comment is on the last line of a file that isn't terminated by an EOL. Rule 1 also applies if the last character is a NUL, if the source is a C string passed as a sh -c argument or in a system() interface call, or a Ctrl-Z if that's the interactive EOF control character; these variants are all symbolically EOI.

For 'foo'bar'baz' the first ' gets classified by Rule 10, and then by Rule 4 as the beginning of a single-quoted string. A new current char, the 'f', is then accessed according to that rule. For the last case, Rule 10 starts the token, $$ is a valid special parameter (the shell's numeric pid after evaluation) by Rule 5 and 2.5.2, and '#' by Rule 10 followed by Rule 9 again begins a comment.

No, it's not straightforward, but it is essentially correct as is in describing how various implementations process most scripts.
==
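The loop shape described in note 0003456 can be sketched in Python. This is only an illustration under simplifying assumptions, not the standard's algorithm or any shell's implementation: quoting and '$' handling (Rules 4 and 5) are left out entirely, and the names tokenize, OPERATORS, and OPERATOR_START are invented for the example. What it tries to show is that the Rule 3 branch delimits the operator without advancing, so the same character is classified again on the next pass (by Rule 10 for '>foobar', by Rule 9 for '>#foobar').

```python
# Hypothetical, simplified operator tables (assumptions for this sketch only).
OPERATORS = {"<", ">", "|", "&", ";", "(", ")",
             ">>", "<<", "&&", "||", ";;", "<&", ">&", "<>", ">|", "<<-"}
OPERATOR_START = set("<>|&;()")


def tokenize(line):
    """Split `line` into (kind, text) pairs; each branch decides whether to advance."""
    tokens = []
    kind, cur = None, ""          # kind is None, "op", or "word"
    i = 0
    while True:
        if i == len(line):                          # Rule 1: end of input
            if kind is not None:
                tokens.append((kind, cur))
            return tokens
        ch = line[i]
        if kind == "op" and cur + ch in OPERATORS:  # operator continues
            cur += ch
            i += 1
        elif kind == "op":                          # Rule 3: delimit, do NOT advance
            tokens.append((kind, cur))
            kind, cur = None, ""
        elif ch in OPERATOR_START:                  # start of a new operator
            if kind is not None:
                tokens.append((kind, cur))
            kind, cur = "op", ch
            i += 1
        elif ch == " ":                             # unquoted blank: delimit, discard
            if kind is not None:
                tokens.append((kind, cur))
            kind, cur = None, ""
            i += 1
        elif kind == "word":                        # append to the current word
            cur += ch
            i += 1
        elif ch == "#":                             # Rule 9: skip comment text
            while i < len(line) and line[i] != "\n":
                i += 1
        else:                                       # Rule 10: start a new word
            kind, cur = "word", ch
            i += 1


print(tokenize(">foobar"))    # [('op', '>'), ('word', 'foobar')]
print(tokenize(">#foobar"))   # [('op', '>')]
```

In this sketch the "apply one rule, then move to the next character" loop is replaced by "apply one rule, and let that rule say whether the position moves", which is the reading the note argues for.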
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
A NOTE has been added to this issue.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-15 01:31 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
--
(0003417) Mark_Galeck (reporter) - 2016-10-15 01:31
http://austingroupbugs.net/view.php?id=1084#c3417
--
Before I reply, I want to clarify something. It is not my intention to argue on any of my reports. I will accept any reply and resolution, even if I disagree personally, so long as:

1. The reply answers questions that were left unanswered in the current standard, as explained in the report. This includes pointing to a place in the standard, if I missed it.

2. The reply states the facts that are in the standard correctly and does not contradict the standard.

So, you don't have to spend your valuable time arguing your points with me. I will take them as granted so long as they satisfy 1 and 2 above. All you have to do is satisfy 1 and 2 and that is it.

This reply IMHO does not satisfy condition 2. I will explain why:

> Rule 3, 4, 5 basically apply once a token is started

No, that is impossible; the whole thing would not work if that were the case.

Rule 3 must apply in a case like

    >foobar

When the current character is "f", Rule 3 must apply. If not, and as you say, Rule 10 applied, then the previous token would not be delimited, because Rule 10 does not say to delimit it.

Rule 4 must apply in a case like

    'foo'bar'baz'

If it did not and only Rule 10 applied, then Rule 10 would start the word, and after that we would have a current token, so when the second ' is seen, Rule 4 would apply, and according to that rule the text 'bar' would be quoted, which is false.

Rule 5 must apply in a case like

    $$#

If Rule 10 applied at the first $, then Rule 5 would apply at the second $, identifying '$#' as a parameter expansion. Elsewhere in the standard it is explained that this would yield the string '0', and then we would have the simple command '$0'. But that is not the intent of the standard and that is not how existing shells behave: they yield a simple command such as '7689#'.
==
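The '$$#' argument in note 0003417 comes down to where scanning for a parameter begins. A small Python sketch of the two readings follows. It assumes a process ID of 7689 (the value used in the note) and zero positional parameters; the expand helper and its start argument are hypothetical, introduced only to contrast the readings, and only the two special parameters needed for the example are modeled.

```python
def expand(word, params, start=0):
    """Replace '$x' pairs with params[x], scanning from index `start`.

    Characters before `start` are copied through literally, which models a
    reading in which the first '$' was consumed without Rule 5 looking at it.
    """
    out = list(word[:start])
    i = start
    while i < len(word):
        if word[i] == "$" and i + 1 < len(word) and word[i + 1] in params:
            out.append(params[word[i + 1]])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


# Assumed values: '$$' is the shell's pid, '$#' the number of positional parameters.
params = {"$": "7689", "#": "0"}

intended = expand("$$#", params)           # Rule 5 looks at the first '$'
misread = expand("$$#", params, start=1)   # Rule 5 first fires at the second '$'

print(intended)   # 7689#
print(misread)    # $0
```

The first result matches the behavior the note attributes to existing shells; the second is the outcome the note says would follow if the first '$' only started a word under Rule 10.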
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
A NOTE has been added to this issue.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-14 22:16 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
--
(0003416) shware_systems (reporter) - 2016-10-14 22:16
http://austingroupbugs.net/view.php?id=1084#c3416
--
As I read it, Rules 3, 4, and 5 basically apply once a token is started, which isn't the case for the circumstance cited. So Rule 10, as the first fully applicable rule, establishes that a word token is expected; then Rule 3, 4, or 5 applies with the same current char and the token being started as possibly a word. They may force a previous token to terminate but do not force the start of a token, because a sequence like "" amounts to empty text that will get subsequently discarded anyway as not materially part of the token during quote removal, and an implementation employing limited look-ahead can safely pre-evaluate and skip over sequences like that as an optimization. On some platforms such optimizations aren't practical either, for architectural reasons.

Granted, as stated this isn't immediately intuitive, but it has to cover recursive evaluation of tokens inside tokens too, and potential side effects of alias processing. A precondition, charclasses, postcondition matrix would have a lot more rows than the rules, however, especially if the additional rules incorporated by reference to 2.6 are added for the various substitution types.

Rule 10 is the last resort because the other rules handle whether the char starts an operator token, is text that should be ignored without starting a token, or is an interruption of the recognition context of a word token that already has non-discardable parts; these are higher-priority evaluations that may terminate preceding tokens as side effects. Rule 1 is first, I believe, because it has the additional side effect of completing the top-level productions of the grammar, as highest priority. If it can't be one of those, in other words, then by default it must be the potential start of a word token.
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
A NOTE has been added to this issue.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-13 01:44 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
--
(0003409) Mark_Galeck (reporter) - 2016-10-13 01:44
http://austingroupbugs.net/view.php?id=1084#c3409
--
No, it's not. That is my whole point. I repeat, line 74749 says "applying the _first_ applicable rule". If rule 3, 4, or 5 applies, that is it; there is no "rule of last resort".
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
A NOTE has been added to this issue.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-12 23:29 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed
==
--
(0003408) shware_systems (reporter) - 2016-10-12 23:29
http://austingroupbugs.net/view.php?id=1084#c3408
--
This is handled by Rule 10, as the rule of last resort.
==
[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed
The following issue has been SUBMITTED.
==
http://austingroupbugs.net/view.php?id=1084
==
Reported By: Mark_Galeck
Assigned To:
==
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1084
Category: Shell and Utilities
Type: Error
Severity: Editorial
Priority: normal
Status: New
Name: Mark Galeck
Organization:
User Reference:
Section: 2.3 Token Recognition
Page Number: 2347-2348
Line Number: 74761-74780
Interp Status: ---
Final Accepted Text:
==
Date Submitted: 2016-10-12 08:56 UTC
Last Modified: 2016-10-12 08:56 UTC
==
Summary: rule 3, 4, 5 do not say that a token is started, if needed

Description:
It says in line 74749 "break its input into tokens by applying the first applicable rule", but if rule 3, 4, or 5 is the first applicable and a token is intended to start at the current character, then that token is never started.

Desired Action:
Add to the end of rule 3:

"In addition to this rule, continue to the next applicable rule for the current character, to decide if the current character starts a new token or is discarded."

Add to the ends of rule 4 and 5:

"If there is no current token, then the current character starts a token".
==
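One way to picture the gap described in this report is the rough Python sketch below. It is an interpretation for illustration, not a statement of what the standard requires or how any shell is written. Only single quotes (Rule 4), unquoted blanks, and plain word characters are handled, and the function name tokenize and the flag rule4_starts_token are invented here. With the flag off, the Rule 4 branch only toggles quoting, which models the literal "first applicable rule" reading; with it on, the branch also applies the sentence proposed under Desired Action.

```python
def tokenize(text, rule4_starts_token):
    """Tiny tokenizer: single quotes, blanks, and word characters only."""
    tokens = []
    current = None        # None means "there is no current token"
    quoted = False
    for ch in text:
        if ch == "'":                        # Rule 4, single-quote case only
            quoted = not quoted
            if current is not None:
                current += ch
            elif rule4_starts_token:
                current = ch                 # proposed: "starts a token"
            # Literal reading: no token exists, so the quote is recorded nowhere.
        elif ch == " " and not quoted:       # unquoted blank delimits the token
            if current is not None:
                tokens.append(current)
                current = None
        elif current is not None:            # append to the current word
            current += ch
        else:                                # Rule 10: start a new word
            current = ch
    if current is not None:                  # Rule 1: end of input
        tokens.append(current)
    return tokens


print(tokenize("'' x", rule4_starts_token=False))   # ['x']
print(tokenize("'' x", rule4_starts_token=True))    # ["''", 'x']
```

Under the literal reading the empty quoted string never becomes a token, whereas POSIX shells keep an explicit '' as an empty argument; with the proposed sentence applied, the sketch produces two tokens.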