A NOTE has been added to this issue. 
Reported By:                Mark_Galeck
Assigned To:                
Project:                    1003.1(2016)/Issue7+TC2
Issue ID:                   1084
Category:                   Shell and Utilities
Type:                       Error
Severity:                   Editorial
Priority:                   normal
Status:                     New
Name:                       Mark Galeck 
User Reference:              
Section:                    2.3 Token Recognition 
Page Number:                2347-2348 
Line Number:                74761-74780 
Interp Status:              --- 
Final Accepted Text:         
Date Submitted:             2016-10-12 08:56 UTC
Last Modified:              2016-10-14 22:16 UTC
Summary:                    rule 3, 4, 5 do not say that a token is started, if

 (0003416) shware_systems (reporter) - 2016-10-14 22:16
Rules 3, 4, and 5 basically apply once a token has been started, which is
not the case in the circumstance cited. As I read it, Rule 10, as the first
fully applicable rule, establishes that a word token is expected; Rules 3,
4, and 5 then apply with the same current char and the token started as a
possible word. They may force a previous token to terminate, but they do
not force the start of a token, because a sequence like "" amounts to empty
text that is subsequently discarded anyway during quote removal as not
materially part of the token. An implementation employing limited
look-ahead can safely pre-evaluate and skip over sequences like that as an
optimization, although on some platforms such optimizations aren't
practical, for architectural reasons.

Granted, as stated this isn't immediately intuitive, but the wording also
has to cover recursive evaluation of tokens inside tokens and the potential
side effects of alias processing. A matrix of preconditions, character
classes, and postconditions would have many more rows than there are rules,
however, especially if the additional rules incorporated by reference to
2.6 were added for the various substitution types.

Rule 10 is the last resort because the other rules, as higher priority
evaluations that may terminate preceding tokens as side effects, handle
whether the char:
  starts an operator token,
  is text that should be ignored without starting a token, or
  is an interruption of the recognition context of a word token that has
    non-discardable parts already.
Rule 1 is first, I believe, because it has the additional side effect of
completing the top level productions of the grammar, as highest priority.
In other words, if the char can't be one of those, then by default it must
be the potential start of a word token.
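
As a rough illustration of the fall-through described above, here is a
minimal Python sketch of a toy tokenizer. It is not the algorithm from the
standard and not any shell's implementation: the names tokenize,
OPERATOR_CHARS and delimit are hypothetical, every operator is assumed to
be a single character, and backslash, '$', '`', comments and aliases are
left out. The only point is that a quote char arriving when no token is in
progress starts a word as the fallback, and the quoting then applies inside
that same word.

```python
# Toy sketch only: single-character operators, no backslash, no expansions.
OPERATOR_CHARS = set("|&;<>()")  # assumption: each is a complete operator


def tokenize(line):
    """Split line into (kind, raw_text) pairs; quotes are kept verbatim."""
    tokens = []
    kind = None   # None means no token has been started yet
    text = ""
    quote = None  # the quote char currently open, if any

    def delimit():
        # Terminate the current token, if one exists.
        nonlocal kind, text
        if kind is not None:
            tokens.append((kind, text))
        kind, text = None, ""

    for ch in line:
        if quote is not None:
            # Inside quoted text: everything belongs to the current word.
            text += ch
            if ch == quote:
                quote = None
        elif ch in "'\"":
            # Quoting may end a preceding operator, and by default the
            # same char becomes the start of a word if none is in progress.
            if kind == "op":
                delimit()
            if kind is None:
                kind = "word"
            text += ch
            quote = ch
        elif ch in OPERATOR_CHARS:
            delimit()
            kind, text = "op", ch
        elif ch in " \t":
            # Unquoted blank: delimit and discard the char.
            delimit()
        elif kind == "word":
            text += ch
        else:
            # Last resort: the char starts a new word.
            delimit()
            kind, text = "word", ch
    delimit()  # end of input delimits whatever is pending
    return tokens
```

Under these assumptions, tokenize('echo ""') returns
[('word', 'echo'), ('word', '""')] and tokenize('a""b|c') returns
[('word', 'a""b'), ('op', '|'), ('word', 'c')]. The raw quote chars stay in
the token text, since the sketch does not model quote removal.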

Issue History 
Date Modified    Username       Field                    Change               
2016-10-12 08:56 Mark_Galeck    New Issue                                    
2016-10-12 08:56 Mark_Galeck    Name                      => Mark Galeck     
2016-10-12 08:56 Mark_Galeck    Section                   => 2.3 Token
2016-10-12 08:56 Mark_Galeck    Page Number               => 2347-2348       
2016-10-12 08:56 Mark_Galeck    Line Number               => 74761-74780     
2016-10-12 23:29 shware_systems Note Added: 0003408                          
2016-10-13 01:44 Mark_Galeck    Note Added: 0003409                          
2016-10-14 22:16 shware_systems Note Added: 0003416                          
