[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2018-05-10 Thread Austin Group Bug Tracker

The following issue has been set as DUPLICATE OF issue 0001085. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
Reported By:        Mark_Galeck
Assigned To:
== 
Project:            1003.1(2016)/Issue7+TC2
Issue ID:           1084
Category:           Shell and Utilities
Type:               Error
Severity:           Editorial
Priority:           normal
Status:             Resolved
Name:               Mark Galeck
Organization:
User Reference:
Section:            2.3 Token Recognition
Page Number:        2347-2348
Line Number:        74761-74780
Interp Status:      ---
Final Accepted Text:
Resolution:         Duplicate
Duplicate:          0
Fixed in Version:
== 
Date Submitted:     2016-10-12 08:56 UTC
Last Modified:      2018-05-10 15:52 UTC
== 
Summary:            rule 3, 4, 5 do not say that a token is started, if needed
==
Relationships   ID       Summary
--
duplicate of    0001085  "token shall be from the current p...
related to      0001100  Rewrite of Section 2.10 Shell Grammar, ...
== 

Issue History 
Date Modified     Username        Field                Change
== 
2016-10-12 08:56  Mark_Galeck     New Issue
2016-10-12 08:56  Mark_Galeck     Name                 => Mark Galeck
2016-10-12 08:56  Mark_Galeck     Section              => 2.3 Token Recognition
2016-10-12 08:56  Mark_Galeck     Page Number          => 2347-2348
2016-10-12 08:56  Mark_Galeck     Line Number          => 74761-74780
2016-10-12 23:29  shware_systems  Note Added: 0003408
2016-10-13 01:44  Mark_Galeck     Note Added: 0003409
2016-10-14 22:16  shware_systems  Note Added: 0003416
2016-10-15 01:31  Mark_Galeck     Note Added: 0003417
2016-10-25 13:04  shware_systems  Note Added: 0003456
2016-10-28 08:20  geoffclare      Relationship added   related to 0001100
2018-05-10 15:52  geoffclare      Interp Status        => ---
2018-05-10 15:52  geoffclare      Note Added: 0004028
2018-05-10 15:52  geoffclare      Status               New => Resolved
2018-05-10 15:52  geoffclare      Resolution           Open => Duplicate
2018-05-10 15:52  geoffclare      Relationship added   duplicate of 0001085
==




[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2018-05-10 Thread Austin Group Bug Tracker

The following issue has been RESOLVED. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0004028) geoffclare (manager) - 2018-05-10 15:52
 http://austingroupbugs.net/view.php?id=1084#c4028 
-- 
This is being resolved as a duplicate of
http://austingroupbugs.net/view.php?id=1085 because we believe the
resolution of 1085 addresses this problem as well. 





[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-28 Thread Austin Group Bug Tracker

The following issue has been set as RELATED TO issue 0001100. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 




[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-25 Thread Austin Group Bug Tracker

A NOTE has been added to this issue. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0003456) shware_systems (reporter) - 2016-10-25 13:04
 http://austingroupbugs.net/view.php?id=1084#c3456 
-- 
I am explaining how the standard can be interpreted so that what's there
does match existing behaviors; I am not arguing, per your point 1, about
something that looks questionable when reading only the normative text. I'm
more right than not here, whether that's easy to believe or not, as someone
who has been involved in the more recent changes to that text. I leave it as
my opinion because it's easy enough to overlook nuances intended by the
original authors of those sections, some of whom (if not all) are still
involved with the list, so I do not speak for them.

Note the 'same current char' and 'may terminate preceding tokens' clauses I
used. The basic loop isn't:

  while (*curchar++ != EOL) { apply a single rule };

it's more like:

  if (not empty input) do {
    apply rules, maybe recursively, and at EOL apply the grammar to see
    whether io_here bodies need to be processed, and then possibly reapply
    rules due to detected alias expansions, again potentially with recursion
  } until (*curchar == EOI);

It is the individual rules that say when curchar++ may be executed,
possibly as part of a sub-loop specific to that rule or to do look-ahead
checking, such as for on-the-fly line joining when *curchar=='\'. The
standard also leaves open that *curchar may be referencing an input buffer
where line joining has already been preprocessed as much as practical.
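
The line-joining remark can be made concrete with a small sketch. Assuming an implementation that preprocesses its input buffer before token recognition, the hypothetical helper `join_lines` below drops backslash-newline pairs outside single quotes and leaves everything else untouched. It is only an illustration: it ignores comments and here-documents, which a real shell would also have to consider.

```python
def join_lines(text):
    """Remove backslash-newline pairs that act as line continuations.

    Simplified sketch: single quotes protect their contents, other
    backslash escapes are copied through unchanged, and comments and
    here-documents are NOT modelled.
    """
    out = []
    in_single = False   # inside '...'
    in_double = False   # inside "..."
    i = 0
    while i < len(text):
        ch = text[i]
        if in_single:
            # Everything up to the closing quote is literal.
            out.append(ch)
            if ch == "'":
                in_single = False
            i += 1
        elif ch == "\\" and i + 1 < len(text):
            if text[i + 1] == "\n":
                i += 2                      # line continuation: drop both
            else:
                out.append(text[i:i + 2])   # keep the escape pair as-is
                i += 2
        elif ch == '"':
            in_double = not in_double
            out.append(ch)
            i += 1
        elif ch == "'" and not in_double:
            in_single = True
            out.append(ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this helper, `join_lines("echo foo\\\nbar")` yields `"echo foobar"`, while the same pair inside single quotes is left alone.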

Rule 10 does not say to use the current char as the first character of a new
token and then access a new char as the current char; it says only to use it
as the first char.
 
Rule 3 does apply to terminate, per above, the '>' token of '>foobar', but
it does not necessarily start a new token. Rule 3 applies for '>#foobar'
also, as terminating '>', but Rule 9 is what determines how the rest of
that line is classified, with '#' as the current char, as beginning a
comment.

The delimiting newline Rule 9 says to look for and move past can also be
overridden by Rule 1 if the comment is on the last line of a file that
isn't terminated by an EOL. It also applies if the last character is a NUL,
if the source is a C string as a sh -c argument or system() interface call,
or a Ctrl-Z if that's the interactive EOF control character, as variants
all symbolically EOI.

For 'foo'bar'baz' the first ' gets classified by Rule 10, and then by Rule
4 as the beginning of a single quoted string. A new current char, the 'f',
is then accessed according to that rule.
 
For the last case, Rule 10 starts the token, $$ is a valid special
parameter (the shell's numeric pid after evaluation) by Rule 5 and 2.5.2,
and '#' by Rule 10 followed by Rule 9 again begins a comment.

No, it's not straightforward, but it is essentially correct as is in
describing how various implementations process most scripts. 


[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-14 Thread Austin Group Bug Tracker

A NOTE has been added to this issue. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0003417) Mark_Galeck (reporter) - 2016-10-15 01:31
 http://austingroupbugs.net/view.php?id=1084#c3417 
-- 
Before I reply, I want to clarify something.  It is not my intention to
argue on any of my reports - I will accept any reply and resolution, even
if I disagree personally, so long as:

1.  The reply answers questions that were left unanswered in the current
standard, as explained in the report.  This includes pointing to a place in
the standard, if I missed it.

2.  The reply states the facts correctly that are in the standard and does
not contradict the standard.  


So, you don't have to spend your valuable time arguing your points with me.
 I will take them as granted so long as they satisfy 1 and 2 above.  All
you have to do is satisfy 1 and 2 and that is it.  


This reply IMHO does not satisfy condition 2.  I will explain why:

>Rule 3, 4, 5 basically apply once a token is started
No, that is impossible; the whole thing would not work if that were the
case.

Rule 3 must apply in a case like

    >foobar

When the current character is "f", Rule 3 must apply.  If it did not and,
as you say, Rule 10 applied, then the previous token would not be
delimited, because Rule 10 does not say to delimit it.

Rule 4 must apply in a case like

    'foo'bar'baz'

If it did not and only Rule 10 applied, then Rule 10 would start the word,
and after that we would have a current token, so when the second ' is seen,
Rule 4 would apply and according to that rule the text 'bar' would be
quoted, which is false.

Rule 5 must apply in a case like

    $$#

If Rule 10 applied at the first $, then Rule 5 would apply at the second
$, identifying '$#' as a parameter expansion; elsewhere in the standard it
is explained that this would yield the string '0', and then we would have
the simple command '$0'.  But that is not the intent of the standard, and
that is not how existing shells behave: they yield a simple command such as
'7689#'.
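
The three cases can be checked against a toy model. What follows is a minimal Python sketch, assuming the reading requested in this report: Rule 3 delimits the operator and then lets a later rule handle the same character, and Rules 4 and 5 start a token when there is no current one. The names `tokenize`, `OPERATOR_CHARS` and `SPECIAL_PARAMS` are hypothetical. Operators are limited to one character, and only single quotes and special parameters are handled, so this is not a conforming lexer.

```python
OPERATOR_CHARS = set("<>|&;()")   # assumption: one-character operators only
SPECIAL_PARAMS = "@*#?-$!0"       # special parameters that may follow "$"

def tokenize(line):
    """Split one line into tokens under the reading described above."""
    tokens = []
    current = None        # None means "there is no current token"
    in_operator = False   # True while `current` is an operator token
    i = 0
    while i < len(line):
        ch = line[i]
        if in_operator:
            # Rule 3: the operator cannot be extended, so it is delimited.
            # The character is not consumed; a later rule still decides
            # whether it starts a token or is discarded.
            tokens.append(current)
            current, in_operator = None, False
        if ch == "'":
            # Rule 4 (single-quote case): quote up to the closing quote,
            # starting a token if there is none.
            end = line.find("'", i + 1)
            if end < 0:
                raise ValueError("unterminated single quote")
            current = (current or "") + line[i:end + 1]
            i = end + 1
        elif ch == "$":
            # Rule 5: take "$" plus a special parameter character, if any,
            # starting a token if there is none.
            nxt = line[i + 1:i + 2]
            width = 2 if nxt and nxt in SPECIAL_PARAMS else 1
            current = (current or "") + line[i:i + width]
            i += width
        elif ch in OPERATOR_CHARS:
            # Operator start: delimit the current token, begin an operator.
            if current is not None:
                tokens.append(current)
            current, in_operator = ch, True
            i += 1
        elif ch in " \t":
            # Blank: delimit the current token and discard the blank.
            if current is not None:
                tokens.append(current)
                current = None
            i += 1
        elif current is not None:
            # Already inside a word: append the character.
            current += ch
            i += 1
        elif ch == "#":
            # Rule 9: comment, the rest of the line is discarded.
            break
        else:
            # Rule 10: the character starts a new word.
            current = ch
            i += 1
    if current is not None:
        tokens.append(current)   # Rule 1: end of input delimits the token
    return tokens
```

Under these assumptions, `tokenize(">foobar")` gives `['>', 'foobar']`, while `'foo'bar'baz'` and `$$#` each come out as a single word token.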





[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-14 Thread Austin Group Bug Tracker

A NOTE has been added to this issue. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0003416) shware_systems (reporter) - 2016-10-14 22:16
 http://austingroupbugs.net/view.php?id=1084#c3416 
-- 
Rule 3, 4, 5 basically apply once a token is started, which isn't the case
for the circumstance cited. As I read it, Rule 10, as the first fully
applicable rule, therefore establishes that a word token is expected, and
then Rule 3, 4, or 5 applies with the same current char and with the token
being started as possibly a word. These rules may force a previous token to
terminate but do not force the start of a token, because a sequence like ""
amounts to empty text that will subsequently be discarded anyway during
quote removal as not materially part of the token, and an implementation
employing limited look-ahead can safely pre-evaluate and skip over
sequences like that as an optimization. On some platforms such
optimizations aren't practical either, for architectural reasons.

Granted, as stated this isn't immediately intuitive, but it has to cover
recursive evaluation of tokens inside tokens too, and potential side
effects of alias processing. A precondition, charclasses, postcondition
matrix would have a lot more rows than the rules, however, especially if
the additional rules incorporated by reference to 2.6 are added for the
various substitution types.

It is last resort because the other rules handle whether: 
  the char starts an operator token, 
  is text that should be ignored without starting a token, or 
  is an interruption of the recognition context of a word token that has
non-discardable parts already, 
as higher priority evaluations that may terminate preceding tokens as side
effects. Rule 1 is first, I believe, because it has the additional side
effect of completing the top level productions of the grammar, as highest
priority. If it can't be one of those, in other words, then by default it
must be the potential start of a word token. 





[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-12 Thread Austin Group Bug Tracker

A NOTE has been added to this issue. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0003409) Mark_Galeck (reporter) - 2016-10-13 01:44
 http://austingroupbugs.net/view.php?id=1084#c3409 
-- 
No, it's not.  That is my whole point.  I repeat, line 74749 says "applying
the _first_ applicable rule".  If rule 3, 4, or 5 applies, that is it; there
is no "rule of last resort".





[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-12 Thread Austin Group Bug Tracker

A NOTE has been added to this issue. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
-- 
 (0003408) shware_systems (reporter) - 2016-10-12 23:29
 http://austingroupbugs.net/view.php?id=1084#c3408 
-- 
This is handled by Rule 10, as the rule of last resort.





[1003.1(2016)/Issue7+TC2 0001084]: rule 3, 4, 5 do not say that a token is started, if needed

2016-10-12 Thread Austin Group Bug Tracker

The following issue has been SUBMITTED. 
== 
http://austingroupbugs.net/view.php?id=1084 
== 
Summary:rule 3, 4, 5 do not say that a token is started, if
needed
Description: 
It says in line 74749 "break its input into tokens by applying the
first applicable rule", but if rule 3, 4, or 5 is the first applicable and
a token is intended to start at the current character, then that token is
never started.
Desired Action: 
Add to the end of rule 3: "In addition to this rule, continue to the next
applicable rule for the current character, to decide if the current
character starts a new token or is discarded."

Add to the ends of rules 4 and 5: "If there is no current token, then the
current character starts a token".
== 
