The following issue has been SUBMITTED.
======================================================================
http://austingroupbugs.net/view.php?id=1198
======================================================================
Reported By: geoffclare
Assigned To:
======================================================================
Project: 1003.1(2016)/Issue7+TC2
Issue ID: 1198
Category: Shell and Utilities
Type: Error
Severity: Objection
Priority: normal
Status: New
Name: Geoff Clare
Organization: The Open Group
User Reference:
Section: awk
Page Number: 2489
Line Number: 80031
Interp Status: ---
Final Accepted Text:
======================================================================
Date Submitted: 2018-08-07 10:40 UTC
Last Modified: 2018-08-07 10:40 UTC
======================================================================
Summary: Comparison of numeric string values in awk
Description:
[The following was reported to The Open Group help desk.]
The "Expressions In Awk" section of the standard says:
---
Comparisons (with the '<', "<=", "!=", "==", '>', and ">=" operators)
shall be made numerically if both operands are numeric, if one is numeric
and the other has a string value that is a numeric string, or if one is
numeric and the other has the uninitialized value. Otherwise, operands
shall be converted to strings as required and a string comparison shall
be made.
---
That means that a comparison involving two numeric strings is a string
comparison. In practice, however, all awks treat it as a numeric
comparison so that, for example, this:
echo '5.0 10.0' | awk '$1 < $2'
evaluates to true rather than false. There are some other confusing or
(at best) misleading statements in the standard about the type and value
of input fields that could be written far more clearly; see
https://groups.google.com/d/msg/comp.lang.awk/qYhgpz08pN8/9wbMr9XKCQAJ and
https://stackoverflow.com/q/51632945/1745001 for recent discussions of
this area on different forums.
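The difference between the two kinds of comparison can be observed
directly. The following is a minimal sketch, assuming a POSIX shell and
an awk on the PATH; the function names cmp_fields and cmp_strings are
illustrative only and do not come from the standard. The first compares
two input fields as awk finds them, and the second forces a string
comparison by concatenating each field with the empty string.

```shell
# Compare two input fields as-is. Fields that look like numbers are
# numeric strings, which awk implementations compare numerically.
cmp_fields() {
    echo "$1 $2" | awk '{ print (($1 < $2) ? "true" : "false") }'
}

# Force a string comparison by concatenating each field with "".
# (Assumption: concatenation yields a plain string value.)
cmp_strings() {
    echo "$1 $2" | awk '{ print ((($1 "") < ($2 "")) ? "true" : "false") }'
}

cmp_fields 5.0 10.0
cmp_strings 5.0 10.0
```

With the numeric comparison that awks perform in practice, the first
call should print true (5.0 is less than 10.0), while the second should
print false, since the string "5.0" sorts after "10.0".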
[The comp.lang.awk discussion also talks about uninitialized fields
behaving differently than uninitialized variables in some awks when
compared to 0, but I think the standard is clear and intends what it
says, so this is simply non-conforming behaviour in those awks. I
think it is certainly undesirable for uninitialized fields to behave
differently than uninitialized variables.]
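One way to check how a particular awk treats the two cases is sketched
below, assuming a POSIX shell and an awk on the PATH; the function names
are illustrative only. Under the reading of the standard given above,
both calls would print 1. No output is claimed here for the field case,
since implementations are reported to differ.

```shell
# An uninitialized variable compared to the number 0.
uninit_var_eq_zero() {
    awk 'BEGIN { print (x == 0) }'
}

# A field beyond NF (never assigned) compared to the number 0.
uninit_field_eq_zero() {
    echo 'a b' | awk '{ print ($3 == 0) }'
}

uninit_var_eq_zero
uninit_field_eq_zero
```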
[The stackoverflow.com discussion is also about uninitialized fields
but includes an answer which points out that the expressions table has
this entry:
Syntax | Name | Type of Result | Associativity
$expr | Field reference | String | N/A
which conflicts with the descriptive text in that it implies field
variables always yield string values when used in an expression.]
Desired Action:
On page 2489 line 80031 section awk change:
    ... shall be made numerically if both operands are numeric, ...
to:
    ... shall be made numerically if both operands are numeric, if both
    have string values that are numeric strings, ...
On page 2485 line 79876 section awk change:
    $expr | Field reference | String | N/A
to:
    $expr | Field reference | Uninitialized or string | N/A
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2018-08-07 10:40 geoffclare New Issue
2018-08-07 10:40 geoffclare Name => Geoff Clare
2018-08-07 10:40 geoffclare Organization => The Open Group
2018-08-07 10:40 geoffclare Section => awk
2018-08-07 10:40 geoffclare Page Number => 2489
2018-08-07 10:40 geoffclare Line Number => 80031
2018-08-07 10:40 geoffclare Interp Status => ---
======================================================================