https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80437
David Malcolm changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmalcolm at gcc dot gnu.org
--- Comment #1 from David Malcolm ---
(In reply to Martin Sebor from comment #0)
[...snip...]
> bug.c:11:5: warning: 'memset': specified size 0xfffffffffffffffb exceeds
> maximum object size 0x7fffffffffffffff [-Wstringop-overflow=]
>
> I'm not sure that this is a significant improvement. Those already familiar
> with the -Wstringop-overflow warning will likely understand what
> 0x7fffffffffffffff in this context means but only because we know the
> maximum object size limit (i.e., PTRDIFF_MAX) and realize that all printed
> values are in the [PTRDIFF_MAX + 1, SIZE_MAX] range and thus always consist
> of 16 hex digits. Someone who's seen the warning for the first time will
> either have to guess or count the f's. This is even more likely for the
> specified size (such as 0xfffffffffffffffb). In cases where a much lower
> limit is specified by the user (e.g., via -Walloca-larger-than) it's even
> less clear how to interpret a number in any base.
>
> I think it's possible to do better. One approach is to print very large
> values in terms of well-known constants such as SIZE_MAX or PTRDIFF_MAX.
> For instance, instead of printing 18446744073709551611 (i.e., -5) print
> SIZE_MAX - 4. Another solution might be to print sizes as signed (though
> that won't help in the case of the user-specified limit).
How about printing *both* i.e.:
bug.c:11:5: warning: 'memset': specified size 0xfffffffffffffffb (SIZE_MAX - 4)
exceeds maximum object size 0x7fffffffffffffff (PTRDIFF_MAX)
[-Wstringop-overflow=]
(I may have got the expressions wrong, but hopefully the meaning is clear)
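A minimal sketch of the "print both" idea follows. It is not GCC code: the helper name format_size_annotated, the NEAR_LIMIT window of 16, and the use of fixed 64-bit constants to stand in for the target's SIZE_MAX and PTRDIFF_MAX are all assumptions made for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Arbitrary window (an assumption): how close a value must be to a
   well-known limit before the symbolic form is appended.  */
#define NEAR_LIMIT 16

/* Hypothetical helper, not a GCC function: write SIZE into BUF as hex,
   followed by a symbolic annotation when SIZE is near SIZE_MAX or
   PTRDIFF_MAX.  A 64-bit target is assumed, so the limits are modelled
   with UINT64_MAX and INT64_MAX.  */
static void
format_size_annotated (char *buf, size_t buflen, uint64_t size)
{
  const uint64_t size_max = UINT64_MAX;
  const uint64_t ptrdiff_max = (uint64_t) INT64_MAX;

  if (size == size_max)
    snprintf (buf, buflen, "0x%" PRIx64 " (SIZE_MAX)", size);
  else if (size_max - size <= NEAR_LIMIT)
    snprintf (buf, buflen, "0x%" PRIx64 " (SIZE_MAX - %" PRIu64 ")",
              size, size_max - size);
  else if (size == ptrdiff_max)
    snprintf (buf, buflen, "0x%" PRIx64 " (PTRDIFF_MAX)", size);
  else if (size > ptrdiff_max && size - ptrdiff_max <= NEAR_LIMIT)
    snprintf (buf, buflen, "0x%" PRIx64 " (PTRDIFF_MAX + %" PRIu64 ")",
              size, size - ptrdiff_max);
  else if (size < ptrdiff_max && ptrdiff_max - size <= NEAR_LIMIT)
    snprintf (buf, buflen, "0x%" PRIx64 " (PTRDIFF_MAX - %" PRIu64 ")",
              size, ptrdiff_max - size);
  else
    /* Nothing recognisable nearby: plain hex only.  */
    snprintf (buf, buflen, "0x%" PRIx64, size);
}
```

Passing 18446744073709551611 to this helper yields the "0xfffffffffffffffb (SIZE_MAX - 4)" form used in the warning text above.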
> Since the problem of how best to present large decimal numbers is general
> and applies to all diagnostics, including warnings, errors, and notes, a
> change to how these numbers are presented should be brought up for a wider
> discussion before it's implemented consistently, for all diagnostics.
I find large decimal numbers intimidating, and find hexadecimals easier for
values close to large powers of two.
Suggestion: choose base based on a "mental effort cost":
Example 1
*********
For example, if we have an overflow that occurs when x >= 2^31,
which is easier to read:
DECIMAL:
warning: buffer overflow occurs when x >= 2147483648
HEX:
warning: buffer overflow occurs when x >= 0x80000000
FORMULA:
warning: buffer overflow occurs when x >= 2^31
FORMULA and HEX:
warning: buffer overflow occurs when x >= 2^31 (0x80000000)
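The "FORMULA and HEX" form could be produced along these lines. This is only a sketch: format_pow2 is a hypothetical name, and it handles exact powers of two only, which is an assumption about how far the formula style would be taken.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not a GCC function: if VALUE is an exact power of
   two, write it to BUF as "2^N (0x...)" and return 1.  Otherwise leave
   BUF untouched and return 0 so the caller can fall back to another
   representation.  */
static int
format_pow2 (char *buf, size_t buflen, uint64_t value)
{
  /* A power of two has exactly one bit set.  */
  if (value == 0 || (value & (value - 1)) != 0)
    return 0;

  /* Count how many times VALUE can be halved before reaching 1.  */
  unsigned exponent = 0;
  for (uint64_t v = value; v > 1; v >>= 1)
    exponent++;

  snprintf (buf, buflen, "2^%u (0x%" PRIx64 ")", exponent, value);
  return 1;
}
```

For 2147483648 this writes "2^31 (0x80000000)"; for a value such as 100 it returns 0 and the caller picks decimal or hex instead.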
Example 2
*********
an overflow that occurs when x >= 100
DECIMAL:
warning: buffer overflow occurs when x >= 100
HEX:
warning: buffer overflow occurs when x >= 0x64
In the above case, decimal is the easier-to-read format.
Example 3
*********
an overflow that occurs when x >= 0x7fff0000
DECIMAL:
warning: buffer overflow occurs when x >= 2147418112
HEX:
warning: buffer overflow occurs when x >= 0x7fff0000
In this case, hexadecimal is the easier-to-read format.
Example 4
*********
an overflow that occurs when x <= -8000
DECIMAL:
warning: buffer overflow occurs when x <= -8000
HEX:
warning: buffer overflow occurs when x <= -0x1f40
The idea
********
The idea is to choose the printed representation of a value by counting
how many "awkward" digits it has in each base.
One implementation is to assign a cost to a digit based on closeness to zero.
For example, in decimal,
'0' : low cost
'1', '9': medium cost
'2'..'8': high cost
in hexadecimal:
'0' : low cost
'1', 'f': medium cost
'2'..'e': high cost
We can weight these, say cost 10 for "high", cost 1 for "medium", cost 0 for
"low".
"Cheaper" in this sense should mean "easier for a human to understand"; a rough
measure of the amount of mental effort required by a human reader.
Hence:
example 1:
decimal: 2147483648
10 digits, 9 high cost, 1 medium cost: cost = 91
hexadecimal: 0x80000000
8 digits; 1 high cost, 7 low cost: cost = 10
hence hexadecimal is "cheaper", and we use it
example 2:
decimal: 100
3 digits, 1 medium cost, 2 low cost: cost = 1
hexadecimal: 0x64
2 high cost digits: cost = 20
hence decimal is "cheaper", and we use it
example 3:
decimal: 2147418112
10 digits: 4 medium cost, 6 high cost: cost = 64
hexadecimal: 0x7fff0000
8 digits: 1 high cost, 3 medium cost, 4 low cost: cost = 13
hence hexadecimal is "cheaper", and we use it
example 4:
decimal: -8000
3 low cost digits, 1 high cost: cost = 10
hexadecimal: -0x1f40
1 low cost, 2 medium cost, 1 high cost: cost = 12
hence decimal is "cheaper", and we use it
I guessed at these weightings; there may be better ones.
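To make the heuristic concrete, here is a rough sketch in C using the weights given above (0 for low, 1 for medium, 10 for high). The function names are hypothetical and none of this is existing GCC code. Two choices are assumptions of mine: the cost is computed on the magnitude, with the sign printed separately, and ties go to decimal.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Weights as proposed above; these are guesses and could be tuned.  */
enum { COST_LOW = 0, COST_MEDIUM = 1, COST_HIGH = 10 };

/* Sum the per-digit costs of DIGITS.  TOP_DIGIT is the largest digit of
   the base ('9' for decimal, 'f' for hexadecimal).  */
static int
digits_cost (const char *digits, char top_digit)
{
  int cost = 0;
  for (const char *p = digits; *p; p++)
    {
      if (*p == '0')
        cost += COST_LOW;
      else if (*p == '1' || *p == top_digit)
        cost += COST_MEDIUM;
      else
        cost += COST_HIGH;
    }
  return cost;
}

/* Absolute value as unsigned, well defined even for INT64_MIN.  */
static uint64_t
magnitude (int64_t value)
{
  return value < 0 ? 0 - (uint64_t) value : (uint64_t) value;
}

static int
decimal_cost (int64_t value)
{
  char buf[32];
  snprintf (buf, sizeof buf, "%" PRIu64, magnitude (value));
  return digits_cost (buf, '9');
}

static int
hex_cost (int64_t value)
{
  char buf[32];
  snprintf (buf, sizeof buf, "%" PRIx64, magnitude (value));
  return digits_cost (buf, 'f');
}

/* Write VALUE to BUF in whichever base is "cheaper"; decimal wins ties
   (an assumption).  */
static void
format_cheapest (char *buf, size_t buflen, int64_t value)
{
  const char *sign = value < 0 ? "-" : "";
  if (hex_cost (value) < decimal_cost (value))
    snprintf (buf, buflen, "%s0x%" PRIx64, sign, magnitude (value));
  else
    snprintf (buf, buflen, "%s%" PRIu64, sign, magnitude (value));
}
```

With these weights format_cheapest picks hexadecimal for 2147483648 and 2147418112, and decimal for 100 and -8000, matching the choices in the four examples.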