On 3/17/22 06:35, Jonathan Wakely via Gcc-patches wrote:
On 15/03/22 14:36 -0600, Martin Sebor wrote:
On 3/15/22 10:40, Siddhesh Poyarekar wrote:
On 15/03/2022 21:09, Martin Sebor wrote:
The strncmp function takes arrays as arguments (not necessarily
strings).? The main purpose of the -Wstringop-overread warning
for calls to it is to detect calls where one of the arrays is
not a nul-terminated string and the bound is larger than the size
of the array.? For example:

?? char a[4], b[4];

?? int f (void)
?? {
???? return strncmp (a, b, 8);?? // -Wstringop-overread
?? }

Such a call is suspect: if one of the arrays isn't nul-terminated
the call is undefined.? Otherwise, if both are nul-terminated there

Isn't "suspect" too harsh a description though?? The bound does not
specify the size of a or b, it specifies the maximum extent to which to
compare a and b, the extent being any application-specific limit.? In
fact the limit could be the size of some arbitrary third buffer that the
contents of a or b must be copied to, truncating to the bound.

The intended use of the strncmp bound is to limit the comparison to
at most the size of the arrays or (in a subset of cases) the length
of an initial substring. Providing an arbitrary bound that's not
related to the sizes as you describe sounds very much like a misuse.

As a historical note, strncmp was first introduced in UNIX v7 where
its purpose, alongside strncpy, was to manipulate (potentially)
unterminated character arrays like file names stored in fixed size
arrays (typically 14 bytes).  Strncpy would fill the buffers with
ASCII data up to their size and pad the rest with nuls only if there
was room.

Strncmp was then used to compare these potentially unterminated
character arrays (e.g., archive headers in ld and ranlib).  The bound
was the size of the fixed size array.  Its other use case was to compare
leading portions of strings (e.g, when looking for an environment
variable or when stripping "./" from path names).

Since the early UNIX days, both strncpy and to a lesser extent strncmp
have been widely misused and, along with many other functions in
<string.h>, a frequent source of bugs due to common misunderstanding
of their intended purpose.  The aim of these warnings is to detect
the common (and sometimes less common) misuses and bugs.

I agree the call is undefined if one of the arrays is not nul-terminated
and that's the thing; nothing about the bound is undefined in this
context, it's the NUL termination that is key.

is no point in calling strncmp with a bound greater than their sizes.

There is, when the bound describes something else, e.g. the size of a
third destination buffer into which one of the input buffers may get
copied into.? Or when the bound describes the maximum length of a set of
strings where only a subset of the strings are reachable in the current
function and ranger sees it, allowing us to reduce our input string size
estimate.? The bounds being the maximum of the lengths of two input
strings is just one of many possibilities.

With no evidence that this warning is ever harmful I'd consider

There is, the false positives were seen in Fedora/RHEL builds.

I haven't seen these so I can't very well comment on them.  But I can
assure you that warning for the code above is intentional.  Whether

I don't think anybody is saying it wasn't intentional, the point is
that we can change our minds and do it differently based on feedback
and usage experience.

If users no longer have faith in these warnings and just disable them
or ignore them, then the warnings do not find real bugs and are not
fit for purpose.

or not the arrays are nul-terminated, the expected way to call
the function is with a bound no greater than their size (some coding
guidelines are explicit about this; see for example the CERT C Secure
Coding standard rule ARR38-C).

That's fine. It shouldn't be in -Wall though.

Perhaps a suitable compromise would be to add a separate warning flag specifically for the strn* warnings, so users deliberately using the bound to express a limit other than the length of the argument string (and confident that their strings are always NUL-terminated) can turn them off without turning off all the overread warnings.

Jason

Reply via email to