Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-08-09 Thread Jeff Law
On 08/06/2017 02:07 PM, Martin Sebor wrote:
>>>
>>> You're right that there is no truncation and the effect is
>>> the same but only in the unlikely case when the destination
>>> is empty.  Otherwise the result is truncated.
>> Maybe this is where I'm confused.  How does the destination play into
>> truncation issues?  I've always been under the impression that the
>> destination has to be large enough to hold the result, but that it
>> doesn't affect truncation of the source.
> 
> No, you're right.  It's me who's been confused, either thinking
> of strncpy or -Wstringop-overflow.  The difference between the
> two warnings is just one byte in some cases and I got them mixed
> up.  Sorry about that and thanks for spotting this silly  mistake!
> I've updated the code to issue -Wstringop-overflow here and added
> a better example to the manual.
Thanks.  I kept looking thinking I must have missed something somewhere...


> 
> 1) In the following, the strncpy call would normally trigger
> -Wstringop-truncation because of the possible missing terminating
> NUL, but because the immediately following statement inserts the
> NUL the call is safe.
> 
>strncpy (d, s, sizeof d);   // possible missing NUL
>d[sizeof d - 1] = '\0'; // okay, NUL add here
At first I wondered if this was an optimization opportunity.  BUt
thinking more about it, I don't think it is, unless you happen to know
that sizeof d == sizeof s, which I doubt happens often enough to matter.


> 
> 2) Building Glibc made me realize that in my effort to detect
> the (common) misuses of strncpy I neglected the original (and
> only intended but now rare) use case of filling a buffer
> without necessarily NUL-terminating it (as in struct dirent::
> d_name).  To allow it the patch adds a new attribute that can
> be used to annotate char arrays and pointers that are intended
> not to be NUL-terminated.  This suppresses the truncation
> warning.  When the patch is approved I'll propose the (trivial)
> Glibc changes.  In the future, the attribute will also let GCC
> warn when passing such objects to functions that expect a NUL-
> terminated string argument (e.g., strlen or strcpy).
Seems reasonable.

> 
> 3) Finally, to add inlining context to diagnostics issued by
> the middle end, I've added a new %G directive to complement
> %K by accepting a gcall* argument.
Also seems reasonable.  I think we've wanted something like this for a
while.


> 
> To make the patch easier to review I've broken it up into
> four parts:
> 
>   1. Add %G.
>   2. Add attribute nostring.
>   3. Implement -Wstringop-truncation and enhance -Wstringop-
>  overflow (the meat of the patch).
>   4. Fix up GCC to compile with the new and enhanced warnings.
I'll try to get to them today.

Thanks,
jeff


Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-08-06 Thread Martin Sebor

On 08/02/2017 10:58 AM, Jeff Law wrote:

On 07/31/2017 01:42 PM, Martin Sebor wrote:

So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a
INTEGER_CST, it could be a non-constant expression for the size.  Are
the callers of compute_objsize prepared to handle that?  Just to be
clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an
INTEGER_CST, I'm just worried about the callers ability to handle that
correctly.


They should be prepared for it.  If not, it's a bug.  I've added
a few more test cases though I'm not sure the case you're concerned
about actually arises (VLA sizes are represented as gimple calls to
__builtin_alloca_with_align so the code doesn't get this far).

It may in the end be a non-issue because VLAs are still represented as
alloca calls at this point.  BUt it looks like the result is typically
passed down into check_sizes which assumes the result is an INTEGER_CST
based on a quick scan.


So just to be future safe perhaps this:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST)
  return TYPE_SIZE_UNIT (type);


Sure.  I've added something close to the latest patch.


+@smallexample
+void append (char *d)
+@{
+  strncat (d, ".txt", 4);
+@}
+@end smallexample

Sorry.  I don't follow this particular example.  Where's the truncation
when strlen (SRC) == N for strncat?  In that case aren't we going to
append SRC without the nul terminator, then nul terminate the result?
Isn't that the same as appending SRC with its nul terminator?  Am I
missing something here?


You're right that there is no truncation and the effect is
the same but only in the unlikely case when the destination
is empty.  Otherwise the result is truncated.

Maybe this is where I'm confused.  How does the destination play into
truncation issues?  I've always been under the impression that the
destination has to be large enough to hold the result, but that it
doesn't affect truncation of the source.


No, you're right.  It's me who's been confused, either thinking
of strncpy or -Wstringop-overflow.  The difference between the
two warnings is just one byte in some cases and I got them mixed
up.  Sorry about that and thanks for spotting this silly  mistake!
I've updated the code to issue -Wstringop-overflow here and added
a better example to the manual.


So we've got calls to check the arguments, but not optimize here.  But
the containing function is "strlen_optimize_stmt".

Would it make sense to first call strlen_optimize_stmt to handle the
optimization cases, then to call a new separate function to handle
warnings for the strn* family?


tree-ssa-strlen doesn't handle strncat or strncpy (for the latter
I'm tracking the enhancement in bug 81433).  When the handling is
added I expect check_builtin_{strncat,stxncpy} will be renamed to
handle_builtin_{strncat,stxncpy} to match the existing functions,
and the optimization done there.

Or did you have something else in mind and I missed it?

So do you envision doing both optimization and checking together?  If
so, then I think we'd want to rename strlen_optimize_stmt.


I expect the checking and the optimization to be done either
in the same function, or by calling a new function from the
one that implements the optimization, so I changed the names
as you suggest.

I've also done some more testing with Binutils, GDB, and Glibc
and found a couple of reasonable use cases that the warning
shouldn't trigger on so I added more code to handle it as well.

1) In the following, the strncpy call would normally trigger
-Wstringop-truncation because of the possible missing terminating
NUL, but because the immediately following statement inserts the
NUL the call is safe.

   strncpy (d, s, sizeof d);   // possible missing NUL
   d[sizeof d - 1] = '\0'; // okay, NUL add here

2) Building Glibc made me realize that in my effort to detect
the (common) misuses of strncpy I neglected the original (and
only intended but now rare) use case of filling a buffer
without necessarily NUL-terminating it (as in struct dirent::
d_name).  To allow it the patch adds a new attribute that can
be used to annotate char arrays and pointers that are intended
not to be NUL-terminated.  This suppresses the truncation
warning.  When the patch is approved I'll propose the (trivial)
Glibc changes.  In the future, the attribute will also let GCC
warn when passing such objects to functions that expect a NUL-
terminated string argument (e.g., strlen or strcpy).

3) Finally, to add inlining context to diagnostics issued by
the middle end, I've added a new %G directive to complement
%K by accepting a gcall* argument.

To make the patch easier to review I've broken it up into
four parts:

  1. Add %G.
  2. Add attribute nostring.
  3. Implement -Wstringop-truncation and enhance -Wstringop-
 overflow (the meat of the patch).
  4. Fix up GCC to compile with the new and enhanced warnings.

Martin


Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-08-02 Thread Jeff Law
On 07/31/2017 01:42 PM, Martin Sebor wrote:
>> So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a
>> INTEGER_CST, it could be a non-constant expression for the size.  Are
>> the callers of compute_objsize prepared to handle that?  Just to be
>> clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an
>> INTEGER_CST, I'm just worried about the callers ability to handle that
>> correctly.
> 
> They should be prepared for it.  If not, it's a bug.  I've added
> a few more test cases though I'm not sure the case you're concerned
> about actually arises (VLA sizes are represented as gimple calls to
> __builtin_alloca_with_align so the code doesn't get this far).
It may in the end be a non-issue because VLAs are still represented as
alloca calls at this point.  BUt it looks like the result is typically
passed down into check_sizes which assumes the result is an INTEGER_CST
based on a quick scan.


So just to be future safe perhaps this:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST)
  return TYPE_SIZE_UNIT (type);

>>> +
>>> +@smallexample
>>> +void append (char *d)
>>> +@{
>>> +  strncat (d, ".txt", 4);
>>> +@}
>>> +@end smallexample
>> Sorry.  I don't follow this particular example.  Where's the truncation
>> when strlen (SRC) == N for strncat?  In that case aren't we going to
>> append SRC without the nul terminator, then nul terminate the result?
>> Isn't that the same as appending SRC with its nul terminator?  Am I
>> missing something here?
> 
> You're right that there is no truncation and the effect is
> the same but only in the unlikely case when the destination
> is empty.  Otherwise the result is truncated.
Maybe this is where I'm confused.  How does the destination play into
truncation issues?  I've always been under the impression that the
destination has to be large enough to hold the result, but that it
doesn't affect truncation of the source.


> 
>>
>>> @@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
>>>case BUILT_IN_STPCPY_CHK_CHKP:
>>>  handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
>>>  break;
>>> +
>>> +  case BUILT_IN_STRNCAT:
>>> +  case BUILT_IN_STRNCAT_CHK:
>>> +check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi);
>>> +break;
>>> +
>>> +  case BUILT_IN_STPNCPY:
>>> +  case BUILT_IN_STPNCPY_CHK:
>>> +  case BUILT_IN_STRNCPY:
>>> +  case BUILT_IN_STRNCPY_CHK:
>>> +check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi);
>>> +break;
>>> +
>> So we've got calls to check the arguments, but not optimize here.  But
>> the containing function is "strlen_optimize_stmt".
>>
>> Would it make sense to first call strlen_optimize_stmt to handle the
>> optimization cases, then to call a new separate function to handle
>> warnings for the strn* family?
> 
> tree-ssa-strlen doesn't handle strncat or strncpy (for the latter
> I'm tracking the enhancement in bug 81433).  When the handling is
> added I expect check_builtin_{strncat,stxncpy} will be renamed to
> handle_builtin_{strncat,stxncpy} to match the existing functions,
> and the optimization done there.
> 
> Or did you have something else in mind and I missed it?
So do you envision doing both optimization and checking together?  If
so, then I think we'd want to rename strlen_optimize_stmt.  If
optimization and checking are expected to remain separate we'd want a
new function to handle checking of strncat stpncpy, strncpy and the
caller would look something like this:

  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
{
  check_strstar_arguments (gsi);  // Or whatever we named it
  if (strlen_optimize_stmt ())
gsi_next ();
}

jeff


Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-31 Thread Martin Sebor

So I think the fixes exposed by the new warning are OK to go in as-is
immediately if you wish to do so.  Minor questions on the actual
improved warnings inline.



Sure, thanks.




-static inline tree
+static tree
 compute_objsize (tree dest, int ostype)
 {

...

+  type = TYPE_MAIN_VARIANT (type);
+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+return TYPE_SIZE_UNIT (type);

So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a
INTEGER_CST, it could be a non-constant expression for the size.  Are
the callers of compute_objsize prepared to handle that?  Just to be
clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an
INTEGER_CST, I'm just worried about the callers ability to handle that
correctly.


They should be prepared for it.  If not, it's a bug.  I've added
a few more test cases though I'm not sure the case you're concerned
about actually arises (VLA sizes are represented as gimple calls to
__builtin_alloca_with_align so the code doesn't get this far).




diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d0b9050..e4208d9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5170,6 +5170,57 @@ whether to issue a warning.  Similarly to 
@option{-Wstringop-overflow=3} this
 setting of the option may result in warnings for benign code.
 @end table

+@item -Wstringop-truncation
+@opindex Wstringop-overflow
+@opindex Wno-stringop-overflow


Looks like I have a couple of typos up there.  The option name
should be the same in all three entries.  Let me fix that.


+Warn for calls to bounded string manipulation functions such as @code{strncat},
+@code{strncpy}, and @code{stpncpy} that may either truncate the copied string
+or leave the destination unchanged.
+
+In the following example, the call to @code{strncat} specifies the length
+of the source string as the bound.  If the destination contains a non-empty
+string the copy of the source will be truncated.  Therefore, the call is
+diagnosed.  To avoid the warning use @code{strlen (d) - 4)} as the bound;
+
+@smallexample
+void append (char *d)
+@{
+  strncat (d, ".txt", 4);
+@}
+@end smallexample

Sorry.  I don't follow this particular example.  Where's the truncation
when strlen (SRC) == N for strncat?  In that case aren't we going to
append SRC without the nul terminator, then nul terminate the result?
Isn't that the same as appending SRC with its nul terminator?  Am I
missing something here?


You're right that there is no truncation and the effect is
the same but only in the unlikely case when the destination
is empty.  Otherwise the result is truncated.

(Although well defined, calling strncat with an empty destination
and with the bound as big as the source is a misuse of the API.
The expected way to copy a string is to call one of the copy
functions and the warning relies on this as the basis for the
diagnostic.)



Also your advice for avoiding the warning seems wrong.  Isn't it going
to produce a huge value if strlen (d) < 4?


Right, that advice is wrong.  I had fixed this in my local tree
before posting the patch but must not have updated the patch.
Let me refresh the patch.


 @item -Wsizeof-array-argument
 @opindex Wsizeof-array-argument
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index d94dc9c..5d021f9 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1856,21 +1893,68 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
   return true;
 }

+  if (!nowarn && cmpsrc <= 0)
+   {
+ /* To avoid certain truncation the specified bound should also
+not be equal to (or less than) the length of the source.  */
+ location_t loc = gimple_location (stmt);
+ if (warning_at (loc, OPT_Wstringop_truncation,
+ cmpsrc == 0
+ ? G_("%qD specified bound %E equals source length")
+ : G_("%qD specified bound %E is less than source "
+  "length %u"),
+ gimple_call_fndecl (stmt), len, srclen))
+   gimple_set_no_warning (stmt, true);
+   }

So I think this corresponds to the example above I asked about.  Where's
the truncation when the bound is equal to the source length?


Yes.  Hopefully the above answers the question.


NIT: s/potentiall/potentially/



Sure. Fixed along with the other typo above.




@@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
  case BUILT_IN_STPCPY_CHK_CHKP:
handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
break;
+
+ case BUILT_IN_STRNCAT:
+ case BUILT_IN_STRNCAT_CHK:
+   check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi);
+   break;
+
+ case BUILT_IN_STPNCPY:
+ case BUILT_IN_STPNCPY_CHK:
+ case BUILT_IN_STRNCPY:
+ case BUILT_IN_STRNCPY_CHK:
+   check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi);
+   break;
+

So we've got calls to check the 

Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-31 Thread Jeff Law
On 07/08/2017 02:45 PM, Martin Sebor wrote:
> PR 81117 asks for improved detection of common misuses(*) of
> strncpy and strncat.  The attached patch is my solution.  It
> consists of three related sets of changes:
> 
> 1) Adds a new option, -Wstringop-truncation, that diagnoses calls
> to strncpy, and stpncpy (and also strncat) that truncate the copy.
> This helps highlight the common but incorrect assumption that
> the first two functions NUL-terminate the copy (see, for example,
> CWE-170)  For strncat, it helps detect cases of inadvertent
> truncation of the source string by passing in a bound that's
> less than or equal to its length.
> 
> 2) Enhances -Wstringon-overflow to diagnose calls of the form
> strncpy(D, S, N) where the bound N depends on a call to strlen(S).
> This misuse is common in legacy code when, often in response to
> the adoption of a secure coding initiative, while replacing uses
> of strcpy with strncpy, the engineer either makes a mistake, or
> doesn't have a good enough understanding of how the function works,
> or does only the bare minimum to satisfy the requirement to avoid
> using strcpy without actually improving anything.
> 
> 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of
> the functions to copy an array to a destination of an unknown size
> that specify the size of the array as the bound.  Given the
> pervasive [mis]use of strncpy to bound the copy to the size of
> the destination, instances like this suggest a bug: a possible
> buffer overflow due to an excessive bound (see, for example,
> CWE-806).  In cases when the call is safe, it's equivalent to
> the corresponding call to memcpy which is clearer and can be
> more efficient.
> 
> Martin
> 
> PS By coincidence rather than by intent, the strncat warnings
> are a superset of Clang's -Wstrncat-size.  AFAICS, Clang only
> warns when the destination is an array of known size and
> doesn't have a corresponding warning for strncpy.
> 
> [*] Here's some background into these misuses.
> 
> The purpose of the historical strncpy function introduced in V7
> UNIX was to completely fill an array of chars with data, either
> by copying an initial portion of a source string, or by clearing
> it.  I.e., its purpose wasn't to create NUL-terminated strings.
> An example of its use was to fill the directory entry d_name
> array (dirent::d_name) with the name of a file.
> 
> The original purpose of the strncat function, on the other hand,
> was to append a not necessarily NUL-terminated array of chars
> to a string to form a NUL-terminated concatenation of the two.
> An example use case is appending a directory entry (struct
> dirent::d_name) that need not be NUL-terminated, to form
> a pathname which does.
> 
> Largely due to a misunderstanding of the functions' purpose they
> have become commonly used (and misused) to make "safe," bounded
> string copies by safeguarding against accidentally overflowing
> the destination.  This has led to great many bugs and security
> vulnerabilities.
> 
> 
> gcc-81117.diff
> 
> 
> PR c/81117 - Improve buffer overflow checking in strncpy
> 
> gcc/ChangeLog:
> 
>   PR c/81117
>   * builtins.c (compute_objsize): Handle arrays that
>   compute_builtin_object_size likes to fail for.
>   (check_strncpy_sizes): New function.
>   (expand_builtin_strncpy): Call check_strncpy_sizes.
>   * doc/invoke.texi (-Wsizeof-pointer-memaccess): Document enhancement.
>   (-Wstringop-truncation): Document new option.
>   * gimple-fold.c (gimple_fold_builtin_strncpy): Implement
>   -Wstringop-truncation.
>   (gimple_fold_builtin_strncat): Same.
>   * gimple.c (gimple_build_call_from_tree): Set call location.
>   * tree-ssa-strlen.c (strlen_to_stridx): New global variable.
>   (is_strlen_related_p): New function.
>   (check_builtin_strxncpy, check_builtin_strncat): Same.
>   handle_builtin_strlen): Use strlen_to_stridx.
>   (strlen_optimize_stmt): Handle flavors of strncat, strncpy, and
>   stpncpy.
>   Use strlen_to_stridx.
>   (pass_strlen::execute): Release strlen_to_stridx.
> 
> gcc/ada/ChangeLog:
> 
>   PR c/81117
>   * adadecode.c (__gnat_decode): Replace pointless strncpy with
>   memcpy.
>   * argv.c (__gnat_fill_arg): Same.
> 
> gcc/c-family/ChangeLog:
> 
>   PR c/81117
>   * c-common.c (resort_sorted_fields): Replace pointless strncpy
>   with memcpy.
>   * c-warn.c (sizeof_pointer_memaccess_warning): Handle arrays.
>   * c.opt (-Wstriingop-truncation): New option.
> 
> gcc/fortran/ChangeLog:
> 
>   PR c/81117
>   * decl.c (build_sym): Replace pointless strncpy with strcpy.
> 
> gcc/objc/ChangeLog:
> 
>   PR c/81117
>   * objc-encoding.c (encode_type): Replace pointless strncpy with
>   memcpy.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/81117
>   * c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
>   * c-c++-common/Wstringop-overflow.c: New test.
> 

[PING #2] [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-24 Thread Martin Sebor

Ping #2:

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00411.html



On 07/08/2017 02:45 PM, Martin Sebor wrote:

PR 81117 asks for improved detection of common misuses(*) of
strncpy and strncat.  The attached patch is my solution.  It
consists of three related sets of changes:

1) Adds a new option, -Wstringop-truncation, that diagnoses calls
to strncpy, and stpncpy (and also strncat) that truncate the copy.
This helps highlight the common but incorrect assumption that
the first two functions NUL-terminate the copy (see, for example,
CWE-170)  For strncat, it helps detect cases of inadvertent
truncation of the source string by passing in a bound that's
less than or equal to its length.

2) Enhances -Wstringon-overflow to diagnose calls of the form
strncpy(D, S, N) where the bound N depends on a call to strlen(S).
This misuse is common in legacy code when, often in response to
the adoption of a secure coding initiative, while replacing uses
of strcpy with strncpy, the engineer either makes a mistake, or
doesn't have a good enough understanding of how the function works,
or does only the bare minimum to satisfy the requirement to avoid
using strcpy without actually improving anything.

3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of
the functions to copy an array to a destination of an unknown size
that specify the size of the array as the bound.  Given the
pervasive [mis]use of strncpy to bound the copy to the size of
the destination, instances like this suggest a bug: a possible
buffer overflow due to an excessive bound (see, for example,
CWE-806).  In cases when the call is safe, it's equivalent to
the corresponding call to memcpy which is clearer and can be
more efficient.

Martin

PS By coincidence rather than by intent, the strncat warnings
are a superset of Clang's -Wstrncat-size.  AFAICS, Clang only
warns when the destination is an array of known size and
doesn't have a corresponding warning for strncpy.

[*] Here's some background into these misuses.

The purpose of the historical strncpy function introduced in V7
UNIX was to completely fill an array of chars with data, either
by copying an initial portion of a source string, or by clearing
it.  I.e., its purpose wasn't to create NUL-terminated strings.
An example of its use was to fill the directory entry d_name
array (dirent::d_name) with the name of a file.

The original purpose of the strncat function, on the other hand,
was to append a not necessarily NUL-terminated array of chars
to a string to form a NUL-terminated concatenation of the two.
An example use case is appending a directory entry (struct
dirent::d_name) that need not be NUL-terminated, to form
a pathname which does.

Largely due to a misunderstanding of the functions' purpose they
have become commonly used (and misused) to make "safe," bounded
string copies by safeguarding against accidentally overflowing
the destination.  This has led to great many bugs and security
vulnerabilities.







[PING] [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-17 Thread Martin Sebor

Ping:

  https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00411.html

On 07/08/2017 02:45 PM, Martin Sebor wrote:

PR 81117 asks for improved detection of common misuses(*) of
strncpy and strncat.  The attached patch is my solution.  It
consists of three related sets of changes:

1) Adds a new option, -Wstringop-truncation, that diagnoses calls
to strncpy, and stpncpy (and also strncat) that truncate the copy.
This helps highlight the common but incorrect assumption that
the first two functions NUL-terminate the copy (see, for example,
CWE-170)  For strncat, it helps detect cases of inadvertent
truncation of the source string by passing in a bound that's
less than or equal to its length.

2) Enhances -Wstringon-overflow to diagnose calls of the form
strncpy(D, S, N) where the bound N depends on a call to strlen(S).
This misuse is common in legacy code when, often in response to
the adoption of a secure coding initiative, while replacing uses
of strcpy with strncpy, the engineer either makes a mistake, or
doesn't have a good enough understanding of how the function works,
or does only the bare minimum to satisfy the requirement to avoid
using strcpy without actually improving anything.

3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of
the functions to copy an array to a destination of an unknown size
that specify the size of the array as the bound.  Given the
pervasive [mis]use of strncpy to bound the copy to the size of
the destination, instances like this suggest a bug: a possible
buffer overflow due to an excessive bound (see, for example,
CWE-806).  In cases when the call is safe, it's equivalent to
the corresponding call to memcpy which is clearer and can be
more efficient.

Martin

PS By coincidence rather than by intent, the strncat warnings
are a superset of Clang's -Wstrncat-size.  AFAICS, Clang only
warns when the destination is an array of known size and
doesn't have a corresponding warning for strncpy.

[*] Here's some background into these misuses.

The purpose of the historical strncpy function introduced in V7
UNIX was to completely fill an array of chars with data, either
by copying an initial portion of a source string, or by clearing
it.  I.e., its purpose wasn't to create NUL-terminated strings.
An example of its use was to fill the directory entry d_name
array (dirent::d_name) with the name of a file.

The original purpose of the strncat function, on the other hand,
was to append a not necessarily NUL-terminated array of chars
to a string to form a NUL-terminated concatenation of the two.
An example use case is appending a directory entry (struct
dirent::d_name) that need not be NUL-terminated, to form
a pathname which does.

Largely due to a misunderstanding of the functions' purpose they
have become commonly used (and misused) to make "safe," bounded
string copies by safeguarding against accidentally overflowing
the destination.  This has led to great many bugs and security
vulnerabilities.





[PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-08 Thread Martin Sebor

PR 81117 asks for improved detection of common misuses(*) of
strncpy and strncat.  The attached patch is my solution.  It
consists of three related sets of changes:

1) Adds a new option, -Wstringop-truncation, that diagnoses calls
to strncpy, and stpncpy (and also strncat) that truncate the copy.
This helps highlight the common but incorrect assumption that
the first two functions NUL-terminate the copy (see, for example,
CWE-170)  For strncat, it helps detect cases of inadvertent
truncation of the source string by passing in a bound that's
less than or equal to its length.

2) Enhances -Wstringon-overflow to diagnose calls of the form
strncpy(D, S, N) where the bound N depends on a call to strlen(S).
This misuse is common in legacy code when, often in response to
the adoption of a secure coding initiative, while replacing uses
of strcpy with strncpy, the engineer either makes a mistake, or
doesn't have a good enough understanding of how the function works,
or does only the bare minimum to satisfy the requirement to avoid
using strcpy without actually improving anything.

3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of
the functions to copy an array to a destination of an unknown size
that specify the size of the array as the bound.  Given the
pervasive [mis]use of strncpy to bound the copy to the size of
the destination, instances like this suggest a bug: a possible
buffer overflow due to an excessive bound (see, for example,
CWE-806).  In cases when the call is safe, it's equivalent to
the corresponding call to memcpy which is clearer and can be
more efficient.

Martin

PS By coincidence rather than by intent, the strncat warnings
are a superset of Clang's -Wstrncat-size.  AFAICS, Clang only
warns when the destination is an array of known size and
doesn't have a corresponding warning for strncpy.

[*] Here's some background into these misuses.

The purpose of the historical strncpy function introduced in V7
UNIX was to completely fill an array of chars with data, either
by copying an initial portion of a source string, or by clearing
it.  I.e., its purpose wasn't to create NUL-terminated strings.
An example of its use was to fill the directory entry d_name
array (dirent::d_name) with the name of a file.

The original purpose of the strncat function, on the other hand,
was to append a not necessarily NUL-terminated array of chars
to a string to form a NUL-terminated concatenation of the two.
An example use case is appending a directory entry (struct
dirent::d_name) that need not be NUL-terminated, to form
a pathname which does.

Largely due to a misunderstanding of the functions' purpose they
have become commonly used (and misused) to make "safe," bounded
string copies by safeguarding against accidentally overflowing
the destination.  This has led to great many bugs and security 
vulnerabilities.


PR c/81117 - Improve buffer overflow checking in strncpy

gcc/ChangeLog:

	PR c/81117
	* builtins.c (compute_objsize): Handle arrays that
	compute_builtin_object_size likes to fail for.
	(check_strncpy_sizes): New function.
	(expand_builtin_strncpy): Call check_strncpy_sizes.
	* doc/invoke.texi (-Wsizeof-pointer-memaccess): Document enhancement.
	(-Wstringop-truncation): Document new option.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Implement
	-Wstringop-truncation.
	(gimple_fold_builtin_strncat): Same.
	* gimple.c (gimple_build_call_from_tree): Set call location.
	* tree-ssa-strlen.c (strlen_to_stridx): New global variable.
	(is_strlen_related_p): New function.
	(check_builtin_strxncpy, check_builtin_strncat): Same.
	handle_builtin_strlen): Use strlen_to_stridx.
	(strlen_optimize_stmt): Handle flavors of strncat, strncpy, and
	stpncpy.
	Use strlen_to_stridx.
	(pass_strlen::execute): Release strlen_to_stridx.

gcc/ada/ChangeLog:

	PR c/81117
	* adadecode.c (__gnat_decode): Replace pointless strncpy with
	memcpy.
	* argv.c (__gnat_fill_arg): Same.

gcc/c-family/ChangeLog:

	PR c/81117
	* c-common.c (resort_sorted_fields): Replace pointless strncpy
	with memcpy.
	* c-warn.c (sizeof_pointer_memaccess_warning): Handle arrays.
	* c.opt (-Wstriingop-truncation): New option.

gcc/fortran/ChangeLog:

	PR c/81117
	* decl.c (build_sym): Replace pointless strncpy with strcpy.

gcc/objc/ChangeLog:

	PR c/81117
	* objc-encoding.c (encode_type): Replace pointless strncpy with
	memcpy.

gcc/testsuite/ChangeLog:

	PR c/81117
	* c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
	* c-c++-common/Wstringop-overflow.c: New test.
	* c-c++-common/Wstringop-truncation.c: New test.
	* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust.
	* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
	* gcc.dg/Walloca-1.c: Disable macro tracking.

diff --git a/gcc/ada/adadecode.c b/gcc/ada/adadecode.c
index 8c9c7ab..0cbef81 100644
--- a/gcc/ada/adadecode.c
+++ b/gcc/ada/adadecode.c
@@ -330,7 +330,7 @@ __gnat_decode (const char *coded_name, char *ada_name, int verbose)
 	  }
 
 	/* Write symbol in the space.