Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
On 08/06/2017 02:07 PM, Martin Sebor wrote: >>> >>> You're right that there is no truncation and the effect is >>> the same but only in the unlikely case when the destination >>> is empty. Otherwise the result is truncated. >> Maybe this is where I'm confused. How does the destination play into >> truncation issues? I've always been under the impression that the >> destination has to be large enough to hold the result, but that it >> doesn't affect truncation of the source. > > No, you're right. It's me who's been confused, either thinking > of strncpy or -Wstringop-overflow. The difference between the > two warnings is just one byte in some cases and I got them mixed > up. Sorry about that and thanks for spotting this silly mistake! > I've updated the code to issue -Wstringop-overflow here and added > a better example to the manual. Thanks. I kept looking thinking I must have missed something somewhere... > > 1) In the following, the strncpy call would normally trigger > -Wstringop-truncation because of the possible missing terminating > NUL, but because the immediately following statement inserts the > NUL the call is safe. > >strncpy (d, s, sizeof d); // possible missing NUL >d[sizeof d - 1] = '\0'; // okay, NUL add here At first I wondered if this was an optimization opportunity. BUt thinking more about it, I don't think it is, unless you happen to know that sizeof d == sizeof s, which I doubt happens often enough to matter. > > 2) Building Glibc made me realize that in my effort to detect > the (common) misuses of strncpy I neglected the original (and > only intended but now rare) use case of filling a buffer > without necessarily NUL-terminating it (as in struct dirent:: > d_name). To allow it the patch adds a new attribute that can > be used to annotate char arrays and pointers that are intended > not to be NUL-terminated. This suppresses the truncation > warning. When the patch is approved I'll propose the (trivial) > Glibc changes. In the future, the attribute will also let GCC > warn when passing such objects to functions that expect a NUL- > terminated string argument (e.g., strlen or strcpy). Seems reasonable. > > 3) Finally, to add inlining context to diagnostics issued by > the middle end, I've added a new %G directive to complement > %K by accepting a gcall* argument. Also seems reasonable. I think we've wanted something like this for a while. > > To make the patch easier to review I've broken it up into > four parts: > > 1. Add %G. > 2. Add attribute nostring. > 3. Implement -Wstringop-truncation and enhance -Wstringop- > overflow (the meat of the patch). > 4. Fix up GCC to compile with the new and enhanced warnings. I'll try to get to them today. Thanks, jeff
Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
On 08/02/2017 10:58 AM, Jeff Law wrote: On 07/31/2017 01:42 PM, Martin Sebor wrote: So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a INTEGER_CST, it could be a non-constant expression for the size. Are the callers of compute_objsize prepared to handle that? Just to be clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an INTEGER_CST, I'm just worried about the callers ability to handle that correctly. They should be prepared for it. If not, it's a bug. I've added a few more test cases though I'm not sure the case you're concerned about actually arises (VLA sizes are represented as gimple calls to __builtin_alloca_with_align so the code doesn't get this far). It may in the end be a non-issue because VLAs are still represented as alloca calls at this point. BUt it looks like the result is typically passed down into check_sizes which assumes the result is an INTEGER_CST based on a quick scan. So just to be future safe perhaps this: if (TREE_CODE (type) == ARRAY_TYPE && TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST) return TYPE_SIZE_UNIT (type); Sure. I've added something close to the latest patch. +@smallexample +void append (char *d) +@{ + strncat (d, ".txt", 4); +@} +@end smallexample Sorry. I don't follow this particular example. Where's the truncation when strlen (SRC) == N for strncat? In that case aren't we going to append SRC without the nul terminator, then nul terminate the result? Isn't that the same as appending SRC with its nul terminator? Am I missing something here? You're right that there is no truncation and the effect is the same but only in the unlikely case when the destination is empty. Otherwise the result is truncated. Maybe this is where I'm confused. How does the destination play into truncation issues? I've always been under the impression that the destination has to be large enough to hold the result, but that it doesn't affect truncation of the source. No, you're right. It's me who's been confused, either thinking of strncpy or -Wstringop-overflow. The difference between the two warnings is just one byte in some cases and I got them mixed up. Sorry about that and thanks for spotting this silly mistake! I've updated the code to issue -Wstringop-overflow here and added a better example to the manual. So we've got calls to check the arguments, but not optimize here. But the containing function is "strlen_optimize_stmt". Would it make sense to first call strlen_optimize_stmt to handle the optimization cases, then to call a new separate function to handle warnings for the strn* family? tree-ssa-strlen doesn't handle strncat or strncpy (for the latter I'm tracking the enhancement in bug 81433). When the handling is added I expect check_builtin_{strncat,stxncpy} will be renamed to handle_builtin_{strncat,stxncpy} to match the existing functions, and the optimization done there. Or did you have something else in mind and I missed it? So do you envision doing both optimization and checking together? If so, then I think we'd want to rename strlen_optimize_stmt. I expect the checking and the optimization to be done either in the same function, or by calling a new function from the one that implements the optimization, so I changed the names as you suggest. I've also done some more testing with Binutils, GDB, and Glibc and found a couple of reasonable use cases that the warning shouldn't trigger on so I added more code to handle it as well. 1) In the following, the strncpy call would normally trigger -Wstringop-truncation because of the possible missing terminating NUL, but because the immediately following statement inserts the NUL the call is safe. strncpy (d, s, sizeof d); // possible missing NUL d[sizeof d - 1] = '\0'; // okay, NUL add here 2) Building Glibc made me realize that in my effort to detect the (common) misuses of strncpy I neglected the original (and only intended but now rare) use case of filling a buffer without necessarily NUL-terminating it (as in struct dirent:: d_name). To allow it the patch adds a new attribute that can be used to annotate char arrays and pointers that are intended not to be NUL-terminated. This suppresses the truncation warning. When the patch is approved I'll propose the (trivial) Glibc changes. In the future, the attribute will also let GCC warn when passing such objects to functions that expect a NUL- terminated string argument (e.g., strlen or strcpy). 3) Finally, to add inlining context to diagnostics issued by the middle end, I've added a new %G directive to complement %K by accepting a gcall* argument. To make the patch easier to review I've broken it up into four parts: 1. Add %G. 2. Add attribute nostring. 3. Implement -Wstringop-truncation and enhance -Wstringop- overflow (the meat of the patch). 4. Fix up GCC to compile with the new and enhanced warnings. Martin
Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
On 07/31/2017 01:42 PM, Martin Sebor wrote: >> So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a >> INTEGER_CST, it could be a non-constant expression for the size. Are >> the callers of compute_objsize prepared to handle that? Just to be >> clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an >> INTEGER_CST, I'm just worried about the callers ability to handle that >> correctly. > > They should be prepared for it. If not, it's a bug. I've added > a few more test cases though I'm not sure the case you're concerned > about actually arises (VLA sizes are represented as gimple calls to > __builtin_alloca_with_align so the code doesn't get this far). It may in the end be a non-issue because VLAs are still represented as alloca calls at this point. BUt it looks like the result is typically passed down into check_sizes which assumes the result is an INTEGER_CST based on a quick scan. So just to be future safe perhaps this: if (TREE_CODE (type) == ARRAY_TYPE && TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST) return TYPE_SIZE_UNIT (type); >>> + >>> +@smallexample >>> +void append (char *d) >>> +@{ >>> + strncat (d, ".txt", 4); >>> +@} >>> +@end smallexample >> Sorry. I don't follow this particular example. Where's the truncation >> when strlen (SRC) == N for strncat? In that case aren't we going to >> append SRC without the nul terminator, then nul terminate the result? >> Isn't that the same as appending SRC with its nul terminator? Am I >> missing something here? > > You're right that there is no truncation and the effect is > the same but only in the unlikely case when the destination > is empty. Otherwise the result is truncated. Maybe this is where I'm confused. How does the destination play into truncation issues? I've always been under the impression that the destination has to be large enough to hold the result, but that it doesn't affect truncation of the source. > >> >>> @@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi) >>>case BUILT_IN_STPCPY_CHK_CHKP: >>> handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi); >>> break; >>> + >>> + case BUILT_IN_STRNCAT: >>> + case BUILT_IN_STRNCAT_CHK: >>> +check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi); >>> +break; >>> + >>> + case BUILT_IN_STPNCPY: >>> + case BUILT_IN_STPNCPY_CHK: >>> + case BUILT_IN_STRNCPY: >>> + case BUILT_IN_STRNCPY_CHK: >>> +check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi); >>> +break; >>> + >> So we've got calls to check the arguments, but not optimize here. But >> the containing function is "strlen_optimize_stmt". >> >> Would it make sense to first call strlen_optimize_stmt to handle the >> optimization cases, then to call a new separate function to handle >> warnings for the strn* family? > > tree-ssa-strlen doesn't handle strncat or strncpy (for the latter > I'm tracking the enhancement in bug 81433). When the handling is > added I expect check_builtin_{strncat,stxncpy} will be renamed to > handle_builtin_{strncat,stxncpy} to match the existing functions, > and the optimization done there. > > Or did you have something else in mind and I missed it? So do you envision doing both optimization and checking together? If so, then I think we'd want to rename strlen_optimize_stmt. If optimization and checking are expected to remain separate we'd want a new function to handle checking of strncat stpncpy, strncpy and the caller would look something like this: for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) { check_strstar_arguments (gsi); // Or whatever we named it if (strlen_optimize_stmt ()) gsi_next (); } jeff
Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
So I think the fixes exposed by the new warning are OK to go in as-is immediately if you wish to do so. Minor questions on the actual improved warnings inline. Sure, thanks. -static inline tree +static tree compute_objsize (tree dest, int ostype) { ... + type = TYPE_MAIN_VARIANT (type); + + if (TREE_CODE (type) == ARRAY_TYPE) +return TYPE_SIZE_UNIT (type); So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a INTEGER_CST, it could be a non-constant expression for the size. Are the callers of compute_objsize prepared to handle that? Just to be clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an INTEGER_CST, I'm just worried about the callers ability to handle that correctly. They should be prepared for it. If not, it's a bug. I've added a few more test cases though I'm not sure the case you're concerned about actually arises (VLA sizes are represented as gimple calls to __builtin_alloca_with_align so the code doesn't get this far). diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index d0b9050..e4208d9 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -5170,6 +5170,57 @@ whether to issue a warning. Similarly to @option{-Wstringop-overflow=3} this setting of the option may result in warnings for benign code. @end table +@item -Wstringop-truncation +@opindex Wstringop-overflow +@opindex Wno-stringop-overflow Looks like I have a couple of typos up there. The option name should be the same in all three entries. Let me fix that. +Warn for calls to bounded string manipulation functions such as @code{strncat}, +@code{strncpy}, and @code{stpncpy} that may either truncate the copied string +or leave the destination unchanged. + +In the following example, the call to @code{strncat} specifies the length +of the source string as the bound. If the destination contains a non-empty +string the copy of the source will be truncated. Therefore, the call is +diagnosed. To avoid the warning use @code{strlen (d) - 4)} as the bound; + +@smallexample +void append (char *d) +@{ + strncat (d, ".txt", 4); +@} +@end smallexample Sorry. I don't follow this particular example. Where's the truncation when strlen (SRC) == N for strncat? In that case aren't we going to append SRC without the nul terminator, then nul terminate the result? Isn't that the same as appending SRC with its nul terminator? Am I missing something here? You're right that there is no truncation and the effect is the same but only in the unlikely case when the destination is empty. Otherwise the result is truncated. (Although well defined, calling strncat with an empty destination and with the bound as big as the source is a misuse of the API. The expected way to copy a string is to call one of the copy functions and the warning relies on this as the basis for the diagnostic.) Also your advice for avoiding the warning seems wrong. Isn't it going to produce a huge value if strlen (d) < 4? Right, that advice is wrong. I had fixed this in my local tree before posting the patch but must not have updated the patch. Let me refresh the patch. @item -Wsizeof-array-argument @opindex Wsizeof-array-argument diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index d94dc9c..5d021f9 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -1856,21 +1893,68 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi) return true; } + if (!nowarn && cmpsrc <= 0) + { + /* To avoid certain truncation the specified bound should also +not be equal to (or less than) the length of the source. */ + location_t loc = gimple_location (stmt); + if (warning_at (loc, OPT_Wstringop_truncation, + cmpsrc == 0 + ? G_("%qD specified bound %E equals source length") + : G_("%qD specified bound %E is less than source " + "length %u"), + gimple_call_fndecl (stmt), len, srclen)) + gimple_set_no_warning (stmt, true); + } So I think this corresponds to the example above I asked about. Where's the truncation when the bound is equal to the source length? Yes. Hopefully the above answers the question. NIT: s/potentiall/potentially/ Sure. Fixed along with the other typo above. @@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi) case BUILT_IN_STPCPY_CHK_CHKP: handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi); break; + + case BUILT_IN_STRNCAT: + case BUILT_IN_STRNCAT_CHK: + check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi); + break; + + case BUILT_IN_STPNCPY: + case BUILT_IN_STPNCPY_CHK: + case BUILT_IN_STRNCPY: + case BUILT_IN_STRNCPY_CHK: + check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi); + break; + So we've got calls to check the
Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
On 07/08/2017 02:45 PM, Martin Sebor wrote: > PR 81117 asks for improved detection of common misuses(*) of > strncpy and strncat. The attached patch is my solution. It > consists of three related sets of changes: > > 1) Adds a new option, -Wstringop-truncation, that diagnoses calls > to strncpy, and stpncpy (and also strncat) that truncate the copy. > This helps highlight the common but incorrect assumption that > the first two functions NUL-terminate the copy (see, for example, > CWE-170) For strncat, it helps detect cases of inadvertent > truncation of the source string by passing in a bound that's > less than or equal to its length. > > 2) Enhances -Wstringon-overflow to diagnose calls of the form > strncpy(D, S, N) where the bound N depends on a call to strlen(S). > This misuse is common in legacy code when, often in response to > the adoption of a secure coding initiative, while replacing uses > of strcpy with strncpy, the engineer either makes a mistake, or > doesn't have a good enough understanding of how the function works, > or does only the bare minimum to satisfy the requirement to avoid > using strcpy without actually improving anything. > > 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of > the functions to copy an array to a destination of an unknown size > that specify the size of the array as the bound. Given the > pervasive [mis]use of strncpy to bound the copy to the size of > the destination, instances like this suggest a bug: a possible > buffer overflow due to an excessive bound (see, for example, > CWE-806). In cases when the call is safe, it's equivalent to > the corresponding call to memcpy which is clearer and can be > more efficient. > > Martin > > PS By coincidence rather than by intent, the strncat warnings > are a superset of Clang's -Wstrncat-size. AFAICS, Clang only > warns when the destination is an array of known size and > doesn't have a corresponding warning for strncpy. > > [*] Here's some background into these misuses. > > The purpose of the historical strncpy function introduced in V7 > UNIX was to completely fill an array of chars with data, either > by copying an initial portion of a source string, or by clearing > it. I.e., its purpose wasn't to create NUL-terminated strings. > An example of its use was to fill the directory entry d_name > array (dirent::d_name) with the name of a file. > > The original purpose of the strncat function, on the other hand, > was to append a not necessarily NUL-terminated array of chars > to a string to form a NUL-terminated concatenation of the two. > An example use case is appending a directory entry (struct > dirent::d_name) that need not be NUL-terminated, to form > a pathname which does. > > Largely due to a misunderstanding of the functions' purpose they > have become commonly used (and misused) to make "safe," bounded > string copies by safeguarding against accidentally overflowing > the destination. This has led to great many bugs and security > vulnerabilities. > > > gcc-81117.diff > > > PR c/81117 - Improve buffer overflow checking in strncpy > > gcc/ChangeLog: > > PR c/81117 > * builtins.c (compute_objsize): Handle arrays that > compute_builtin_object_size likes to fail for. > (check_strncpy_sizes): New function. > (expand_builtin_strncpy): Call check_strncpy_sizes. > * doc/invoke.texi (-Wsizeof-pointer-memaccess): Document enhancement. > (-Wstringop-truncation): Document new option. > * gimple-fold.c (gimple_fold_builtin_strncpy): Implement > -Wstringop-truncation. > (gimple_fold_builtin_strncat): Same. > * gimple.c (gimple_build_call_from_tree): Set call location. > * tree-ssa-strlen.c (strlen_to_stridx): New global variable. > (is_strlen_related_p): New function. > (check_builtin_strxncpy, check_builtin_strncat): Same. > handle_builtin_strlen): Use strlen_to_stridx. > (strlen_optimize_stmt): Handle flavors of strncat, strncpy, and > stpncpy. > Use strlen_to_stridx. > (pass_strlen::execute): Release strlen_to_stridx. > > gcc/ada/ChangeLog: > > PR c/81117 > * adadecode.c (__gnat_decode): Replace pointless strncpy with > memcpy. > * argv.c (__gnat_fill_arg): Same. > > gcc/c-family/ChangeLog: > > PR c/81117 > * c-common.c (resort_sorted_fields): Replace pointless strncpy > with memcpy. > * c-warn.c (sizeof_pointer_memaccess_warning): Handle arrays. > * c.opt (-Wstriingop-truncation): New option. > > gcc/fortran/ChangeLog: > > PR c/81117 > * decl.c (build_sym): Replace pointless strncpy with strcpy. > > gcc/objc/ChangeLog: > > PR c/81117 > * objc-encoding.c (encode_type): Replace pointless strncpy with > memcpy. > > gcc/testsuite/ChangeLog: > > PR c/81117 > * c-c++-common/Wsizeof-pointer-memaccess3.c: New test. > * c-c++-common/Wstringop-overflow.c: New test. >
[PING #2] [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
Ping #2: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00411.html On 07/08/2017 02:45 PM, Martin Sebor wrote: PR 81117 asks for improved detection of common misuses(*) of strncpy and strncat. The attached patch is my solution. It consists of three related sets of changes: 1) Adds a new option, -Wstringop-truncation, that diagnoses calls to strncpy, and stpncpy (and also strncat) that truncate the copy. This helps highlight the common but incorrect assumption that the first two functions NUL-terminate the copy (see, for example, CWE-170) For strncat, it helps detect cases of inadvertent truncation of the source string by passing in a bound that's less than or equal to its length. 2) Enhances -Wstringon-overflow to diagnose calls of the form strncpy(D, S, N) where the bound N depends on a call to strlen(S). This misuse is common in legacy code when, often in response to the adoption of a secure coding initiative, while replacing uses of strcpy with strncpy, the engineer either makes a mistake, or doesn't have a good enough understanding of how the function works, or does only the bare minimum to satisfy the requirement to avoid using strcpy without actually improving anything. 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of the functions to copy an array to a destination of an unknown size that specify the size of the array as the bound. Given the pervasive [mis]use of strncpy to bound the copy to the size of the destination, instances like this suggest a bug: a possible buffer overflow due to an excessive bound (see, for example, CWE-806). In cases when the call is safe, it's equivalent to the corresponding call to memcpy which is clearer and can be more efficient. Martin PS By coincidence rather than by intent, the strncat warnings are a superset of Clang's -Wstrncat-size. AFAICS, Clang only warns when the destination is an array of known size and doesn't have a corresponding warning for strncpy. [*] Here's some background into these misuses. The purpose of the historical strncpy function introduced in V7 UNIX was to completely fill an array of chars with data, either by copying an initial portion of a source string, or by clearing it. I.e., its purpose wasn't to create NUL-terminated strings. An example of its use was to fill the directory entry d_name array (dirent::d_name) with the name of a file. The original purpose of the strncat function, on the other hand, was to append a not necessarily NUL-terminated array of chars to a string to form a NUL-terminated concatenation of the two. An example use case is appending a directory entry (struct dirent::d_name) that need not be NUL-terminated, to form a pathname which does. Largely due to a misunderstanding of the functions' purpose they have become commonly used (and misused) to make "safe," bounded string copies by safeguarding against accidentally overflowing the destination. This has led to great many bugs and security vulnerabilities.
[PING] [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
Ping: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00411.html On 07/08/2017 02:45 PM, Martin Sebor wrote: PR 81117 asks for improved detection of common misuses(*) of strncpy and strncat. The attached patch is my solution. It consists of three related sets of changes: 1) Adds a new option, -Wstringop-truncation, that diagnoses calls to strncpy, and stpncpy (and also strncat) that truncate the copy. This helps highlight the common but incorrect assumption that the first two functions NUL-terminate the copy (see, for example, CWE-170) For strncat, it helps detect cases of inadvertent truncation of the source string by passing in a bound that's less than or equal to its length. 2) Enhances -Wstringon-overflow to diagnose calls of the form strncpy(D, S, N) where the bound N depends on a call to strlen(S). This misuse is common in legacy code when, often in response to the adoption of a secure coding initiative, while replacing uses of strcpy with strncpy, the engineer either makes a mistake, or doesn't have a good enough understanding of how the function works, or does only the bare minimum to satisfy the requirement to avoid using strcpy without actually improving anything. 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of the functions to copy an array to a destination of an unknown size that specify the size of the array as the bound. Given the pervasive [mis]use of strncpy to bound the copy to the size of the destination, instances like this suggest a bug: a possible buffer overflow due to an excessive bound (see, for example, CWE-806). In cases when the call is safe, it's equivalent to the corresponding call to memcpy which is clearer and can be more efficient. Martin PS By coincidence rather than by intent, the strncat warnings are a superset of Clang's -Wstrncat-size. AFAICS, Clang only warns when the destination is an array of known size and doesn't have a corresponding warning for strncpy. [*] Here's some background into these misuses. The purpose of the historical strncpy function introduced in V7 UNIX was to completely fill an array of chars with data, either by copying an initial portion of a source string, or by clearing it. I.e., its purpose wasn't to create NUL-terminated strings. An example of its use was to fill the directory entry d_name array (dirent::d_name) with the name of a file. The original purpose of the strncat function, on the other hand, was to append a not necessarily NUL-terminated array of chars to a string to form a NUL-terminated concatenation of the two. An example use case is appending a directory entry (struct dirent::d_name) that need not be NUL-terminated, to form a pathname which does. Largely due to a misunderstanding of the functions' purpose they have become commonly used (and misused) to make "safe," bounded string copies by safeguarding against accidentally overflowing the destination. This has led to great many bugs and security vulnerabilities.
[PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)
PR 81117 asks for improved detection of common misuses(*) of strncpy and strncat. The attached patch is my solution. It consists of three related sets of changes: 1) Adds a new option, -Wstringop-truncation, that diagnoses calls to strncpy, and stpncpy (and also strncat) that truncate the copy. This helps highlight the common but incorrect assumption that the first two functions NUL-terminate the copy (see, for example, CWE-170) For strncat, it helps detect cases of inadvertent truncation of the source string by passing in a bound that's less than or equal to its length. 2) Enhances -Wstringon-overflow to diagnose calls of the form strncpy(D, S, N) where the bound N depends on a call to strlen(S). This misuse is common in legacy code when, often in response to the adoption of a secure coding initiative, while replacing uses of strcpy with strncpy, the engineer either makes a mistake, or doesn't have a good enough understanding of how the function works, or does only the bare minimum to satisfy the requirement to avoid using strcpy without actually improving anything. 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of the functions to copy an array to a destination of an unknown size that specify the size of the array as the bound. Given the pervasive [mis]use of strncpy to bound the copy to the size of the destination, instances like this suggest a bug: a possible buffer overflow due to an excessive bound (see, for example, CWE-806). In cases when the call is safe, it's equivalent to the corresponding call to memcpy which is clearer and can be more efficient. Martin PS By coincidence rather than by intent, the strncat warnings are a superset of Clang's -Wstrncat-size. AFAICS, Clang only warns when the destination is an array of known size and doesn't have a corresponding warning for strncpy. [*] Here's some background into these misuses. The purpose of the historical strncpy function introduced in V7 UNIX was to completely fill an array of chars with data, either by copying an initial portion of a source string, or by clearing it. I.e., its purpose wasn't to create NUL-terminated strings. An example of its use was to fill the directory entry d_name array (dirent::d_name) with the name of a file. The original purpose of the strncat function, on the other hand, was to append a not necessarily NUL-terminated array of chars to a string to form a NUL-terminated concatenation of the two. An example use case is appending a directory entry (struct dirent::d_name) that need not be NUL-terminated, to form a pathname which does. Largely due to a misunderstanding of the functions' purpose they have become commonly used (and misused) to make "safe," bounded string copies by safeguarding against accidentally overflowing the destination. This has led to great many bugs and security vulnerabilities. PR c/81117 - Improve buffer overflow checking in strncpy gcc/ChangeLog: PR c/81117 * builtins.c (compute_objsize): Handle arrays that compute_builtin_object_size likes to fail for. (check_strncpy_sizes): New function. (expand_builtin_strncpy): Call check_strncpy_sizes. * doc/invoke.texi (-Wsizeof-pointer-memaccess): Document enhancement. (-Wstringop-truncation): Document new option. * gimple-fold.c (gimple_fold_builtin_strncpy): Implement -Wstringop-truncation. (gimple_fold_builtin_strncat): Same. * gimple.c (gimple_build_call_from_tree): Set call location. * tree-ssa-strlen.c (strlen_to_stridx): New global variable. (is_strlen_related_p): New function. (check_builtin_strxncpy, check_builtin_strncat): Same. handle_builtin_strlen): Use strlen_to_stridx. (strlen_optimize_stmt): Handle flavors of strncat, strncpy, and stpncpy. Use strlen_to_stridx. (pass_strlen::execute): Release strlen_to_stridx. gcc/ada/ChangeLog: PR c/81117 * adadecode.c (__gnat_decode): Replace pointless strncpy with memcpy. * argv.c (__gnat_fill_arg): Same. gcc/c-family/ChangeLog: PR c/81117 * c-common.c (resort_sorted_fields): Replace pointless strncpy with memcpy. * c-warn.c (sizeof_pointer_memaccess_warning): Handle arrays. * c.opt (-Wstriingop-truncation): New option. gcc/fortran/ChangeLog: PR c/81117 * decl.c (build_sym): Replace pointless strncpy with strcpy. gcc/objc/ChangeLog: PR c/81117 * objc-encoding.c (encode_type): Replace pointless strncpy with memcpy. gcc/testsuite/ChangeLog: PR c/81117 * c-c++-common/Wsizeof-pointer-memaccess3.c: New test. * c-c++-common/Wstringop-overflow.c: New test. * c-c++-common/Wstringop-truncation.c: New test. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust. * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same. * gcc.dg/Walloca-1.c: Disable macro tracking. diff --git a/gcc/ada/adadecode.c b/gcc/ada/adadecode.c index 8c9c7ab..0cbef81 100644 --- a/gcc/ada/adadecode.c +++ b/gcc/ada/adadecode.c @@ -330,7 +330,7 @@ __gnat_decode (const char *coded_name, char *ada_name, int verbose) } /* Write symbol in the space.