On Wed, 27 Jun 2018, Jeff Law wrote:

> 
> On 06/18/2018 09:37 AM, Qing Zhao wrote:
> > Hi,
> > 
> > this is the 3rd (and the last) patch for PR78809:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> > Inline strcmp with small constant strings
> > 
> > The design doc for PR78809 is at:
> > https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> > 
> > this patch is for the third part of change of PR78809.
> > 
> > C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
> >    if the result is NOT used to do simple equality test against zero, one of
> > "s1" or "s2" is a small constant string, n is a constant, and the Min value 
> > of
> > the length of the constant string and "n" is smaller than a predefined
> > threshold T,
> >    inline the call by a byte-to-byte comparision sequence to avoid calling
> > overhead.
> > 
> > adding test case strcmpopt_5.c into gcc.dg for part C of PR78809.
> > 
> > bootstraped and tested on both X86 and Aarch64. no regression.
> > 
> > I have done some experiments to check the run-time performance impact 
> > and code-size impact from such inlining with different threshold for the
> > length of the constant string to inline, the data were recorded in the 
> > attachments of 
> > PR78809:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809.
> > 
> > and finally decide that the default value of the threshold is 3. 
> > this value of the threshold can be adjusted by the new option:
> > 
> > --param builtin-string-cmp-inline-length=
> > 
> > The following is the patch.
> > 
> > thanks.
> > 
> > Qing
> > 
> > gcc/ChangeLog:
> > 
> > +2018-06-18  Qing Zhao  <qing.z...@oracle.com>
> > +
> > +       PR middle-end/78809
> > +       * builtins.c (expand_builtin_memcmp): Inline the calls first
> > +       when result_eq is false.
> > +       (expand_builtin_strcmp): Inline the calls first.
> > +       (expand_builtin_strncmp): Likewise.
> > +       (inline_string_cmp): New routine. Expand a string compare 
> > +       call by using a sequence of char comparison.
> > +       (inline_expand_builtin_string_cmp): New routine. Inline expansion
> > +       a call to str(n)cmp/memcmp.
> > +       * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
> > option.
> > +       * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
> > +
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > +2018-06-18  Qing Zhao  <qing.z...@oracle.com>
> > +
> > +       PR middle-end/78809
> > +       * gcc.dg/strcmpopt_5.c: New test.
> So I still need to dig into this patch.  But I wanted to raise an
> potential issue and get yours and Martin's thoughts on it.
> 
> Martin (and others) have been working hard to improve GCC's ability to
> give good diagnostics for various problems with calls to mem* and str*
> functions (buffer overflows, restrict issues, etc).
> 
> One of the problems Martin has identified is early conversion of these
> calls into inlined direct operations.  If those conversions happen prior
> to the analysis for warnings we obviously can't issue any relevant warnings.

>From the looks of the changelog this seems to be done at RTL expansion 
time -- which also makes me question why targets do not provide
cmpstr[n] / cmpmem optabs when doing the inlining is so obviously
beneficial?

Note I didn't look at the actual patch.

Richard.

> 
> > +
> > 
> > 
> > 
> > 0001-3nd-Patch-for-PR78009.patch
> > 
> > 
> > From fcf6ced44e6e3e4a14894cc8f43ac39bc17aafee Mon Sep 17 00:00:00 2001
> > From: qing zhao <qing.z...@oracle.com>
> > Date: Thu, 14 Jun 2018 14:32:46 -0700
> > Subject: [PATCH] 3nd Patch for PR78009
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> > Inline strcmp with small constant strings
> > 
> > The design doc for PR78809 is at:
> > https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> > 
> > this patch is for the third part of change of PR78809.
> > 
> > C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
> >    if the result is NOT used to do simple equality test against zero, one of
> > "s1" or "s2" is a small constant string, n is a constant, and the Min value 
> > of
> > the length of the constant string and "n" is smaller than a predefined
> > threshold T,
> >    inline the call by a byte-to-byte comparision sequence to avoid calling
> > overhead.
> > 
> > adding test case strcmpopt_5.c into gcc.dg for part C of PR78809.
> > 
> > bootstraped and tested on both X86 and Aarch64. no regression.
> > ---
> >  gcc/builtins.c                     | 172 
> > +++++++++++++++++++++++++++++++++++--
> >  gcc/doc/invoke.texi                |   5 ++
> >  gcc/params.def                     |   4 +
> >  gcc/testsuite/gcc.dg/strcmpopt_5.c |  80 +++++++++++++++++
> >  4 files changed, 253 insertions(+), 8 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_5.c
> > 
> > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > index 6b3e6b2..cf50d05 100644
> > --- a/gcc/builtins.c
> > +++ b/gcc/builtins.c
> > @@ -4358,6 +4360,17 @@ expand_builtin_memcmp (tree exp, rtx target, bool 
> > result_eq)
> >                      POINTER_TYPE, POINTER_TYPE, INTEGER_TYPE, VOID_TYPE))
> >      return NULL_RTX;
> >  
> > +  /* due to the performance benefit, always inline the calls first 
> > +     when result_eq is false.  */
> Please capitalize the first word in sentences like this.  This nit
> appears in most of your comments.
> 
> 
> > +  rtx result = NULL_RTX;
> > +   
> > +  if (!result_eq) 
> > +    {
> > +      result = inline_expand_builtin_string_cmp (exp, target, true);
> > +      if (result)
> > +   return result;
> > +    }
> > +
> So I believe you do inline expansion here prior to the checks for
> Wstringop_overflow.  I think you can safely move this code to after the
> warn_stringop_overflow checks.  Though you may want to make this code
> conditional on both calls to check_access returning true and avoiding
> your transformation if either or both calls return false.
> 
> Alternately you'd need to verify that inline_expand_builtin_string_cmp
> always returns false for cases which are going to generate a warning.
> But that seems a bit tougher to maintain over time if we were to add
> more warnings to this code.
> 
> 
> 
> > @@ -4448,6 +4461,12 @@ expand_builtin_strcmp (tree exp, ATTRIBUTE_UNUSED 
> > rtx target)
> >    if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, VOID_TYPE))
> >      return NULL_RTX;
> >  
> > +  /* due to the performance benefit, always inline the calls first.  */
> > +  rtx result = NULL_RTX;
> > +  result = inline_expand_builtin_string_cmp (exp, target, false);
> > +  if (result)
> > +    return result;
> Similarly.
> 
> 
> 
> > @@ -4562,6 +4580,12 @@ expand_builtin_strncmp (tree exp, ATTRIBUTE_UNUSED 
> > rtx target,
> >                      POINTER_TYPE, POINTER_TYPE, INTEGER_TYPE, VOID_TYPE))
> >      return NULL_RTX;
> >  
> > +  /* due to the performance benefit, always inline the calls first.  */
> > +  rtx result = NULL_RTX;
> > +  result = inline_expand_builtin_string_cmp (exp, target, false);
> > +  if (result)
> > +    return result;
> Similarly here.
> 
> > @@ -6640,6 +6664,138 @@ expand_builtin_goacc_parlevel_id_size (tree exp, 
> > rtx target, int ignore)
> >    return target;
> >  }
> >  
> > +/* Expand a string compare operation using a sequence of char comparison 
> > +   to get rid of the calling overhead.
> > +
> > +   var_str is the variable string source;
> > +   const_str is the constant string source;
> > +   length is the number of chars to compare;
> > +   const_str_n indicates which source string is the constant string;
> > +   is_memcmp indicates whether it's a memcmp or strcmp.
> When referring to arguments in comments, please capitalize them.  ie
> VAR_STR, CONST_STR, etc.
> 
> 
> > +   
> > +   to: (assume const_str_n is 2, i.e., arg2 is a constant string)
> > +
> > +   target = var_str[0] - const_str[0];
> > +   if (target != 0)
> > +     goto ne_label;
> > +     ...
> > +   target = var_str[length - 2] - const_str[length - 2];
> > +   if (target != 0)
> > +     goto ne_label;
> > +   target = var_str[length - 1] - const_str[length - 1];
> > +   ne_label:
> > +  */
> But leave them as lower case in code fragments like this.
> 
> 
> So this generally looks pretty good.  THe biggest technical concern is
> making sure we're doing the right thing WRT issuing warnings.  You can
> tackle that problem by deferring inlining to a later point after
> warnings have been issued or by verifying that your routines do not
> inline in cases where warnings will be issued.  It may be worth adding
> testcases for these issues.
> 
> There's a large number of comments that need capitalization fixes.
> 
> Given there was no measured runtime performance impact, but slight
> improvements on codesize for values <= 3, let's go ahead with that as
> the default.
> 
> Can you address the issues above and repost for final review?
> 
> Thanks,
> jeff
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reply via email to