On Wed, 2018-02-07 at 17:26 +0000, Nick Clifton wrote:
> Hi Martin, Hi David,
> 
> OK - attached is a new patch that:
> 
>   * Replaces control characters with their escape equivalents.
>   * Includes a testcase.
> 
> I was not sure what to do about the inconsistencies between the
> behaviour of #warning and #pragma GCC warning, (and the error
> equivalents), so I decided to leave it for now.  Fixing them
> will require mucking about in the C preprocessor library, which
> I did not fancy doing at the moment.
> 
> So, is this enhanced patch OK now ?
> 
> Cheers
>   Nick
> 
> gcc/ChangeLog
> 2018-02-07  Nick Clifton  <ni...@redhat.com>
> 
>       PR 84195
>       * tree.c (warn_deprecated_use): Replace control characters in
>       deprecation messages with escape sequences.
> 
> gcc/testsuite/ChangeLog
> 2018-02-07  Nick Clifton  <ni...@redhat.com>
> 
>       * gcc.c-torture/compile/pr84195.c: New test.

> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c        (revision 257451)
> +++ gcc/tree.c        (working copy)
> @@ -12431,8 +12431,66 @@
>    if (attr)
>      attr = lookup_attribute ("deprecated", attr);
>  
> +  char * new_msg = NULL;
> +  unsigned int new_i = 0;
> +
>    if (attr)
> -    msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> +    {
> +      msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> +
> +      if (msg)
> +     {
> +       /* PR 84195: Replace control characters in the message with their
> +          escaped equivalents.  Allow newlines if -fmessage-length has
> +          been set to a non-zero value.  

I'm not quite sure why we allow newlines in this case, sorry.

> This is done here, rather than
> +          where the attribute is recorded as the message length can
> +          change between these two locations.  */
> +
> +       /* FIXME: Do we need to check to see if msg is longer than 2^32 ?  */
> +       unsigned int i;

Please can you split this out into a subroutine, and please add selftests
for it, so that we can unit-test it directly.

Try grepping for e.g. tree_c_tests and other users of selftest.h to see
the idea.

This would be in addition to the DejaGnu-based test (I prefer a "belt and
braces" approach here).

Suggested unittest coverage:

- the empty string
- a non-empty string in which nothing is escaped
- a non-empty string in which everything is escaped
- all of the cases in the switch
- strings with and without newlines, since you're special-casing that

Ownership of the string seems fiddly, so given that gcc is implemented
in C++98, maybe instead of a subroutine, make it a class, with
responsibility for a string that we may or may not own; something like:

  class escaped_string
  {
  public:
     escaped_string (const char *unescaped);
     ~escaped_string () { if (m_owned) free (m_str); }
     operator const char *() const { return m_str; }
  private:
     char *m_str;
     bool m_owned;
  };

so you can (I hope) simply do:

  ASSERT_STREQ ("foo\\nbar", escaped_string ("foo\nbar"));

and similar for the various cases, without needing an explicit "free"
that can get lost if the function ever gains an early return.

(Or, better, have an escape_string function return an instance of a
class that has responsibility for the "if (ownership) free" dtor, but I
can't think of a good name for such a class right now).

Note that "make selftest-valgrind" will run the selftests under
valgrind (sadly, there are a few pre-existing leaks, even if you do
configure gcc with --enable-valgrind-annotations).

> +
> +       for (i = 0; i < strlen (msg); i++)

Just get strlen once, rather than each time through the loop?
"size_t unescaped_len", maybe?

> +         {
> +           char c = msg[i];
> +
> +           if (! ISCNTRL (c))
> +             {
> +               if (new_msg)
> +                 new_msg[new_i++] = c;
> +               continue;
> +             }
> +
> +           if (c != '\n' || ! pp_is_wrapping_line (global_dc->printer))

Could a "bool needs_escaping_p (char)" subroutine clarify the logic
here?

> +             {

Maybe a comment here noting this is the first char needing escaping?

> +               if (new_msg == NULL)
> +                 {
> +                   new_msg = (char *) xmalloc (strlen (msg) * 2);

Off by one, I think - shouldn't this be
  (strlen (msg) * 2) + 1
to allow for every char potentially being escaped, *plus* the NIL-
terminator?

Hence worth having a unittest for the "non-empty and everything is
escaped" case.

(and reuse the one-time strlen from above).

> +                   strncpy (new_msg, msg, i);
> +                   new_i = i;
> +                 }
> +               new_msg [new_i++] = '\\';
> +               switch (c)
> +                 {
> +                 case  7: new_msg[new_i++] = 'a'; break;
> +                 case  8: new_msg[new_i++] = 'b'; break;
> +                 case 27: new_msg[new_i++] = 'e'; break;
> +                 case 12: new_msg[new_i++] = 'f'; break;
> +                 case 10: new_msg[new_i++] = 'n'; break;
> +                 case 13: new_msg[new_i++] = 'r'; break;
> +                 case  9: new_msg[new_i++] = 't'; break;
> +                 case 11: new_msg[new_i++] = 'v'; break;

Can all these case labels be char constants, so e.g.:

    case '\n': new_msg[new_i++] = 'n'; break;

(does that work for every build/host combo?)

> +                 default: new_msg[new_i++] = '?'; break;
> +                 }
> +             }
> +         }
> +
> +       if (new_msg)
> +         {
> +           new_msg[new_i] = 0;
> +           msg = new_msg;
> +         }
> +     }
> +    }


>    else
>      msg = NULL;
>  
> @@ -12505,6 +12563,8 @@
>           }
>       }
>      }
> +
> +  free (new_msg);
>  }
>  
>  /* Return true if REF has a COMPONENT_REF with a bit-field field declaration
> --- /dev/null 2018-02-07 09:26:20.608663241 +0000
> +++ gcc/testsuite/gcc.c-torture/compile/pr84195.c     2018-02-07 
> 17:20:13.494587052 +0000
> @@ -0,0 +1,17 @@
> +/* { dg-options "-Wdeprecated-declarations" } */
> +
> +/* Check that MSG is printed without the escape characters being interpreted.
> +   Especially the newlines.
> +
> +   Note - gcc's behaviour is inconsistent in this regard as #error and
> +   #warning will also display control characters as escape sequences,
> +   whereas #pragma GCC error and #pragma GCC warning will perform the
> +   control operations of the control characters.  */
> +   
> +#define MSG "foo\n\t\rbar"
> +
> +int f (int i __attribute__ ((deprecated (MSG))))
> +{
> +  return 0 ? i : 0; /* { dg-warning "'i' is deprecated: foo.n.t.rbar" } */
> +}

We want the output to be:

  file:line:col: warning: 'i' is deprecated: foo\n\t\rbar [-Wblah-blah]

I think you can use "\\" to signifiy "\" in a DG test, so I *think*
this dg directive can
read:

  return 0 ? i : 0; /* { dg-warning "'i' is deprecated: foo\\n\\t\\rbar" } */


Hope this is constructive
Dave

Reply via email to