Re: RFC: Provide diagnostic hints for missing inttypes.h string constants.

2020-05-22 Thread David Malcolm via Gcc-patches
On Fri, 2020-05-22 at 01:30 +0200, Mark Wielaard wrote:
Hi Mark

> This is on top of the stdbool.h and stdint.h patches.

Sorry, I didn't see those patches; I've replied to them now.

> This adds a flag to c_parser so we know when we were trying to
> constract a string literal. If there is a parse error and we were
> constructing a string literal, and the next token is an unknown
> identifier name, and we know there is a standard header that defines
> that name as a string literal, then add a missing header hint to
> the error messsage.

By comparison, what's the existing behavior for the example?
Is it just what you posted below, but without the "note" diagnostics?

> I haven't checked yet if we can do something similar for the C++
> parser. And the testcase needs to be extended a bit. But I hope the
> direction is OK.

I think it's a worthy goal; as noted below I'd want a C frontend
maintainer to approve those parts of it.

> For the following source:
> 
> const char *hex64_fmt =  PRIx64;
> const char *msg_fmt = "Provide %" PRIx64 "\n";
> 
> void foo (uint32_t v)
> {
>   printf ("%" PRIu32, v);
> }
> 
> We will get the following:
> 
> $ /opt/local/gcc/bin/gcc -c t.c
> t.c:4:26: error: ‘PRIx64’ undeclared here (not in a function)
> 4 | const char *hex64_fmt =  PRIx64;
>   |  ^~
> t.c:3:1: note: ‘PRIx64’ is defined in header ‘’; did you forget 
> to ‘#include ’?
> 2 | #include 
>   +++ |+#include 
> 3 |
> t.c:5:35: error: expected ‘,’ or ‘;’ before ‘PRIx64’
> 5 | const char *msg_fmt = "Provide %" PRIx64 "\n";
>   |   ^~
> t.c:5:35: note: ‘PRIx64’ is defined in header ‘’; did you forget 
> to ‘#include ’?

It's a big improvement, but it's still a little clunky.  The
  expected ‘,’ or ‘;’ before ‘PRIx64’
is the sort of thing we've gotten used to with years of GCC messages,
but might make little sense to a neophyte.
I wonder if it's possible to improve this further, but I fear that that
might overcomplicate things (if I understand things correctly, the
string concatenation fails in the tokenizer, and then the PRIx64 fails
in the parser, so we have a bad interaction involving two different
levels of abstraction within the frontend - I think).

> t.c: In function ‘foo’:
> t.c:9:14: error: expected ‘)’ before ‘PRIu32’
> 9 |   printf ("%" PRIu32, v);
>   |  ^~~
>   |  )
> t.c:9:15: note: ‘PRIu32’ is defined in header ‘’; did you forget 
> to ‘#include ’?
> 9 |   printf ("%" PRIu32, v);
>   |   ^~
> ---

[...snip...]
  
> +/* These can be used as string macros or as identifiers. Must all be
> +   string-literals.  Used in get_stdlib_header_for_name and
> +   get_c_stdlib_header_for_string_macro_name.  */
> +static const stdlib_hint c99_cxx11_macro_hints[] = {
> +/*  and .  */
> +{"PRId8", {"", ""} },
> +{"PRId16", {"", ""} },
> +{"PRId32", {"", ""} },
> +{"PRId64", {"", ""} },
> +{"PRIi8", {"", ""} },
> +{"PRIi16", {"", ""} },
> +{"PRIi32", {"", ""} },
> +{"PRIi64", {"", ""} },
> +{"PRIo8", {"", ""} },
> +{"PRIo16", {"", ""} },
> +{"PRIo32", {"", ""} },
> +{"PRIo64", {"", ""} },
> +{"PRIu8", {"", ""} },
> +{"PRIu16", {"", ""} },
> +{"PRIu32", {"", ""} },
> +{"PRIu64", {"", ""} },
> +{"PRIx8", {"", ""} },
> +{"PRIx16", {"", ""} },
> +{"PRIx32", {"", ""} },
> +{"PRIx64", {"", ""} },
> +{"PRIX8", {"", ""} },
> +{"PRIX16", {"", ""} },
> +{"PRIX32", {"", ""} },
> +{"PRIX64", {"", ""} },
> +
> +{"PRIdPTR", {"", ""} },
> +{"PRIiPTR", {"", ""} },
> +{"PRIoPTR", {"", ""} },
> +{"PRIuPTR", {"", ""} },
> +{"PRIxPTR", {"", ""} },
> +{"PRIXPTR", {"", ""} },
> +
> +{"SCNd8", {"", ""} },
> +{"SCNd16", {"", ""} },
> +{"SCNd32", {"", ""} },
> +{"SCNd64", {"", ""} },
> +{"SCNi8", {"", ""} },
> +{"SCNi16", {"", ""} },
> +{"SCNi32", {"", ""} },
> +{"SCNi64", {"", ""} },
> +{"SCNo8", {"", ""} },
> +{"SCNo16", {"", ""} },
> +{"SCNo32", {"", ""} },
> +{"SCNo64", {"", ""} },
> +{"SCNu8", {"", ""} },
> +{"SCNu16", {"", ""} },
> +{"SCNu32", {"", ""} },
> +{"SCNu64", {"", ""} },
> +{"SCNx8", {"", ""} },
> +{"SCNx16", {"", ""} },
> +{"SCNx32", {"", ""} },
> +{"SCNx64", {"", ""} },
> +
> +{"SCNdPTR", {"", ""} },
> +{"SCNiPTR", {"", ""} },
> +{"SCNoPTR", {"", ""} },
> +{"SCNuPTR", {"", ""} },
> +{"SCNxPTR", {"", ""} }
> +};

I found myself squinting at the array trying to decide if every entry
had
   {"", ""}
as its second element.  I *think* that's the case, right? 

I know there's a lot of pre-existing duplication in this file, but
maybe this would be better as simply an array of const char *?
It could probably be reformatted to take up far fewer lines by grouping
them logically.

Maybe have a

  static const char *
  get_c99_cxx11_macro_hint (const char *, enum stdlib 

RFC: Provide diagnostic hints for missing inttypes.h string constants.

2020-05-21 Thread Mark Wielaard
This is on top of the stdbool.h and stdint.h patches.

This adds a flag to c_parser so we know when we were trying to
constract a string literal. If there is a parse error and we were
constructing a string literal, and the next token is an unknown
identifier name, and we know there is a standard header that defines
that name as a string literal, then add a missing header hint to
the error messsage.

I haven't checked yet if we can do something similar for the C++
parser. And the testcase needs to be extended a bit. But I hope the
direction is OK.

For the following source:

const char *hex64_fmt =  PRIx64;
const char *msg_fmt = "Provide %" PRIx64 "\n";

void foo (uint32_t v)
{
  printf ("%" PRIu32, v);
}

We will get the following:

$ /opt/local/gcc/bin/gcc -c t.c
t.c:4:26: error: ‘PRIx64’ undeclared here (not in a function)
4 | const char *hex64_fmt =  PRIx64;
  |  ^~
t.c:3:1: note: ‘PRIx64’ is defined in header ‘’; did you forget to 
‘#include ’?
2 | #include 
  +++ |+#include 
3 |
t.c:5:35: error: expected ‘,’ or ‘;’ before ‘PRIx64’
5 | const char *msg_fmt = "Provide %" PRIx64 "\n";
  |   ^~
t.c:5:35: note: ‘PRIx64’ is defined in header ‘’; did you forget to 
‘#include ’?
t.c: In function ‘foo’:
t.c:9:14: error: expected ‘)’ before ‘PRIu32’
9 |   printf ("%" PRIu32, v);
  |  ^~~
  |  )
t.c:9:15: note: ‘PRIu32’ is defined in header ‘’; did you forget to 
‘#include ’?
9 |   printf ("%" PRIu32, v);
  |   ^~
---
 gcc/c-family/known-headers.cc  | 88 ++
 gcc/c-family/known-headers.h   |  2 +
 gcc/c/c-parser.c   | 29 +++
 gcc/testsuite/gcc.dg/spellcheck-inttypes.c | 32 
 4 files changed, 151 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-inttypes.c

diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
index 1e2bf49c439a..b9772af21863 100644
--- a/gcc/c-family/known-headers.cc
+++ b/gcc/c-family/known-headers.cc
@@ -46,6 +46,71 @@ struct stdlib_hint
   const char *header[NUM_STDLIBS];
 };
 
+/* These can be used as string macros or as identifiers. Must all be
+   string-literals.  Used in get_stdlib_header_for_name and
+   get_c_stdlib_header_for_string_macro_name.  */
+static const stdlib_hint c99_cxx11_macro_hints[] = {
+/*  and .  */
+{"PRId8", {"", ""} },
+{"PRId16", {"", ""} },
+{"PRId32", {"", ""} },
+{"PRId64", {"", ""} },
+{"PRIi8", {"", ""} },
+{"PRIi16", {"", ""} },
+{"PRIi32", {"", ""} },
+{"PRIi64", {"", ""} },
+{"PRIo8", {"", ""} },
+{"PRIo16", {"", ""} },
+{"PRIo32", {"", ""} },
+{"PRIo64", {"", ""} },
+{"PRIu8", {"", ""} },
+{"PRIu16", {"", ""} },
+{"PRIu32", {"", ""} },
+{"PRIu64", {"", ""} },
+{"PRIx8", {"", ""} },
+{"PRIx16", {"", ""} },
+{"PRIx32", {"", ""} },
+{"PRIx64", {"", ""} },
+{"PRIX8", {"", ""} },
+{"PRIX16", {"", ""} },
+{"PRIX32", {"", ""} },
+{"PRIX64", {"", ""} },
+
+{"PRIdPTR", {"", ""} },
+{"PRIiPTR", {"", ""} },
+{"PRIoPTR", {"", ""} },
+{"PRIuPTR", {"", ""} },
+{"PRIxPTR", {"", ""} },
+{"PRIXPTR", {"", ""} },
+
+{"SCNd8", {"", ""} },
+{"SCNd16", {"", ""} },
+{"SCNd32", {"", ""} },
+{"SCNd64", {"", ""} },
+{"SCNi8", {"", ""} },
+{"SCNi16", {"", ""} },
+{"SCNi32", {"", ""} },
+{"SCNi64", {"", ""} },
+{"SCNo8", {"", ""} },
+{"SCNo16", {"", ""} },
+{"SCNo32", {"", ""} },
+{"SCNo64", {"", ""} },
+{"SCNu8", {"", ""} },
+{"SCNu16", {"", ""} },
+{"SCNu32", {"", ""} },
+{"SCNu64", {"", ""} },
+{"SCNx8", {"", ""} },
+{"SCNx16", {"", ""} },
+{"SCNx32", {"", ""} },
+{"SCNx64", {"", ""} },
+
+{"SCNdPTR", {"", ""} },
+{"SCNiPTR", {"", ""} },
+{"SCNoPTR", {"", ""} },
+{"SCNuPTR", {"", ""} },
+{"SCNxPTR", {"", ""} }
+};
+
 /* Given non-NULL NAME, return the header name defining it within either
the standard library (with '<' and '>'), or NULL.
Only handles a subset of the most common names within the stdlibs.  */
@@ -196,6 +261,14 @@ get_stdlib_header_for_name (const char *name, enum stdlib 
lib)
   if (strcmp (name, c99_cxx11_hints[i].name) == 0)
return c99_cxx11_hints[i].header[lib];
 
+  const size_t num_c99_cxx11_macro_hints
+= sizeof (c99_cxx11_macro_hints) / sizeof (c99_cxx11_macro_hints[0]);
+  if ((lib == STDLIB_C && flag_isoc99)
+  || (lib == STDLIB_CPLUSPLUS && cxx_dialect >= cxx11 ))
+for (size_t i = 0; i < num_c99_cxx11_macro_hints; i++)
+  if (strcmp (name, c99_cxx11_macro_hints[i].name) == 0)
+   return c99_cxx11_macro_hints[i].header[lib];
+
   return NULL;
 }
 
@@ -217,6 +290,21 @@ get_cp_stdlib_header_for_name (const char *name)
   return get_stdlib_header_for_name (name, STDLIB_CPLUSPLUS);
 }
 
+/* Given non-NULL NAME, return the header name