Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage

David Malcolm Mon, 08 Jun 2015 11:19:11 -0700

On Sun, 2015-06-07 at 16:06 -0400, Patrick Palka wrote:
> This patch refactors the entry point of -Wmisleading-indentation


Thanks for working on this.  I was hoping to submit a patch to propose
putting -Wmisleading-indentation into -Wall, and have been testing the
warning on the linux kernel to try to shake out false positives; hence
I'm somewhat nervous about this change, though it mostly seems
reasonable.

I think there are three categories of message that
-Wmisleading-indentation can emit:

  (A) bogus messages, where the message is untrue.  An example of this
was PR c/66220 (I found an example of this when running it on the linux
kernel); this is fixed in trunk.

  (B) a message where the code is doing what the author/reviewer
intended, but the indentation misleadingly suggests a different meaning
(or perhaps the author is being underhanded c.f.
http://www.underhanded-c.org/ ).

I think it would be reasonable to warn about (B) in -Wall: invoke.texi
describes -Wall as

   "...enables all the warnings about constructions that some users
consider questionable, and that are easy to avoid (or modify to prevent
the warning), even in conjunction with macros" 

hence I believe (B) falls into this category.  Each instance is a
"gotcha" lurking in the code for the next maintainer.

  (C) a message where the code is doing something differently to what
the author intended, and the human has been fooled by the bogus
indentation.  Clearly you'd want to know about (C), but the compiler has
no way of distinguishing it from (B).

[FWIW, in my testing on the linux kernel, I ran into:

  * 1 instance of (A) (PR c/66220, now fixed in trunk), and

  * 8 that are each either (B) or (C), and I don't have the domain
knowledge of the kernel to figure out which is which, so I've started
reporting them to the upstream linux community; see e.g.:
    * https://bugzilla.kernel.org/show_bug.cgi?id=98231
    * https://bugzilla.kernel.org/show_bug.cgi?id=98241
    * https://bugzilla.kernel.org/show_bug.cgi?id=98251
    * https://bugzilla.kernel.org/show_bug.cgi?id=98261
    * https://bugzilla.kernel.org/show_bug.cgi?id=98271
    * https://bugzilla.kernel.org/show_bug.cgi?id=98281

  * 2 messages that are confirmed as (B), e.g.
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=e1395a321eab1a7833d82e952eb8255e0a1f03cb

]

I believe that gcc trunk is currently bootstrappable with
-Wmisleading-indentation added to -Wall; is this still the case after
your patch?  Also, would it is it fair to ask for you to test your
patched compiler on some body of code that isn't gcc itself?

More comments inline below, throughout.

> from:
> 
>   void
>   warn_for_misleading_indentation (location_t guard_loc,
>                                    location_t body_loc,
>                                    location_t next_stmt_loc,
>                                    enum cpp_ttype next_tok_type,
>                                    const char *guard_kind);
> 
> to
> 
>   struct common_token_info
>   {
>     location_t location;
>     cpp_ttype type;
>     rid keyword;
>   };

Bikeshedding a bit, but would, say, "token_indent_info" or
"token_location_info" be better?  (I'm not sure)  What do you mean by
"common" in the title?

>   void
>   warn_for_misleading_indentation (const common_token_info &guard_tinfo,
>                                    const common_token_info &body_tinfo,
>                                    const common_token_info &next_tinfo);

[Do we allow C++ reference syntax?   I'm OK with it, and some of the
more "C++y" parts of codebase use it (templates), but I think Jeff
objected last time I tried to submit a patch with it :) ]

> The purpose of this refactoring is to expose more information to the
> -Wmisleading-indentation implementation to allow for more advanced
> heuristics and for better coverage.


> This change of API of course required changes to the C and C++ frontend.
> Amidst these minor changes I also made sure in both frontends that
> warn_for_misleading_indentation() gets called even when the body of the
> guard in question is a semicolon.  This allows us to implement warnings
> about dubious semicolons.  Also I moved the keyword != RID_ELSE checks
> guarding the call to warn_for_misleading_indentation() to the function
> itself itself.
> 
> At the same time this patch extends the coverage of the
> -Wmisleading-indentation implementation to catch misleading indentation
> involving the semicolon.  (These changes are all contained in
> c-indentation.c.)  For example, now we warn on the following code
> samples:
> 
>   if (flag);
>     foo ();
> 
>   while (flag);
>   {
>     ...
>   }
> 
>   if (flag); {
>     ...
>   }
> 
>   if (flag)
>     ; /* blah */ {
>       ...
>     }
> 
>   if (flag); foo ();
> 
> while avoiding to warn on code that is poorly indented but not
> misleadingly so;
> 
>   while (flag);
>   foo ();
> 
>   while (flag)
>     ;
>     foo ();
> 
>   if (flag1)
>     ;
>   if (flag)
>     ;
>   else
>    ...

(nods; I want the warning to warn about *misleading* indentation, rather
than just bad indentation, since the former is reasonable to report
about; the latter has more risk of becoming tedious and annoying users).

> Finally this patch reverts the fix to PR 66620 because it regresses the
> following test cases:
> 
>   if (foo)
>     bar ();
>   else if (baz)
>     bar ();
>     bar (); // No warning, because guard_vis_column > body_vis_column

The patch doesn't appear to add a testcase specifically for the above;
the closest I could see was:

> +  if (flagA)
> +    ;
> +  else if (flagA) /* { dg-message "8: ...this 'if' clause" } */
> +    foo (0);
> +    foo (0); /* { dg-warning "statement is indented as if" } */

(unless I missed it; sorry if that's the case).  Given that the
heuristics are already non-trivial, could the patch please include a
testcase specifically for each of the examples you cite in this email?
I'd rather have duplicate testing than risk missing some cases (it's one
file, so it shouldn't slow down "make check").

> The better fix for this PR IMO is to bail out early when the the body is
> a braced compound statement.  This is what the patch does.  In general,
> when the body of the guard in question is braced then control flow is
> visually explicit despite any potentially sloppy indentation.  This
> restriction could always be relaxed however.

This seems reasonable to me.

> Bootstrap and regtest on x86_64-linux in progress, OK to commit if succeeds?
> 
> gcc/c-family/ChangeLog:
> 
>       * c-common.h (struct common_token_info): Define.
>       (extract_token_info): Define.
>       (warn_for_misleading_information): Change declaration to take
>       three common_token_infos.
>       * c-identation.c (should_warn_for_misleading_indentation):
>       Likewise.  Bail out early if the body is a compound statement or
>       if the next token is the else keyword.  Float out the definition
>       of guard_exploc.  Don't warn on code containing a semicolon body
>       on its own line.  Remove fix for PR 66220.  Add heuristics for
>       when a semicolon body is on the same line as the guard.
>       (guard_tinfo_to_string): Define.
>       (warn_for_misleading_indentation): Change declaration to take
>       three common_token_infos.  Adjust accordingly.
> 
> gcc/c/ChangeLog:
> 
>       * c-parser.c (c_parser_if_body): Call
>       warn_for_misleading_indentation even when the body is a
>       semicolon.  Extract common_token_infos corresponding to the
>       guard, body and next tokens.  Adjust call to
>       warn_for_misleading_indentation accordingly.
>       (c_parser_else_body): Likewise.
>       (c_parser_if_statement): Likewise.
>       (c_parser_while_statement): Likewise.
>       (c_parser_for_statement): Likewise.
> 
> gcc/cp/ChangeLog:
> 
>       * parser.c (cp_parser_selection_statement): Move handling of
>       semicolon body to ...
>       (cp_parser_implicitly_scoped_statement): .. here.  Call
>       warn_for_misleading_indentation even when the body is a
>       semicolon.  Extract common_token_infos corresponding to the
>       guard, body and next tokens.  Adjust call to
>       warn_for_misleading_indentation accordingly.
>       (cp_parser_already_scoped_statement): Likewise.
>       (cp_parser_selection_statement, cp_parser_iteration_statement):
>       Extract a common_token_info corresponding to the guard token.
> 
> gcc/testsuite/ChangeLog:
> 
>       * c-c++-common/Wmisleading-indentation.c: Augment test.
> ---
>  gcc/c-family/c-common.h                            |  34 ++++-
>  gcc/c-family/c-indentation.c                       | 144 
> ++++++++++++++++-----
>  gcc/c/c-parser.c                                   |  82 ++++++------
>  gcc/cp/parser.c                                    |  99 ++++++--------
>  .../c-c++-common/Wmisleading-indentation.c         |  99 ++++++++++++++
>  5 files changed, 325 insertions(+), 133 deletions(-)
> 
> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index af86e26..f152de6 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1435,12 +1435,36 @@ extern bool contains_cilk_spawn_stmt (tree);
>  extern tree cilk_for_number_of_iterations (tree);
>  extern bool check_no_cilk (tree, const char *, const char *,
>                          location_t loc = UNKNOWN_LOCATION);
> +
> +
> +/* A subset of token information that is common between the C and C++ 
> frontends.  */
> +
> +struct common_token_info
> +{
> +  location_t location;
> +  ENUM_BITFIELD (cpp_ttype) type : 8;
> +  ENUM_BITFIELD (rid) keyword : 8;
> +};
> +
> +/* Extract common token information from TOKEN, which ought to either be a
> +   cp_token * or a c_token *.  */
> +
> +template <typename T>
> +inline common_token_info
> +extract_token_info (const T *token)
> +{
> +  common_token_info info;
> +  info.location = token->location;
> +  info.type = token->type;
> +  info.keyword = token->keyword;
> +
> +  return info;
> +}
> +
>  /* In c-indentation.c.  */
>  extern void
> -warn_for_misleading_indentation (location_t guard_loc,
> -                              location_t body_loc,
> -                              location_t next_stmt_loc,
> -                              enum cpp_ttype next_tok_type,
> -                              const char *guard_kind);
> +warn_for_misleading_indentation (const common_token_info &guard_tinfo,
> +                              const common_token_info &body_tinfo,
> +                              const common_token_info &next_tinfo);
>  
>  #endif /* ! GCC_C_COMMON_H */
> diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
> index 5c78e18..d3e842b 100644
> --- a/gcc/c-family/c-indentation.c
> +++ b/gcc/c-family/c-indentation.c
> @@ -190,11 +190,17 @@ detect_preprocessor_logic (expanded_location 
> body_exploc,
>     description of that function below.  */
>  
>  static bool
> -should_warn_for_misleading_indentation (location_t guard_loc,
> -                                     location_t body_loc,
> -                                     location_t next_stmt_loc,
> -                                     enum cpp_ttype next_tok_type)
> +should_warn_for_misleading_indentation (const common_token_info &guard_tinfo,
> +                                     const common_token_info &body_tinfo,
> +                                     const common_token_info &next_tinfo)
>  {
> +  location_t guard_loc = guard_tinfo.location;
> +  location_t body_loc = body_tinfo.location;
> +  location_t next_stmt_loc = next_tinfo.location;
> +
> +  enum cpp_ttype body_type = body_tinfo.type;
> +  enum cpp_ttype next_tok_type = next_tinfo.type;
> +
>    /* Don't attempt to compare the indentation of BODY_LOC and NEXT_STMT_LOC
>       if either are within macros.  */
>    if (linemap_location_from_macro_expansion_p (line_table, body_loc)
> @@ -220,7 +226,9 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>    if (line_table->seen_line_directive)
>      return false;
>  
> -  if (next_tok_type == CPP_CLOSE_BRACE)
> +  if (body_type == CPP_OPEN_BRACE
> +      || next_tok_type == CPP_CLOSE_BRACE
> +      || next_tinfo.keyword == RID_ELSE)
>      return false;

Perhaps it's hypocritical of me to ask this given that it didn't have
one already, but could the above guard gain a comment describing the
intent?  (ideally with examples of what we reject here?)

>    /* Don't warn here about spurious semicolons.  */
> @@ -229,6 +237,7 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>  
>    expanded_location body_exploc = expand_location (body_loc);
>    expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
> +  expanded_location guard_exploc = expand_location (guard_loc);
>  
>    /* They must be in the same file.  */
>    if (next_stmt_exploc.file != body_exploc.file)
> @@ -246,13 +255,20 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>         if (flag) foo (); bar ();
>                           ^ WARN HERE
>  
> +
> +       if (flag) ; {
> +                   ^ WARN HERE
> +
> +       if (flag)
> +        ; {
> +          ^ WARN HERE
> +
>       Cases where we don't want to issue a warning:
>  
>         various_code (); if (flag) foo (); bar (); more_code ();
>                                            ^ DON'T WARN HERE.  */
>    if (next_stmt_exploc.line == body_exploc.line)
>      {
> -      expanded_location guard_exploc = expand_location (guard_loc);
>        if (guard_exploc.file != body_exploc.file)
>       return true;
>        if (guard_exploc.line < body_exploc.line)
> @@ -294,14 +310,10 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>         bar ();
>         ^ DON'T WARN HERE
>  
> -        if (flag) {
> -          foo ();
> -        } else
> -        {
> -          bar ();
> -        }
> -        baz ();
> -        ^ DON'T WARN HERE

I notice you delete part of the comment here; where is this case now
rejected?  Can that go in a comment please?  (I believe that it's now at
the clause I highlighted above)

> +     if (flag)
> +       ;
> +       foo ();
> +       ^ DON'T WARN HERE
>    */
>    if (next_stmt_exploc.line > body_exploc.line)
>      {
> @@ -317,23 +329,22 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>       return false;
>        if (!get_visual_column (body_exploc, &body_vis_column))
>       return false;
> -      if (next_stmt_vis_column == body_vis_column)
> +
> +      /* Don't emit a warning when the semicolon body is on its own line.  */
> +      if (body_exploc.line > guard_exploc.line
> +       && body_type == CPP_SEMICOLON)
> +     ;
> +       else if (next_stmt_vis_column == body_vis_column)
>       {
>         /* Don't warn if they aren't aligned on the same column
>            as the guard itself (suggesting autogenerated code that
>            doesn't bother indenting at all).  */
> -       expanded_location guard_exploc = expand_location (guard_loc);
>         unsigned int guard_vis_column;
>         if (!get_visual_column (guard_exploc, &guard_vis_column))
>           return false;
>         if (guard_vis_column == body_vis_column)
>           return false;
>  
> -       /* PR 66220: Don't warn if the guarding statement is more
> -          indented than the next/body stmts.  */
> -       if (guard_vis_column > body_vis_column)
> -         return false;
> -
>         /* Don't warn if there is multiline preprocessor logic between
>            the two statements. */
>         if (detect_preprocessor_logic (body_exploc, next_stmt_exploc))
> @@ -342,11 +353,76 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>         /* Otherwise, they are visually aligned: issue a warning.  */
>         return true;
>       }
> +
> +     /* Also issue a warning for code having the form:
> +
> +        if (flag);
> +          foo ();
> +
> +        while (flag);
> +        {
> +          ...
> +        }
> +
> +        for (...);
> +          {
> +            ...
> +          }
> +
> +        if (flag)
> +          ;
> +        else if (flag);
> +          foo ();
> +
> +        where the semicolon at the end of each guard is most likely spurious.
> +
> +        But do not warn on:
> +
> +        for (..);
> +        foo ();
> +
> +        where the next statement is aligned with the guard.
> +     */
> +     if (body_type == CPP_SEMICOLON)
> +       {
> +         if (body_exploc.line == guard_exploc.line)
> +           {
> +             unsigned int guard_vis_column;
> +             if (!get_visual_column (guard_exploc, &guard_vis_column))
> +               return false;
> +             if (next_stmt_vis_column != guard_vis_column
> +                 || (next_tok_type == CPP_OPEN_BRACE
> +                     && next_stmt_vis_column == guard_vis_column))
> +               return true;
> +           }
> +       }
>      }
>  
>    return false;
>  }
>  
> +/* Return the string indentifier corresponding to the guard token.  */

Typo: s/indentifier/identifier/


> +static const char *
> +guard_tinfo_to_string (const common_token_info &guard_tinfo)
> +{
> +  switch (guard_tinfo.keyword)
> +    {
> +    case RID_FOR:
> +      return "for";
> +    case RID_ELSE:
> +      return "else";
> +    case RID_IF:
> +      return "if";
> +    case RID_WHILE:
> +      return "while";
> +    case RID_DO:
> +      return "do";
> +    default:
> +      return "";

Should the default have a "gcc_unreachable ();", or is this indeed
reachable?

> +    }
> +}
> +
>  /* Called by the C/C++ frontends when we have a guarding statement at
>     GUARD_LOC containing a statement at BODY_LOC, where the block wasn't
>     written using braces, like this:
> @@ -374,11 +450,9 @@ should_warn_for_misleading_indentation (location_t 
> guard_loc,
>     GUARD_KIND identifies the kind of clause e.g. "if", "else" etc.  */
>  
>  void
> -warn_for_misleading_indentation (location_t guard_loc,
> -                              location_t body_loc,
> -                              location_t next_stmt_loc,
> -                              enum cpp_ttype next_tok_type,
> -                              const char *guard_kind)
> +warn_for_misleading_indentation (const common_token_info &guard_tinfo,
> +                              const common_token_info &body_tinfo,
> +                              const common_token_info &next_tinfo)
>  {
>    /* Early reject for the case where -Wmisleading-indentation is disabled,
>       to avoid doing work only to have the warning suppressed inside the
> @@ -386,12 +460,14 @@ warn_for_misleading_indentation (location_t guard_loc,
>    if (!warn_misleading_indentation)
>      return;
>  
> -  if (should_warn_for_misleading_indentation (guard_loc,
> -                                           body_loc,
> -                                           next_stmt_loc,
> -                                           next_tok_type))
> -    if (warning_at (next_stmt_loc, OPT_Wmisleading_indentation,
> -                 "statement is indented as if it were guarded by..."))
> -      inform (guard_loc,
> -           "...this %qs clause, but it is not", guard_kind);
> +  if (should_warn_for_misleading_indentation (guard_tinfo,
> +                                           body_tinfo,
> +                                           next_tinfo))
> +    {
> +      if (warning_at (next_tinfo.location, OPT_Wmisleading_indentation,
> +                   "statement is indented as if it were guarded by..."))
> +        inform (guard_tinfo.location,
> +             "...this %qs clause, but it is not",
> +             guard_tinfo_to_string (guard_tinfo));
> +    }
>  }
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index c691e76..7705bd4 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -5193,10 +5193,14 @@ c_parser_c99_block_statement (c_parser *parser)
>     parser->in_if_block.  */
>  
>  static tree
> -c_parser_if_body (c_parser *parser, bool *if_p, location_t if_loc)
> +c_parser_if_body (c_parser *parser, bool *if_p,
> +               const common_token_info &if_tinfo)
>  {
>    tree block = c_begin_compound_stmt (flag_isoc99);
>    location_t body_loc = c_parser_peek_token (parser)->location;
> +  common_token_info body_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +
>    c_parser_all_labels (parser);
>    *if_p = c_parser_next_token_is_keyword (parser, RID_IF);
>    if (c_parser_next_token_is (parser, CPP_SEMICOLON))
> @@ -5211,14 +5215,11 @@ c_parser_if_body (c_parser *parser, bool *if_p, 
> location_t if_loc)
>    else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
>      add_stmt (c_parser_compound_statement (parser));
>    else
> -    {
> -      c_parser_statement_after_labels (parser);
> -      if (!c_parser_next_token_is_keyword (parser, RID_ELSE))
> -     warn_for_misleading_indentation (if_loc, body_loc,
> -                                      c_parser_peek_token (parser)->location,
> -                                      c_parser_peek_token (parser)->type,
> -                                      "if");
> -    }
> +    c_parser_statement_after_labels (parser);
> +
> +  common_token_info next_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +  warn_for_misleading_indentation (if_tinfo, body_tinfo, next_tinfo);
>  
>    return c_end_compound_stmt (body_loc, block, flag_isoc99);
>  }
> @@ -5228,28 +5229,30 @@ c_parser_if_body (c_parser *parser, bool *if_p, 
> location_t if_loc)
>     specially for the sake of -Wempty-body warnings.  */
>  
>  static tree
> -c_parser_else_body (c_parser *parser, location_t else_loc)
> +c_parser_else_body (c_parser *parser, const common_token_info &else_tinfo)
>  {
>    location_t body_loc = c_parser_peek_token (parser)->location;
>    tree block = c_begin_compound_stmt (flag_isoc99);
> +  common_token_info body_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +
>    c_parser_all_labels (parser);
>    if (c_parser_next_token_is (parser, CPP_SEMICOLON))
>      {
>        location_t loc = c_parser_peek_token (parser)->location;
>        warning_at (loc,
>                 OPT_Wempty_body,
> -              "suggest braces around empty body in an %<else%> statement");
> +               "suggest braces around empty body in an %<else%> statement");

Was this purely a whitespace change?  If so, probably leave it for a
different commit.

>        add_stmt (build_empty_stmt (loc));
>        c_parser_consume_token (parser);
>      }
>    else
> -    {
> -      c_parser_statement_after_labels (parser);
> -      warn_for_misleading_indentation (else_loc, body_loc,
> -                                    c_parser_peek_token (parser)->location,
> -                                    c_parser_peek_token (parser)->type,
> -                                    "else");
> -    }
> +    c_parser_statement_after_labels (parser);
> +
> +  common_token_info next_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +  warn_for_misleading_indentation (else_tinfo, body_tinfo, next_tinfo);
> +
>    return c_end_compound_stmt (body_loc, block, flag_isoc99);
>  }
>  
> @@ -5272,7 +5275,8 @@ c_parser_if_statement (c_parser *parser)
>    tree if_stmt;
>  
>    gcc_assert (c_parser_next_token_is_keyword (parser, RID_IF));
> -  location_t if_loc = c_parser_peek_token (parser)->location;
> +  common_token_info if_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
>    c_parser_consume_token (parser);
>    block = c_begin_compound_stmt (flag_isoc99);
>    loc = c_parser_peek_token (parser)->location;
> @@ -5284,13 +5288,14 @@ c_parser_if_statement (c_parser *parser)
>      }
>    in_if_block = parser->in_if_block;
>    parser->in_if_block = true;
> -  first_body = c_parser_if_body (parser, &first_if, if_loc);
> +  first_body = c_parser_if_body (parser, &first_if, if_tinfo);
>    parser->in_if_block = in_if_block;
>    if (c_parser_next_token_is_keyword (parser, RID_ELSE))
>      {
> -      location_t else_loc = c_parser_peek_token (parser)->location;
> +      common_token_info else_tinfo
> +     = extract_token_info (c_parser_peek_token (parser));
>        c_parser_consume_token (parser);
> -      second_body = c_parser_else_body (parser, else_loc);
> +      second_body = c_parser_else_body (parser, else_tinfo);
>      }
>    else
>      second_body = NULL_TREE;
> @@ -5370,7 +5375,8 @@ c_parser_while_statement (c_parser *parser, bool ivdep)
>    tree block, cond, body, save_break, save_cont;
>    location_t loc;
>    gcc_assert (c_parser_next_token_is_keyword (parser, RID_WHILE));
> -  location_t while_loc = c_parser_peek_token (parser)->location;
> +  common_token_info while_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
>    c_parser_consume_token (parser);
>    block = c_begin_compound_stmt (flag_isoc99);
>    loc = c_parser_peek_token (parser)->location;
> @@ -5388,14 +5394,14 @@ c_parser_while_statement (c_parser *parser, bool 
> ivdep)
>    save_cont = c_cont_label;
>    c_cont_label = NULL_TREE;
>  
> -  location_t body_loc = UNKNOWN_LOCATION;
> -  if (c_parser_peek_token (parser)->type != CPP_OPEN_BRACE)
> -    body_loc = c_parser_peek_token (parser)->location;
> +  common_token_info body_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +
>    body = c_parser_c99_block_statement (parser);
> -  warn_for_misleading_indentation (while_loc, body_loc,
> -                                c_parser_peek_token (parser)->location,
> -                                c_parser_peek_token (parser)->type,
> -                                "while");
> +
> +  common_token_info next_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +  warn_for_misleading_indentation (while_tinfo, body_tinfo, next_tinfo);
>  
>    c_finish_loop (loc, cond, NULL, body, c_break_label, c_cont_label, true);
>    add_stmt (c_end_compound_stmt (loc, block, flag_isoc99));
> @@ -5515,6 +5521,8 @@ c_parser_for_statement (c_parser *parser, bool ivdep)
>    location_t for_loc = c_parser_peek_token (parser)->location;
>    bool is_foreach_statement = false;
>    gcc_assert (c_parser_next_token_is_keyword (parser, RID_FOR));
> +  common_token_info for_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
>    c_parser_consume_token (parser);
>    /* Open a compound statement in Objective-C as well, just in case this is
>       as foreach expression.  */
> @@ -5675,14 +5683,14 @@ c_parser_for_statement (c_parser *parser, bool ivdep)
>    save_cont = c_cont_label;
>    c_cont_label = NULL_TREE;
>  
> -  location_t body_loc = UNKNOWN_LOCATION;
> -  if (c_parser_peek_token (parser)->type != CPP_OPEN_BRACE)
> -    body_loc = c_parser_peek_token (parser)->location;
> +  common_token_info body_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +
>    body = c_parser_c99_block_statement (parser);
> -  warn_for_misleading_indentation (for_loc, body_loc,
> -                                c_parser_peek_token (parser)->location,
> -                                c_parser_peek_token (parser)->type,
> -                                "for");
> +
> +  common_token_info next_tinfo
> +    = extract_token_info (c_parser_peek_token (parser));
> +  warn_for_misleading_indentation (for_tinfo, body_tinfo, next_tinfo);
>  
>    if (is_foreach_statement)
>      objc_finish_foreach_loop (loc, object_expression, collection_expression, 
> body, c_break_label, c_cont_label);
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 2ea61c0..a220f3e 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -2062,9 +2062,9 @@ static void cp_parser_declaration_statement
>    (cp_parser *);
>  
>  static tree cp_parser_implicitly_scoped_statement
> -  (cp_parser *, bool *, location_t, const char *);
> +  (cp_parser *, bool *, const common_token_info &);
>  static void cp_parser_already_scoped_statement
> -  (cp_parser *, location_t, const char *);
> +  (cp_parser *, const common_token_info &);
>  
>  /* Declarations [gram.dcl.dcl] */
>  
> @@ -10112,12 +10112,14 @@ cp_parser_selection_statement (cp_parser* parser, 
> bool *if_p)
>  {
>    cp_token *token;
>    enum rid keyword;
> +  common_token_info guard_tinfo;
>  
>    if (if_p != NULL)
>      *if_p = false;
>  
>    /* Peek at the next token.  */
>    token = cp_parser_require (parser, CPP_KEYWORD, RT_SELECT);
> +  guard_tinfo = extract_token_info (token);
>  
>    /* See what kind of keyword it is.  */
>    keyword = token->keyword;
> @@ -10160,19 +10162,8 @@ cp_parser_selection_statement (cp_parser* parser, 
> bool *if_p)
>           /* Parse the then-clause.  */
>           in_statement = parser->in_statement;
>           parser->in_statement |= IN_IF_STMT;
> -         if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
> -           {
> -             location_t loc = cp_lexer_peek_token (parser->lexer)->location;
> -             add_stmt (build_empty_stmt (loc));
> -             cp_lexer_consume_token (parser->lexer);
> -             if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_ELSE))
> -               warning_at (loc, OPT_Wempty_body, "suggest braces around "
> -                           "empty body in an %<if%> statement");
> -             nested_if = false;
> -           }
> -         else
> -           cp_parser_implicitly_scoped_statement (parser, &nested_if,
> -                                                  token->location, "if");
> +         cp_parser_implicitly_scoped_statement (parser, &nested_if,
> +                                                guard_tinfo);
>           parser->in_statement = in_statement;
>  
>           finish_then_clause (statement);
> @@ -10181,24 +10172,14 @@ cp_parser_selection_statement (cp_parser* parser, 
> bool *if_p)
>           if (cp_lexer_next_token_is_keyword (parser->lexer,
>                                               RID_ELSE))
>             {
> +             guard_tinfo
> +               = extract_token_info (cp_lexer_peek_token (parser->lexer));
>               /* Consume the `else' keyword.  */
> -             location_t else_tok_loc
> -               = cp_lexer_consume_token (parser->lexer)->location;
> +             cp_lexer_consume_token (parser->lexer);
>               begin_else_clause (statement);
>               /* Parse the else-clause.  */
> -             if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
> -               {
> -                 location_t loc;
> -                 loc = cp_lexer_peek_token (parser->lexer)->location;
> -                 warning_at (loc,
> -                             OPT_Wempty_body, "suggest braces around "
> -                             "empty body in an %<else%> statement");
> -                 add_stmt (build_empty_stmt (loc));
> -                 cp_lexer_consume_token (parser->lexer);
> -               }
> -             else
> -               cp_parser_implicitly_scoped_statement (parser, NULL,
> -                                                      else_tok_loc, "else");
> +             cp_parser_implicitly_scoped_statement (parser, NULL,
> +                                                    guard_tinfo);
>  
>               finish_else_clause (statement);
>  
> @@ -10239,7 +10220,7 @@ cp_parser_selection_statement (cp_parser* parser, 
> bool *if_p)
>           parser->in_switch_statement_p = true;
>           parser->in_statement |= IN_SWITCH_STMT;
>           cp_parser_implicitly_scoped_statement (parser, NULL,
> -                                                0, "switch");
> +                                                guard_tinfo);
>           parser->in_switch_statement_p = in_switch_statement_p;
>           parser->in_statement = in_statement;
>  
> @@ -10784,17 +10765,17 @@ static tree
>  cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
>  {
>    cp_token *token;
> -  location_t tok_loc;
>    enum rid keyword;
>    tree statement;
>    unsigned char in_statement;
> +  common_token_info guard_tinfo;
>  
>    /* Peek at the next token.  */
>    token = cp_parser_require (parser, CPP_KEYWORD, RT_INTERATION);
>    if (!token)
>      return error_mark_node;
>  
> -  tok_loc = token->location;
> +  guard_tinfo = extract_token_info (token);
>  
>    /* Remember whether or not we are already within an iteration
>       statement.  */
> @@ -10819,7 +10800,7 @@ cp_parser_iteration_statement (cp_parser* parser, 
> bool ivdep)
>       cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
>       /* Parse the dependent statement.  */
>       parser->in_statement = IN_ITERATION_STMT;
> -     cp_parser_already_scoped_statement (parser, tok_loc, "while");
> +     cp_parser_already_scoped_statement (parser, guard_tinfo);
>       parser->in_statement = in_statement;
>       /* We're done with the while-statement.  */
>       finish_while_stmt (statement);
> @@ -10834,7 +10815,7 @@ cp_parser_iteration_statement (cp_parser* parser, 
> bool ivdep)
>       statement = begin_do_stmt ();
>       /* Parse the body of the do-statement.  */
>       parser->in_statement = IN_ITERATION_STMT;
> -     cp_parser_implicitly_scoped_statement (parser, NULL, 0, "do");
> +     cp_parser_implicitly_scoped_statement (parser, NULL, guard_tinfo);
>       parser->in_statement = in_statement;
>       finish_do_body (statement);
>       /* Look for the `while' keyword.  */
> @@ -10864,7 +10845,7 @@ cp_parser_iteration_statement (cp_parser* parser, 
> bool ivdep)
>  
>       /* Parse the body of the for-statement.  */
>       parser->in_statement = IN_ITERATION_STMT;
> -     cp_parser_already_scoped_statement (parser, tok_loc, "for");
> +     cp_parser_already_scoped_statement (parser, guard_tinfo);
>       parser->in_statement = in_statement;
>  
>       /* We're done with the for-statement.  */
> @@ -11134,10 +11115,12 @@ cp_parser_declaration_statement (cp_parser* parser)
>  
>  static tree
>  cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p,
> -                                    location_t guard_loc,
> -                                    const char *guard_kind)
> +                                    const common_token_info &guard_tinfo)
>  {
>    tree statement;
> +  location_t body_loc = cp_lexer_peek_token (parser->lexer)->location;
> +  common_token_info body_tinfo
> +    = extract_token_info (cp_lexer_peek_token (parser->lexer));
>  
>    if (if_p != NULL)
>      *if_p = false;
> @@ -11145,9 +11128,16 @@ cp_parser_implicitly_scoped_statement (cp_parser* 
> parser, bool *if_p,
>    /* Mark if () ; with a special NOP_EXPR.  */
>    if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
>      {
> -      location_t loc = cp_lexer_peek_token (parser->lexer)->location;
>        cp_lexer_consume_token (parser->lexer);
> -      statement = add_stmt (build_empty_stmt (loc));
> +      statement = add_stmt (build_empty_stmt (body_loc));
> +
> +      if (guard_tinfo.keyword == RID_IF
> +       && !cp_lexer_next_token_is_keyword (parser->lexer, RID_ELSE))
> +     warning_at (body_loc, OPT_Wempty_body,
> +                 "suggest braces around empty body in an %<if%> statement");
> +      else if (guard_tinfo.keyword == RID_ELSE)
> +     warning_at (body_loc, OPT_Wempty_body,
> +                 "suggest braces around empty body in an %<else%> 
> statement");
>      }
>    /* if a compound is opened, we simply parse the statement directly.  */
>    else if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
> @@ -11158,20 +11148,15 @@ cp_parser_implicitly_scoped_statement (cp_parser* 
> parser, bool *if_p,
>        /* Create a compound-statement.  */
>        statement = begin_compound_stmt (0);
>        /* Parse the dependent-statement.  */
> -      location_t body_loc = cp_lexer_peek_token (parser->lexer)->location;
>        cp_parser_statement (parser, NULL_TREE, false, if_p);
>        /* Finish the dummy compound-statement.  */
>        finish_compound_stmt (statement);
> -      cp_token *next_tok = cp_lexer_peek_token (parser->lexer);
> -      if (next_tok->keyword != RID_ELSE)
> -        {
> -          location_t next_stmt_loc = next_tok->location;
> -          warn_for_misleading_indentation (guard_loc, body_loc,
> -                                           next_stmt_loc, next_tok->type,
> -                                           guard_kind);
> -        }
>      }
>  
> +  common_token_info next_tinfo
> +    = extract_token_info (cp_lexer_peek_token (parser->lexer));
> +  warn_for_misleading_indentation (guard_tinfo, body_tinfo, next_tinfo);
> +
>    /* Return the statement.  */
>    return statement;
>  }
> @@ -11182,19 +11167,19 @@ cp_parser_implicitly_scoped_statement (cp_parser* 
> parser, bool *if_p,
>     scope.  */
>  
>  static void
> -cp_parser_already_scoped_statement (cp_parser* parser, location_t guard_loc,
> -                                 const char *guard_kind)
> +cp_parser_already_scoped_statement (cp_parser* parser,
> +                                 const common_token_info &guard_tinfo)
>  {
>    /* If the token is a `{', then we must take special action.  */
>    if (cp_lexer_next_token_is_not (parser->lexer, CPP_OPEN_BRACE))
>      {
> -      location_t body_loc = cp_lexer_peek_token (parser->lexer)->location;
> +      common_token_info body_tinfo
> +     = extract_token_info (cp_lexer_peek_token (parser->lexer));
> +
>        cp_parser_statement (parser, NULL_TREE, false, NULL);
> -      cp_token *next_tok = cp_lexer_peek_token (parser->lexer);
> -      location_t next_stmt_loc = next_tok->location;
> -      warn_for_misleading_indentation (guard_loc, body_loc,
> -                                       next_stmt_loc, next_tok->type,
> -                                       guard_kind);
> +      common_token_info next_tinfo
> +     = extract_token_info (cp_lexer_peek_token (parser->lexer));
> +      warn_for_misleading_indentation (guard_tinfo, body_tinfo, next_tinfo);
>      }
>    else
>      {
> diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c 
> b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
> index 443e3dd..e8e04e3 100644
> --- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
> +++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
> @@ -691,3 +691,102 @@ fn_36 (void)
>       }
>       foo(6); /* We shouldn't warn here.  */
>  }
> +
> +void
> +fn_37 (void)
> +{
> +  int i;
> +
> +#define EMPTY
> +#define FOR_EACH(VAR, START, STOP) for (VAR = START; VAR < STOP; VAR++)
> +
> +  while (flagA); /* { dg-message "3: ...this 'while' clause" } */
> +    foo (0); /* { dg-warning "statement is indented as if" } */
> +
> +  if (flagA)
> +    ;
> +  else if (flagA); /* { dg-message "8: ...this 'if' clause" } */
> +    foo (0); /* { dg-warning "statement is indented as if" } */
> +
> +  while (flagA)
> +    ;
> +    foo (0);
> +
> +  if (flagA)
> +    ;
> +  else if (flagA) /* { dg-message "8: ...this 'if' clause" } */
> +    foo (0);
> +    foo (0); /* { dg-warning "statement is indented as if" } */
> +
> +  if (flagA)
> +    ;
> +
> +  if (flagB)
> +    ;
> +
> +  if (flagB)
> +    ;
> +    {
> +      foo (0);
> +    }
> +
> +  if (flagB)
> +    ;
> +  else; foo (0); /* { dg-warning "statement is indented as if" } */
> +
> +  if (flagC); foo (2); /* { dg-warning "statement is indented as if" } */
> +
> +  if (flagA)
> +    ; /* blah */ { /* { dg-warning "statement is indented as if" } */
> +      foo (1);
> +    }
> +
> +  if (flagB) ; /* { dg-message "3: ...this 'if' clause" } */
> +    return; /* { dg-warning "statement is indented as if" } */
> +
> +  if (flagB) EMPTY; /* { dg-message "3: ...this 'if' clause" } */
> +    foo (1); /* { dg-warning "statement is indented as if" } */
> +
> +  while (flagC);
> +  foo (2);
> +
> +  for (i = 0; i < 10; i++); /* { dg-message "3: ...this 'for' clause" } */
> +    foo (2); /* { dg-warning "statement is indented as if" } */
> +
> +  FOR_EACH (i, 0, 10);
> +    foo (2); /* { dg-warning "statement is indented as if" } */
> +
> +  FOR_EACH (i, 0, 10);
> +    { /* { dg-warning "statement is indented as if" } */
> +      foo (3);
> +    }
> +
> +  FOR_EACH (i, 0, 10);
> +  { /* { dg-warning "statement is indented as if" } */
> +    foo (3);
> +  }
> +
> +  if (flagA)
> +    while (flagC++);
> +  else
> +    ;
> +
> +  if (flagA)
> +    if (flagB)
> +      foo (0);
> +    else
> +      foo (0);
> +  else
> +    foo (0);
> +
> +  while (i++); { /* { dg-warning "statement is indented as if" } */
> +    foo (3);
> +  }
> +
> +  if (i++); { /* { dg-warning "statement is indented as if" } */
> +    foo (3);
> +  }
> +
> +#undef EMPTY
> +#undef FOR_EACH
> +}

Would it be good to separate out these into two categories (e.g. as two
functions): misleading indentation vs merely bad indentation, and have a
comment covering the latter?  (I like to have comments in my testcases,
though I may be in the minority here)

Thanks; hope the above is reasonable.
Dave

Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage

Reply via email to