https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691235.html

Ping please? Thanks!

-Lewis

On Wed, Jul 30, 2025 at 7:21 PM Lewis Hyatt <lhy...@gmail.com> wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
>
> The PR was originally about an ICE in the linemap code that was patched up
> for GCC 14. I left it open as a reminder to finish improving the locations
> assigned to macros that are defined prior to including a PCH. This patch
> provides that improvement... does it look OK please? Bootstrap + regtest all
> languages on x86-64 and aarch64 (Linux). Thanks!
>
> -Lewis
>
> -- >8 --
>
> It is permissible to define macros prior to including a PCH, as long as
> these definitions are disjoint from or identical to the macros in the
> PCH. The PCH loading process replaces all libcpp data structures with those
> from the PCH, so it is necessary to remember the extra macros separately and
> then restore them after loading the PCH, which all is handled by
> cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
> process consists of pushing a buffer containing the macro definition and
> then lexing it from there, similar to how a command-line -D option is
> processed. The current implementation does not attempt to set up the
> line_map for this process, and so the locations assigned to the macros are
> often not meaningful. (Similar to what happened in the past with lexing the
> tokens out of a _Pragma string, lexing out of a buffer rather than a file
> produces "sorta" reasonable locations that are often close enough, but not
> reliably correct.)
>
> Fix that up by remembering enough additional information (more or less, an
> expanded_location for each macro definition) to produce a reasonable
> location for the newly restored macros.
>
> One issue that came up is the treatment of command-line-defined macros. From
> the perspective of the generic line_map data structures, the command-line
> location is not distinguishable from other locations; it's just an ordinary
> location created by the front ends with a fake file name by convention. (At
> the moment, it is always the string `<command-line>', subject to
> translation.)  Since libcpp needs to assign macros to that location, it
> needs to know what location to use, so I added a new member
> line_maps::cmdline_location for the front ends to set, similar to how
> line_maps::builtin_location is handled.
>
> This revealed a small issue, in c-opts.cc we have:
>
>     /* All command line defines must have the same location.  */
>       cpp_force_token_locations (parse_in, line_table->highest_line);
>
> But contrary to the comment, all command line defines don't actually end up
> with the same location anymore. This is because libcpp/lex.cc has been
> expanded (r6-4873) to include range information on the returned
> locations. That logic has never been respecting the request of
> cpp_force_token_locations. I believe this was not intentional, and so I have
> corrected that here. Prior to this patch, the range logic has been leading
> to command-line macros all having similar locations in the same line map (or
> ad-hoc locations based from there for sufficiently long tokens); with this
> change, they all have exactly the same location and that location is
> recorded in line_maps::cmdline_location.
>
> With that change, then it works fine for pch.cc to restore macros whether
> they came from the command-line or from the main file.
>
> gcc/c-family/ChangeLog:
>
>         PR preprocessor/105608
>         * c-opts.cc (c_finish_options): Set new member
>         line_table->cmdline_location.
>         * c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
>         libcpp pch.cc; it is now possible that the linemap is in a different
>         file after returning from cpp_read_state().
>
> libcpp/ChangeLog:
>
>         PR preprocessor/105608
>         * include/line-map.h: Add new member CMDLINE_LOCATION.
>         * lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
>         the token location to include range information if token location
>         override was requested.
>         (warn_about_normalization): Likewise.
>         (_cpp_lex_direct): Likewise.
>         * pch.cc (struct saved_macro): New local struct.
>         (struct save_macro_data): Change DEFNS vector to hold saved_macro
>         rather than uchar*.
>         (save_macros): Adapt to remember the location information for each
>         saved macro in addition to the definition.
>         (cpp_prepare_state): Likewise.
>         (cpp_read_state): Use the saved location information to generate
>         proper locations for the restored macros.
>
> gcc/testsuite/ChangeLog:
>
>         PR preprocessor/105608
>         * g++.dg/pch/line-map-3.C: Remove xfails.
>         * g++.dg/pch/line-map-4.C: New test.
>         * g++.dg/pch/line-map-4.Hs: New test.
> ---
>  libcpp/include/line-map.h              |   4 +
>  libcpp/lex.cc                          |   6 +-
>  libcpp/pch.cc                          | 125 ++++++++++++++++++++-----
>  gcc/c-family/c-opts.cc                 |   1 +
>  gcc/c-family/c-pch.cc                  |  12 +--
>  gcc/testsuite/g++.dg/pch/line-map-3.C  |  26 +----
>  gcc/testsuite/g++.dg/pch/line-map-4.C  |   5 +
>  gcc/testsuite/g++.dg/pch/line-map-4.Hs |   1 +
>  8 files changed, 130 insertions(+), 50 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.Hs
>
> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 21a59af2236..b4560cf3737 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -878,6 +878,10 @@ public:
>       built-in tokens.  */
>    location_t builtin_location;
>
> +  /* The special location value to be used for tokens originating on the
> +     command line.  */
> +  location_t cmdline_location;
> +
>    /* The default value of range_bits in ordinary line maps.  */
>    unsigned int default_range_bits;
>
> diff --git a/libcpp/lex.cc b/libcpp/lex.cc
> index e7705a64f39..2474d028cf8 100644
> --- a/libcpp/lex.cc
> +++ b/libcpp/lex.cc
> @@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader 
> *pfile,
>                                          const unsigned char *const start,
>                                          size_t num_bytes)
>  {
> +  if (pfile->forced_token_location)
> +    return pfile->forced_token_location;
>    gcc_checking_assert (num_bytes > 0);
>
>    /* CPP_BUF_COLUMN and linemap_position_for_column both refer
> @@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile,
>        /* If possible, create a location range for the token.  */
>        if (loc >= RESERVED_LOCATION_COUNT
>           && token->type != CPP_EOF
> +         && !pfile->forced_token_location
>           /* There must be no line notes to process.  */
>           && (!(pfile->buffer->cur
>                 >= pfile->buffer->notes[pfile->buffer->cur_note].pos
> @@ -4401,7 +4404,8 @@ _cpp_lex_direct (cpp_reader *pfile)
>
>    /* Potentially convert the location of the token to a range.  */
>    if (result->src_loc >= RESERVED_LOCATION_COUNT
> -      && result->type != CPP_EOF)
> +      && result->type != CPP_EOF
> +      && !pfile->forced_token_location)
>      {
>        /* Ensure that any line notes are processed, so that we have the
>          correct physical line/column for the end-point of the token even
> diff --git a/libcpp/pch.cc b/libcpp/pch.cc
> index fa5efc9f102..495fe7f434b 100644
> --- a/libcpp/pch.cc
> +++ b/libcpp/pch.cc
> @@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd)
>
>  /* Save all the existing macros.  */
>
> +namespace {
> +
> +struct saved_macro
> +{
> +  uchar *defn;
> +  location_t loc;
> +  expanded_location xloc;
> +};
> +
> +} /* unnamed namespace */
> +
>  struct save_macro_data
>  {
> -  uchar **defns;
> +  saved_macro *defns;
>    size_t count;
>    size_t array_size;
>    char **saved_pragmas;
> @@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void 
> *data_p)
>        if (data->count == data->array_size)
>         {
>           data->array_size *= 2;
> -         data->defns = XRESIZEVEC (uchar *, data->defns, (data->array_size));
> +         data->defns = XRESIZEVEC (saved_macro, data->defns, 
> data->array_size);
>         }
>
>        const uchar * defn = cpp_macro_definition (r, h);
>        size_t defnlen = ustrlen (defn);
>
> -      data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen + 
> 2);
> -      data->defns[data->count][defnlen] = '\n';
> +      const auto d = data->defns + data->count;
> +      d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2);
> +      d->defn[defnlen] = '\n';
> +      d->loc = h->value.macro->line;
> +      d->xloc = {};
> +      if (d->loc == r->line_table->cmdline_location)
> +       {
> +         /* The cmdline_location may be different when it comes time to
> +            restore the macros, so use 0 to indicate this.  */
> +         d->loc = 0;
> +       }
> +      else if (d->loc != r->line_table->builtin_location)
> +       {
> +         const auto map = linemap_lookup (r->line_table, d->loc);
> +         gcc_assert (map);
> +         d->xloc = linemap_expand_location (r->line_table, map, d->loc);
> +       }
>        data->count++;
>      }
>
> @@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct save_macro_data 
> **data)
>    struct save_macro_data *d = XNEW (struct save_macro_data);
>
>    d->array_size = 512;
> -  d->defns = XNEWVEC (uchar *, d->array_size);
> +  d->defns = XNEWVEC (saved_macro, d->array_size);
>    d->count = 0;
>    cpp_forall_identifiers (r, save_macros, d);
> +
> +  /* Sort the saved macros in order, so the line map gets built up
> +     in chronological order as expected when we load them back in.  */
> +  static line_maps *line_table;
> +  line_table = r->line_table;
> +  qsort (d->defns, d->count, sizeof (saved_macro),
> +        [] (const void *x1, const void *x2)
> +        {
> +          const auto d1 = static_cast<const saved_macro *> (x1);
> +          const auto d2 = static_cast<const saved_macro *> (x2);
> +          return -linemap_compare_locations (line_table, d1->loc, d2->loc);
> +        });
> +
>    d->saved_pragmas = _cpp_save_pragma_names (r);
>    *data = d;
>  }
> @@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE 
> *f,
>    r->state.prevent_expansion = 1;
>    r->state.angled_headers = 0;
>
> +  const auto old_directive_line = r->directive_line;
> +
>    /* Run through the carefully-saved macros, insert them.  */
> +  const char *cur_fname = nullptr;
>    for (i = 0; i < data->count; i++)
>      {
> -      cpp_hashnode *h;
> -      size_t namelen;
> -      uchar *defn;
> -
> -      namelen = ustrcspn (data->defns[i], "( \n");
> -      h = cpp_lookup (r, data->defns[i], namelen);
> -      defn = data->defns[i] + namelen;
> +      const auto d = data->defns + i;
> +      const size_t namelen = ustrcspn (d->defn, "( \n");
> +      cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen);
> +      const auto defn = d->defn + namelen;
>
>        /* The PCH file is valid, so we know that if there is a definition
>          from the PCH file it must be the same as the one we had
> @@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE 
> *f,
>           if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true)
>               != NULL)
>             {
> -             _cpp_clean_line (r);
> +             if (d->loc == r->line_table->builtin_location)
> +               r->directive_line = r->line_table->builtin_location;
> +             else if (d->loc == 0)
> +               /* This was a command-line-defined macro, preserve that
> +                  aspect now.  */
> +               r->directive_line = r->line_table->cmdline_location;
> +             else
> +               {
> +                 /* In practice, almost all of the locations ought to be in 
> the
> +                    main file, since a PCH cannot be loaded after including
> +                    other headers.  The (probably sole) exception is that
> +                    implicit preincludes can be loaded, so if an implicit
> +                    preinclude such as /usr/include/stdc-predef.h has changed
> +                    since the PCH was created (in a way not invalidating the
> +                    PCH), we can end up here with locations that are not in 
> the
> +                    main file.  In that case, we may be adding them to the
> +                    line_map out of order (the implicit preinclude already
> +                    being represented in the line_map we have loaded from the
> +                    PCH), but it is probably not going to cause any 
> observable
> +                    issue given the constraints on what can appear in a 
> header
> +                    before loading a PCH.  */
> +                 if (!cur_fname || strcmp (cur_fname, d->xloc.file))
> +                   {
> +                     linemap_add (r->line_table, LC_RENAME, d->xloc.sysp,
> +                                  d->xloc.file, d->xloc.line);
> +                     cur_fname = d->xloc.file;
> +                   }
> +                 r->directive_line = linemap_line_start (r->line_table,
> +                                                         d->xloc.line, 127);
> +               }
> +
> +             /* Even when the macro was defined in the main file, we have to
> +                use cpp_force_token_locations here, so all tokens in the 
> macro
> +                will have location assigned to the directive line with no
> +                column information.  We no longer know even which line the
> +                tokens appeared on originally now, much less the column, 
> since
> +                we only stored the macro definition string after lexing it.  
> */
> +             cpp_force_token_locations (r, r->directive_line);
>
> -             /* ??? Using r->line_table->highest_line is not ideal here, but 
> we
> -                do need to use some location that is relative to the new line
> -                map just loaded, not the old one that was in effect when 
> these
> -                macros were lexed.  The proper fix is to remember the file 
> name
> -                and line number where each macro was defined, and then add
> -                these locations into the new line map.  See PR105608.  */
> -             if (!_cpp_create_definition (r, h, r->line_table->highest_line))
> +             _cpp_clean_line (r);
> +             if (!_cpp_create_definition (r, h, r->directive_line))
>                 abort ();
> +             cpp_stop_forcing_token_locations (r);
>               _cpp_pop_buffer (r);
> +
> +             if (d->loc < RESERVED_LOCATION_COUNT)
> +               /* Macros defined on the command line or as builtins are 
> always
> +                  implicitly used.  It is necessary to note this here or else
> +                  -Wunused-macros will complain, because these locations do
> +                  return TRUE for MAIN_FILE_P().  For a command line
> +                  definition, d->loc was set to 0 when we saved it, so it 
> does
> +                  satisfy the check performed here.  */
> +               h->value.macro->used = true;
>             }
>           else
>             abort ();
>         }
>
> -      free (data->defns[i]);
> +      free (data->defns[i].defn);
>      }
>    r->state = old_state;
> +  r->directive_line = old_directive_line;
>
>    _cpp_restore_pragma_names (r, data->saved_pragmas);
>
> +  XDELETEVEC (data->defns);
>    free (data);
>
>    if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : 
> NULL)
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index c652e82a8c7..f8011071103 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -1681,6 +1681,7 @@ c_finish_options (void)
>        bool cxx_assert_seen_p = false;
>
>        /* All command line defines must have the same location.  */
> +      line_table->cmdline_location = line_table->highest_line;
>        cpp_force_token_locations (parse_in, line_table->highest_line);
>        for (size_t i = 0; i < deferred_count; i++)
>         {
> diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
> index b8f075e29f7..8f0fea82cad 100644
> --- a/gcc/c-family/c-pch.cc
> +++ b/gcc/c-family/c-pch.cc
> @@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
>    rebuild_location_adhoc_htab (line_table);
>    line_table->trace_includes = saved_trace_includes;
>
> -  /* Set the current location to the line containing the #include (or the
> -     #pragma GCC pch_preprocess) for the purpose of assigning locations to 
> any
> -     macros that are about to be restored.  */
> -  linemap_add (line_table, LC_ENTER, 0, saved_loc.file,
> -              saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line);
> +  /* Set the line_map current location to the start of the file, so that 
> things
> +     remain in order after cpp_read_state() re-adds any macros that were 
> defined
> +     prior to calling gt_pch_restore().  */
> +  linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0);
>
>    timevar_push (TV_PCH_CPP_RESTORE);
>    cpp_result = cpp_read_state (pfile, name, f, smd);
>
>    /* Set the current location to the line following the #include, where we
>       were prior to processing the PCH.  */
> -  linemap_line_start (line_table, saved_loc.line, 0);
> +  linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file,
> +              saved_loc.line);
>
>    timevar_pop (TV_PCH_CPP_RESTORE);
>
> diff --git a/gcc/testsuite/g++.dg/pch/line-map-3.C 
> b/gcc/testsuite/g++.dg/pch/line-map-3.C
> index 3390d7adba2..9def08de974 100644
> --- a/gcc/testsuite/g++.dg/pch/line-map-3.C
> +++ b/gcc/testsuite/g++.dg/pch/line-map-3.C
> @@ -1,23 +1,5 @@
> -#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */
> -#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } } 
> */
> -
> -/* { dg-do compile } */
> -/* { dg-additional-options "-Werror=unused-macros" } */
> -
>  /* PR preprocessor/105608 */
> -/* This test case is currently xfailed and requires work in libcpp/pch.cc to
> -   resolve.  Currently, the macro location is incorrectly assigned to line 2
> -   of the header file when read via PCH, because libcpp doesn't try to
> -   assign locations relative to the newly loaded line map after restoring
> -   the PCH.  */
> -
> -/* In PCH mode we also complain incorrectly about the command line macro 
> -Dwith_PCH
> -   added by dejagnu; that warning would get suppressed if the macro location 
> were
> -   correctly restored by libcpp to reflect that it was a command line macro. 
>  */
> -/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */
> -
> -/* The reason we used -Werror was to prevent pch.exp from rerunning without 
> PCH;
> -   in that case we would get unnecessary XPASS outputs since the test does 
> work
> -   fine without PCH.  Once the bug is fixed, remove the -Werror and switch to
> -   dg-warning.  */
> -/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wunused-macros" } */
> +#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */
> +#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */
> diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.C 
> b/gcc/testsuite/g++.dg/pch/line-map-4.C
> new file mode 100644
> index 00000000000..cc13a467ca1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/line-map-4.C
> @@ -0,0 +1,5 @@
> +/* PR preprocessor/105608 */
> +/* { dg-do compile } */
> +#define INT int /* { dg-error "-:declaration does not declare anything" } */
> +#include "line-map-4.H"
> +INT; /* { dg-note "in expansion of macro" } */
> diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.Hs 
> b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
> new file mode 100644
> index 00000000000..3b6178bfae0
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
> @@ -0,0 +1 @@
> +/* This space intentionally left blank.  */

Reply via email to