Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691235.html

Could I ping this one please? It would be great to close out this PR. Thanks!

-Lewis

On Sat, Aug 16, 2025 at 8:37 AM Lewis Hyatt <[email protected]> wrote:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691235.html
>
> Ping please? Thanks!
>
> -Lewis
>
> On Wed, Jul 30, 2025 at 7:21 PM Lewis Hyatt <[email protected]> wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> >
> > The PR was originally about an ICE in the linemap code that was patched up
> > for GCC 14. I left it open as a reminder to finish improving the locations
> > assigned to macros that are defined prior to including a PCH. This patch
> > provides that improvement... does it look OK please? Bootstrap + regtest all
> > languages on x86-64 and aarch64 (Linux). Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > It is permissible to define macros prior to including a PCH, as long as
> > these definitions are disjoint from or identical to the macros in the
> > PCH. The PCH loading process replaces all libcpp data structures with those
> > from the PCH, so it is necessary to remember the extra macros separately and
> > then restore them after loading the PCH, which all is handled by
> > cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
> > process consists of pushing a buffer containing the macro definition and
> > then lexing it from there, similar to how a command-line -D option is
> > processed. The current implementation does not attempt to set up the
> > line_map for this process, and so the locations assigned to the macros are
> > often not meaningful. (Similar to what happened in the past with lexing the
> > tokens out of a _Pragma string, lexing out of a buffer rather than a file
> > produces "sorta" reasonable locations that are often close enough, but not
> > reliably correct.)
> >
> > Fix that up by remembering enough additional information (more or less, an
> > expanded_location for each macro definition) to produce a reasonable
> > location for the newly restored macros.
> >
> > One issue that came up is the treatment of command-line-defined macros. From
> > the perspective of the generic line_map data structures, the command-line
> > location is not distinguishable from other locations; it's just an ordinary
> > location created by the front ends with a fake file name by convention. (At
> > the moment, it is always the string `<command-line>', subject to
> > translation.)  Since libcpp needs to assign macros to that location, it
> > needs to know what location to use, so I added a new member
> > line_maps::cmdline_location for the front ends to set, similar to how
> > line_maps::builtin_location is handled.
> >
> > This revealed a small issue, in c-opts.cc we have:
> >
> >     /* All command line defines must have the same location.  */
> >       cpp_force_token_locations (parse_in, line_table->highest_line);
> >
> > But contrary to the comment, all command line defines don't actually end up
> > with the same location anymore. This is because libcpp/lex.cc has been
> > expanded (r6-4873) to include range information on the returned
> > locations. That logic has never been respecting the request of
> > cpp_force_token_locations. I believe this was not intentional, and so I have
> > corrected that here. Prior to this patch, the range logic has been leading
> > to command-line macros all having similar locations in the same line map (or
> > ad-hoc locations based from there for sufficiently long tokens); with this
> > change, they all have exactly the same location and that location is
> > recorded in line_maps::cmdline_location.
> >
> > With that change, then it works fine for pch.cc to restore macros whether
> > they came from the command-line or from the main file.
> >
> > gcc/c-family/ChangeLog:
> >
> >         PR preprocessor/105608
> >         * c-opts.cc (c_finish_options): Set new member
> >         line_table->cmdline_location.
> >         * c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
> >         libcpp pch.cc; it is now possible that the linemap is in a different
> >         file after returning from cpp_read_state().
> >
> > libcpp/ChangeLog:
> >
> >         PR preprocessor/105608
> >         * include/line-map.h: Add new member CMDLINE_LOCATION.
> >         * lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
> >         the token location to include range information if token location
> >         override was requested.
> >         (warn_about_normalization): Likewise.
> >         (_cpp_lex_direct): Likewise.
> >         * pch.cc (struct saved_macro): New local struct.
> >         (struct save_macro_data): Change DEFNS vector to hold saved_macro
> >         rather than uchar*.
> >         (save_macros): Adapt to remember the location information for each
> >         saved macro in addition to the definition.
> >         (cpp_prepare_state): Likewise.
> >         (cpp_read_state): Use the saved location information to generate
> >         proper locations for the restored macros.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR preprocessor/105608
> >         * g++.dg/pch/line-map-3.C: Remove xfails.
> >         * g++.dg/pch/line-map-4.C: New test.
> >         * g++.dg/pch/line-map-4.Hs: New test.
> > ---
> >  libcpp/include/line-map.h              |   4 +
> >  libcpp/lex.cc                          |   6 +-
> >  libcpp/pch.cc                          | 125 ++++++++++++++++++++-----
> >  gcc/c-family/c-opts.cc                 |   1 +
> >  gcc/c-family/c-pch.cc                  |  12 +--
> >  gcc/testsuite/g++.dg/pch/line-map-3.C  |  26 +----
> >  gcc/testsuite/g++.dg/pch/line-map-4.C  |   5 +
> >  gcc/testsuite/g++.dg/pch/line-map-4.Hs |   1 +
> >  8 files changed, 130 insertions(+), 50 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.Hs
> >
> > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> > index 21a59af2236..b4560cf3737 100644
> > --- a/libcpp/include/line-map.h
> > +++ b/libcpp/include/line-map.h
> > @@ -878,6 +878,10 @@ public:
> >       built-in tokens.  */
> >    location_t builtin_location;
> >
> > +  /* The special location value to be used for tokens originating on the
> > +     command line.  */
> > +  location_t cmdline_location;
> > +
> >    /* The default value of range_bits in ordinary line maps.  */
> >    unsigned int default_range_bits;
> >
> > diff --git a/libcpp/lex.cc b/libcpp/lex.cc
> > index e7705a64f39..2474d028cf8 100644
> > --- a/libcpp/lex.cc
> > +++ b/libcpp/lex.cc
> > @@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader 
> > *pfile,
> >                                          const unsigned char *const start,
> >                                          size_t num_bytes)
> >  {
> > +  if (pfile->forced_token_location)
> > +    return pfile->forced_token_location;
> >    gcc_checking_assert (num_bytes > 0);
> >
> >    /* CPP_BUF_COLUMN and linemap_position_for_column both refer
> > @@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile,
> >        /* If possible, create a location range for the token.  */
> >        if (loc >= RESERVED_LOCATION_COUNT
> >           && token->type != CPP_EOF
> > +         && !pfile->forced_token_location
> >           /* There must be no line notes to process.  */
> >           && (!(pfile->buffer->cur
> >                 >= pfile->buffer->notes[pfile->buffer->cur_note].pos
> > @@ -4401,7 +4404,8 @@ _cpp_lex_direct (cpp_reader *pfile)
> >
> >    /* Potentially convert the location of the token to a range.  */
> >    if (result->src_loc >= RESERVED_LOCATION_COUNT
> > -      && result->type != CPP_EOF)
> > +      && result->type != CPP_EOF
> > +      && !pfile->forced_token_location)
> >      {
> >        /* Ensure that any line notes are processed, so that we have the
> >          correct physical line/column for the end-point of the token even
> > diff --git a/libcpp/pch.cc b/libcpp/pch.cc
> > index fa5efc9f102..495fe7f434b 100644
> > --- a/libcpp/pch.cc
> > +++ b/libcpp/pch.cc
> > @@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int 
> > fd)
> >
> >  /* Save all the existing macros.  */
> >
> > +namespace {
> > +
> > +struct saved_macro
> > +{
> > +  uchar *defn;
> > +  location_t loc;
> > +  expanded_location xloc;
> > +};
> > +
> > +} /* unnamed namespace */
> > +
> >  struct save_macro_data
> >  {
> > -  uchar **defns;
> > +  saved_macro *defns;
> >    size_t count;
> >    size_t array_size;
> >    char **saved_pragmas;
> > @@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void 
> > *data_p)
> >        if (data->count == data->array_size)
> >         {
> >           data->array_size *= 2;
> > -         data->defns = XRESIZEVEC (uchar *, data->defns, 
> > (data->array_size));
> > +         data->defns = XRESIZEVEC (saved_macro, data->defns, 
> > data->array_size);
> >         }
> >
> >        const uchar * defn = cpp_macro_definition (r, h);
> >        size_t defnlen = ustrlen (defn);
> >
> > -      data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen 
> > + 2);
> > -      data->defns[data->count][defnlen] = '\n';
> > +      const auto d = data->defns + data->count;
> > +      d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2);
> > +      d->defn[defnlen] = '\n';
> > +      d->loc = h->value.macro->line;
> > +      d->xloc = {};
> > +      if (d->loc == r->line_table->cmdline_location)
> > +       {
> > +         /* The cmdline_location may be different when it comes time to
> > +            restore the macros, so use 0 to indicate this.  */
> > +         d->loc = 0;
> > +       }
> > +      else if (d->loc != r->line_table->builtin_location)
> > +       {
> > +         const auto map = linemap_lookup (r->line_table, d->loc);
> > +         gcc_assert (map);
> > +         d->xloc = linemap_expand_location (r->line_table, map, d->loc);
> > +       }
> >        data->count++;
> >      }
> >
> > @@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct 
> > save_macro_data **data)
> >    struct save_macro_data *d = XNEW (struct save_macro_data);
> >
> >    d->array_size = 512;
> > -  d->defns = XNEWVEC (uchar *, d->array_size);
> > +  d->defns = XNEWVEC (saved_macro, d->array_size);
> >    d->count = 0;
> >    cpp_forall_identifiers (r, save_macros, d);
> > +
> > +  /* Sort the saved macros in order, so the line map gets built up
> > +     in chronological order as expected when we load them back in.  */
> > +  static line_maps *line_table;
> > +  line_table = r->line_table;
> > +  qsort (d->defns, d->count, sizeof (saved_macro),
> > +        [] (const void *x1, const void *x2)
> > +        {
> > +          const auto d1 = static_cast<const saved_macro *> (x1);
> > +          const auto d2 = static_cast<const saved_macro *> (x2);
> > +          return -linemap_compare_locations (line_table, d1->loc, d2->loc);
> > +        });
> > +
> >    d->saved_pragmas = _cpp_save_pragma_names (r);
> >    *data = d;
> >  }
> > @@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE 
> > *f,
> >    r->state.prevent_expansion = 1;
> >    r->state.angled_headers = 0;
> >
> > +  const auto old_directive_line = r->directive_line;
> > +
> >    /* Run through the carefully-saved macros, insert them.  */
> > +  const char *cur_fname = nullptr;
> >    for (i = 0; i < data->count; i++)
> >      {
> > -      cpp_hashnode *h;
> > -      size_t namelen;
> > -      uchar *defn;
> > -
> > -      namelen = ustrcspn (data->defns[i], "( \n");
> > -      h = cpp_lookup (r, data->defns[i], namelen);
> > -      defn = data->defns[i] + namelen;
> > +      const auto d = data->defns + i;
> > +      const size_t namelen = ustrcspn (d->defn, "( \n");
> > +      cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen);
> > +      const auto defn = d->defn + namelen;
> >
> >        /* The PCH file is valid, so we know that if there is a definition
> >          from the PCH file it must be the same as the one we had
> > @@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE 
> > *f,
> >           if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true)
> >               != NULL)
> >             {
> > -             _cpp_clean_line (r);
> > +             if (d->loc == r->line_table->builtin_location)
> > +               r->directive_line = r->line_table->builtin_location;
> > +             else if (d->loc == 0)
> > +               /* This was a command-line-defined macro, preserve that
> > +                  aspect now.  */
> > +               r->directive_line = r->line_table->cmdline_location;
> > +             else
> > +               {
> > +                 /* In practice, almost all of the locations ought to be 
> > in the
> > +                    main file, since a PCH cannot be loaded after including
> > +                    other headers.  The (probably sole) exception is that
> > +                    implicit preincludes can be loaded, so if an implicit
> > +                    preinclude such as /usr/include/stdc-predef.h has 
> > changed
> > +                    since the PCH was created (in a way not invalidating 
> > the
> > +                    PCH), we can end up here with locations that are not 
> > in the
> > +                    main file.  In that case, we may be adding them to the
> > +                    line_map out of order (the implicit preinclude already
> > +                    being represented in the line_map we have loaded from 
> > the
> > +                    PCH), but it is probably not going to cause any 
> > observable
> > +                    issue given the constraints on what can appear in a 
> > header
> > +                    before loading a PCH.  */
> > +                 if (!cur_fname || strcmp (cur_fname, d->xloc.file))
> > +                   {
> > +                     linemap_add (r->line_table, LC_RENAME, d->xloc.sysp,
> > +                                  d->xloc.file, d->xloc.line);
> > +                     cur_fname = d->xloc.file;
> > +                   }
> > +                 r->directive_line = linemap_line_start (r->line_table,
> > +                                                         d->xloc.line, 
> > 127);
> > +               }
> > +
> > +             /* Even when the macro was defined in the main file, we have 
> > to
> > +                use cpp_force_token_locations here, so all tokens in the 
> > macro
> > +                will have location assigned to the directive line with no
> > +                column information.  We no longer know even which line the
> > +                tokens appeared on originally now, much less the column, 
> > since
> > +                we only stored the macro definition string after lexing 
> > it.  */
> > +             cpp_force_token_locations (r, r->directive_line);
> >
> > -             /* ??? Using r->line_table->highest_line is not ideal here, 
> > but we
> > -                do need to use some location that is relative to the new 
> > line
> > -                map just loaded, not the old one that was in effect when 
> > these
> > -                macros were lexed.  The proper fix is to remember the file 
> > name
> > -                and line number where each macro was defined, and then add
> > -                these locations into the new line map.  See PR105608.  */
> > -             if (!_cpp_create_definition (r, h, 
> > r->line_table->highest_line))
> > +             _cpp_clean_line (r);
> > +             if (!_cpp_create_definition (r, h, r->directive_line))
> >                 abort ();
> > +             cpp_stop_forcing_token_locations (r);
> >               _cpp_pop_buffer (r);
> > +
> > +             if (d->loc < RESERVED_LOCATION_COUNT)
> > +               /* Macros defined on the command line or as builtins are 
> > always
> > +                  implicitly used.  It is necessary to note this here or 
> > else
> > +                  -Wunused-macros will complain, because these locations do
> > +                  return TRUE for MAIN_FILE_P().  For a command line
> > +                  definition, d->loc was set to 0 when we saved it, so it 
> > does
> > +                  satisfy the check performed here.  */
> > +               h->value.macro->used = true;
> >             }
> >           else
> >             abort ();
> >         }
> >
> > -      free (data->defns[i]);
> > +      free (data->defns[i].defn);
> >      }
> >    r->state = old_state;
> > +  r->directive_line = old_directive_line;
> >
> >    _cpp_restore_pragma_names (r, data->saved_pragmas);
> >
> > +  XDELETEVEC (data->defns);
> >    free (data);
> >
> >    if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : 
> > NULL)
> > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> > index c652e82a8c7..f8011071103 100644
> > --- a/gcc/c-family/c-opts.cc
> > +++ b/gcc/c-family/c-opts.cc
> > @@ -1681,6 +1681,7 @@ c_finish_options (void)
> >        bool cxx_assert_seen_p = false;
> >
> >        /* All command line defines must have the same location.  */
> > +      line_table->cmdline_location = line_table->highest_line;
> >        cpp_force_token_locations (parse_in, line_table->highest_line);
> >        for (size_t i = 0; i < deferred_count; i++)
> >         {
> > diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
> > index b8f075e29f7..8f0fea82cad 100644
> > --- a/gcc/c-family/c-pch.cc
> > +++ b/gcc/c-family/c-pch.cc
> > @@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char 
> > *name,
> >    rebuild_location_adhoc_htab (line_table);
> >    line_table->trace_includes = saved_trace_includes;
> >
> > -  /* Set the current location to the line containing the #include (or the
> > -     #pragma GCC pch_preprocess) for the purpose of assigning locations to 
> > any
> > -     macros that are about to be restored.  */
> > -  linemap_add (line_table, LC_ENTER, 0, saved_loc.file,
> > -              saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line);
> > +  /* Set the line_map current location to the start of the file, so that 
> > things
> > +     remain in order after cpp_read_state() re-adds any macros that were 
> > defined
> > +     prior to calling gt_pch_restore().  */
> > +  linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0);
> >
> >    timevar_push (TV_PCH_CPP_RESTORE);
> >    cpp_result = cpp_read_state (pfile, name, f, smd);
> >
> >    /* Set the current location to the line following the #include, where we
> >       were prior to processing the PCH.  */
> > -  linemap_line_start (line_table, saved_loc.line, 0);
> > +  linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file,
> > +              saved_loc.line);
> >
> >    timevar_pop (TV_PCH_CPP_RESTORE);
> >
> > diff --git a/gcc/testsuite/g++.dg/pch/line-map-3.C 
> > b/gcc/testsuite/g++.dg/pch/line-map-3.C
> > index 3390d7adba2..9def08de974 100644
> > --- a/gcc/testsuite/g++.dg/pch/line-map-3.C
> > +++ b/gcc/testsuite/g++.dg/pch/line-map-3.C
> > @@ -1,23 +1,5 @@
> > -#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */
> > -#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } 
> > } */
> > -
> > -/* { dg-do compile } */
> > -/* { dg-additional-options "-Werror=unused-macros" } */
> > -
> >  /* PR preprocessor/105608 */
> > -/* This test case is currently xfailed and requires work in libcpp/pch.cc 
> > to
> > -   resolve.  Currently, the macro location is incorrectly assigned to line 
> > 2
> > -   of the header file when read via PCH, because libcpp doesn't try to
> > -   assign locations relative to the newly loaded line map after restoring
> > -   the PCH.  */
> > -
> > -/* In PCH mode we also complain incorrectly about the command line macro 
> > -Dwith_PCH
> > -   added by dejagnu; that warning would get suppressed if the macro 
> > location were
> > -   correctly restored by libcpp to reflect that it was a command line 
> > macro.  */
> > -/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */
> > -
> > -/* The reason we used -Werror was to prevent pch.exp from rerunning 
> > without PCH;
> > -   in that case we would get unnecessary XPASS outputs since the test does 
> > work
> > -   fine without PCH.  Once the bug is fixed, remove the -Werror and switch 
> > to
> > -   dg-warning.  */
> > -/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-Wunused-macros" } */
> > +#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */
> > +#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */
> > diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.C 
> > b/gcc/testsuite/g++.dg/pch/line-map-4.C
> > new file mode 100644
> > index 00000000000..cc13a467ca1
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/line-map-4.C
> > @@ -0,0 +1,5 @@
> > +/* PR preprocessor/105608 */
> > +/* { dg-do compile } */
> > +#define INT int /* { dg-error "-:declaration does not declare anything" } 
> > */
> > +#include "line-map-4.H"
> > +INT; /* { dg-note "in expansion of macro" } */
> > diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.Hs 
> > b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
> > new file mode 100644
> > index 00000000000..3b6178bfae0
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
> > @@ -0,0 +1 @@
> > +/* This space intentionally left blank.  */

Reply via email to