https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691235.html
Ping please? Thanks! -Lewis On Wed, Jul 30, 2025 at 7:21 PM Lewis Hyatt <lhy...@gmail.com> wrote: > > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608 > > The PR was originally about an ICE in the linemap code that was patched up > for GCC 14. I left it open as a reminder to finish improving the locations > assigned to macros that are defined prior to including a PCH. This patch > provides that improvement... does it look OK please? Bootstrap + regtest all > languages on x86-64 and aarch64 (Linux). Thanks! > > -Lewis > > -- >8 -- > > It is permissible to define macros prior to including a PCH, as long as > these definitions are disjoint from or identical to the macros in the > PCH. The PCH loading process replaces all libcpp data structures with those > from the PCH, so it is necessary to remember the extra macros separately and > then restore them after loading the PCH, which all is handled by > cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration > process consists of pushing a buffer containing the macro definition and > then lexing it from there, similar to how a command-line -D option is > processed. The current implementation does not attempt to set up the > line_map for this process, and so the locations assigned to the macros are > often not meaningful. (Similar to what happened in the past with lexing the > tokens out of a _Pragma string, lexing out of a buffer rather than a file > produces "sorta" reasonable locations that are often close enough, but not > reliably correct.) > > Fix that up by remembering enough additional information (more or less, an > expanded_location for each macro definition) to produce a reasonable > location for the newly restored macros. > > One issue that came up is the treatment of command-line-defined macros. From > the perspective of the generic line_map data structures, the command-line > location is not distinguishable from other locations; it's just an ordinary > location created by the front ends with a fake file name by convention. (At > the moment, it is always the string `<command-line>', subject to > translation.) Since libcpp needs to assign macros to that location, it > needs to know what location to use, so I added a new member > line_maps::cmdline_location for the front ends to set, similar to how > line_maps::builtin_location is handled. > > This revealed a small issue, in c-opts.cc we have: > > /* All command line defines must have the same location. */ > cpp_force_token_locations (parse_in, line_table->highest_line); > > But contrary to the comment, all command line defines don't actually end up > with the same location anymore. This is because libcpp/lex.cc has been > expanded (r6-4873) to include range information on the returned > locations. That logic has never been respecting the request of > cpp_force_token_locations. I believe this was not intentional, and so I have > corrected that here. Prior to this patch, the range logic has been leading > to command-line macros all having similar locations in the same line map (or > ad-hoc locations based from there for sufficiently long tokens); with this > change, they all have exactly the same location and that location is > recorded in line_maps::cmdline_location. > > With that change, then it works fine for pch.cc to restore macros whether > they came from the command-line or from the main file. > > gcc/c-family/ChangeLog: > > PR preprocessor/105608 > * c-opts.cc (c_finish_options): Set new member > line_table->cmdline_location. > * c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in > libcpp pch.cc; it is now possible that the linemap is in a different > file after returning from cpp_read_state(). > > libcpp/ChangeLog: > > PR preprocessor/105608 > * include/line-map.h: Add new member CMDLINE_LOCATION. > * lex.cc (get_location_for_byte_range_in_cur_line): Do not expand > the token location to include range information if token location > override was requested. > (warn_about_normalization): Likewise. > (_cpp_lex_direct): Likewise. > * pch.cc (struct saved_macro): New local struct. > (struct save_macro_data): Change DEFNS vector to hold saved_macro > rather than uchar*. > (save_macros): Adapt to remember the location information for each > saved macro in addition to the definition. > (cpp_prepare_state): Likewise. > (cpp_read_state): Use the saved location information to generate > proper locations for the restored macros. > > gcc/testsuite/ChangeLog: > > PR preprocessor/105608 > * g++.dg/pch/line-map-3.C: Remove xfails. > * g++.dg/pch/line-map-4.C: New test. > * g++.dg/pch/line-map-4.Hs: New test. > --- > libcpp/include/line-map.h | 4 + > libcpp/lex.cc | 6 +- > libcpp/pch.cc | 125 ++++++++++++++++++++----- > gcc/c-family/c-opts.cc | 1 + > gcc/c-family/c-pch.cc | 12 +-- > gcc/testsuite/g++.dg/pch/line-map-3.C | 26 +---- > gcc/testsuite/g++.dg/pch/line-map-4.C | 5 + > gcc/testsuite/g++.dg/pch/line-map-4.Hs | 1 + > 8 files changed, 130 insertions(+), 50 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.C > create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.Hs > > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h > index 21a59af2236..b4560cf3737 100644 > --- a/libcpp/include/line-map.h > +++ b/libcpp/include/line-map.h > @@ -878,6 +878,10 @@ public: > built-in tokens. */ > location_t builtin_location; > > + /* The special location value to be used for tokens originating on the > + command line. */ > + location_t cmdline_location; > + > /* The default value of range_bits in ordinary line maps. */ > unsigned int default_range_bits; > > diff --git a/libcpp/lex.cc b/libcpp/lex.cc > index e7705a64f39..2474d028cf8 100644 > --- a/libcpp/lex.cc > +++ b/libcpp/lex.cc > @@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader > *pfile, > const unsigned char *const start, > size_t num_bytes) > { > + if (pfile->forced_token_location) > + return pfile->forced_token_location; > gcc_checking_assert (num_bytes > 0); > > /* CPP_BUF_COLUMN and linemap_position_for_column both refer > @@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile, > /* If possible, create a location range for the token. */ > if (loc >= RESERVED_LOCATION_COUNT > && token->type != CPP_EOF > + && !pfile->forced_token_location > /* There must be no line notes to process. */ > && (!(pfile->buffer->cur > >= pfile->buffer->notes[pfile->buffer->cur_note].pos > @@ -4401,7 +4404,8 @@ _cpp_lex_direct (cpp_reader *pfile) > > /* Potentially convert the location of the token to a range. */ > if (result->src_loc >= RESERVED_LOCATION_COUNT > - && result->type != CPP_EOF) > + && result->type != CPP_EOF > + && !pfile->forced_token_location) > { > /* Ensure that any line notes are processed, so that we have the > correct physical line/column for the end-point of the token even > diff --git a/libcpp/pch.cc b/libcpp/pch.cc > index fa5efc9f102..495fe7f434b 100644 > --- a/libcpp/pch.cc > +++ b/libcpp/pch.cc > @@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd) > > /* Save all the existing macros. */ > > +namespace { > + > +struct saved_macro > +{ > + uchar *defn; > + location_t loc; > + expanded_location xloc; > +}; > + > +} /* unnamed namespace */ > + > struct save_macro_data > { > - uchar **defns; > + saved_macro *defns; > size_t count; > size_t array_size; > char **saved_pragmas; > @@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void > *data_p) > if (data->count == data->array_size) > { > data->array_size *= 2; > - data->defns = XRESIZEVEC (uchar *, data->defns, (data->array_size)); > + data->defns = XRESIZEVEC (saved_macro, data->defns, > data->array_size); > } > > const uchar * defn = cpp_macro_definition (r, h); > size_t defnlen = ustrlen (defn); > > - data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen + > 2); > - data->defns[data->count][defnlen] = '\n'; > + const auto d = data->defns + data->count; > + d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2); > + d->defn[defnlen] = '\n'; > + d->loc = h->value.macro->line; > + d->xloc = {}; > + if (d->loc == r->line_table->cmdline_location) > + { > + /* The cmdline_location may be different when it comes time to > + restore the macros, so use 0 to indicate this. */ > + d->loc = 0; > + } > + else if (d->loc != r->line_table->builtin_location) > + { > + const auto map = linemap_lookup (r->line_table, d->loc); > + gcc_assert (map); > + d->xloc = linemap_expand_location (r->line_table, map, d->loc); > + } > data->count++; > } > > @@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct save_macro_data > **data) > struct save_macro_data *d = XNEW (struct save_macro_data); > > d->array_size = 512; > - d->defns = XNEWVEC (uchar *, d->array_size); > + d->defns = XNEWVEC (saved_macro, d->array_size); > d->count = 0; > cpp_forall_identifiers (r, save_macros, d); > + > + /* Sort the saved macros in order, so the line map gets built up > + in chronological order as expected when we load them back in. */ > + static line_maps *line_table; > + line_table = r->line_table; > + qsort (d->defns, d->count, sizeof (saved_macro), > + [] (const void *x1, const void *x2) > + { > + const auto d1 = static_cast<const saved_macro *> (x1); > + const auto d2 = static_cast<const saved_macro *> (x2); > + return -linemap_compare_locations (line_table, d1->loc, d2->loc); > + }); > + > d->saved_pragmas = _cpp_save_pragma_names (r); > *data = d; > } > @@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE > *f, > r->state.prevent_expansion = 1; > r->state.angled_headers = 0; > > + const auto old_directive_line = r->directive_line; > + > /* Run through the carefully-saved macros, insert them. */ > + const char *cur_fname = nullptr; > for (i = 0; i < data->count; i++) > { > - cpp_hashnode *h; > - size_t namelen; > - uchar *defn; > - > - namelen = ustrcspn (data->defns[i], "( \n"); > - h = cpp_lookup (r, data->defns[i], namelen); > - defn = data->defns[i] + namelen; > + const auto d = data->defns + i; > + const size_t namelen = ustrcspn (d->defn, "( \n"); > + cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen); > + const auto defn = d->defn + namelen; > > /* The PCH file is valid, so we know that if there is a definition > from the PCH file it must be the same as the one we had > @@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE > *f, > if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true) > != NULL) > { > - _cpp_clean_line (r); > + if (d->loc == r->line_table->builtin_location) > + r->directive_line = r->line_table->builtin_location; > + else if (d->loc == 0) > + /* This was a command-line-defined macro, preserve that > + aspect now. */ > + r->directive_line = r->line_table->cmdline_location; > + else > + { > + /* In practice, almost all of the locations ought to be in > the > + main file, since a PCH cannot be loaded after including > + other headers. The (probably sole) exception is that > + implicit preincludes can be loaded, so if an implicit > + preinclude such as /usr/include/stdc-predef.h has changed > + since the PCH was created (in a way not invalidating the > + PCH), we can end up here with locations that are not in > the > + main file. In that case, we may be adding them to the > + line_map out of order (the implicit preinclude already > + being represented in the line_map we have loaded from the > + PCH), but it is probably not going to cause any > observable > + issue given the constraints on what can appear in a > header > + before loading a PCH. */ > + if (!cur_fname || strcmp (cur_fname, d->xloc.file)) > + { > + linemap_add (r->line_table, LC_RENAME, d->xloc.sysp, > + d->xloc.file, d->xloc.line); > + cur_fname = d->xloc.file; > + } > + r->directive_line = linemap_line_start (r->line_table, > + d->xloc.line, 127); > + } > + > + /* Even when the macro was defined in the main file, we have to > + use cpp_force_token_locations here, so all tokens in the > macro > + will have location assigned to the directive line with no > + column information. We no longer know even which line the > + tokens appeared on originally now, much less the column, > since > + we only stored the macro definition string after lexing it. > */ > + cpp_force_token_locations (r, r->directive_line); > > - /* ??? Using r->line_table->highest_line is not ideal here, but > we > - do need to use some location that is relative to the new line > - map just loaded, not the old one that was in effect when > these > - macros were lexed. The proper fix is to remember the file > name > - and line number where each macro was defined, and then add > - these locations into the new line map. See PR105608. */ > - if (!_cpp_create_definition (r, h, r->line_table->highest_line)) > + _cpp_clean_line (r); > + if (!_cpp_create_definition (r, h, r->directive_line)) > abort (); > + cpp_stop_forcing_token_locations (r); > _cpp_pop_buffer (r); > + > + if (d->loc < RESERVED_LOCATION_COUNT) > + /* Macros defined on the command line or as builtins are > always > + implicitly used. It is necessary to note this here or else > + -Wunused-macros will complain, because these locations do > + return TRUE for MAIN_FILE_P(). For a command line > + definition, d->loc was set to 0 when we saved it, so it > does > + satisfy the check performed here. */ > + h->value.macro->used = true; > } > else > abort (); > } > > - free (data->defns[i]); > + free (data->defns[i].defn); > } > r->state = old_state; > + r->directive_line = old_directive_line; > > _cpp_restore_pragma_names (r, data->saved_pragmas); > > + XDELETEVEC (data->defns); > free (data); > > if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : > NULL) > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc > index c652e82a8c7..f8011071103 100644 > --- a/gcc/c-family/c-opts.cc > +++ b/gcc/c-family/c-opts.cc > @@ -1681,6 +1681,7 @@ c_finish_options (void) > bool cxx_assert_seen_p = false; > > /* All command line defines must have the same location. */ > + line_table->cmdline_location = line_table->highest_line; > cpp_force_token_locations (parse_in, line_table->highest_line); > for (size_t i = 0; i < deferred_count; i++) > { > diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc > index b8f075e29f7..8f0fea82cad 100644 > --- a/gcc/c-family/c-pch.cc > +++ b/gcc/c-family/c-pch.cc > @@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char *name, > rebuild_location_adhoc_htab (line_table); > line_table->trace_includes = saved_trace_includes; > > - /* Set the current location to the line containing the #include (or the > - #pragma GCC pch_preprocess) for the purpose of assigning locations to > any > - macros that are about to be restored. */ > - linemap_add (line_table, LC_ENTER, 0, saved_loc.file, > - saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line); > + /* Set the line_map current location to the start of the file, so that > things > + remain in order after cpp_read_state() re-adds any macros that were > defined > + prior to calling gt_pch_restore(). */ > + linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0); > > timevar_push (TV_PCH_CPP_RESTORE); > cpp_result = cpp_read_state (pfile, name, f, smd); > > /* Set the current location to the line following the #include, where we > were prior to processing the PCH. */ > - linemap_line_start (line_table, saved_loc.line, 0); > + linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file, > + saved_loc.line); > > timevar_pop (TV_PCH_CPP_RESTORE); > > diff --git a/gcc/testsuite/g++.dg/pch/line-map-3.C > b/gcc/testsuite/g++.dg/pch/line-map-3.C > index 3390d7adba2..9def08de974 100644 > --- a/gcc/testsuite/g++.dg/pch/line-map-3.C > +++ b/gcc/testsuite/g++.dg/pch/line-map-3.C > @@ -1,23 +1,5 @@ > -#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */ > -#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } } > */ > - > -/* { dg-do compile } */ > -/* { dg-additional-options "-Werror=unused-macros" } */ > - > /* PR preprocessor/105608 */ > -/* This test case is currently xfailed and requires work in libcpp/pch.cc to > - resolve. Currently, the macro location is incorrectly assigned to line 2 > - of the header file when read via PCH, because libcpp doesn't try to > - assign locations relative to the newly loaded line map after restoring > - the PCH. */ > - > -/* In PCH mode we also complain incorrectly about the command line macro > -Dwith_PCH > - added by dejagnu; that warning would get suppressed if the macro location > were > - correctly restored by libcpp to reflect that it was a command line macro. > */ > -/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */ > - > -/* The reason we used -Werror was to prevent pch.exp from rerunning without > PCH; > - in that case we would get unnecessary XPASS outputs since the test does > work > - fine without PCH. Once the bug is fixed, remove the -Werror and switch to > - dg-warning. */ > -/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */ > +/* { dg-do compile } */ > +/* { dg-additional-options "-Wunused-macros" } */ > +#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */ > +#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */ > diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.C > b/gcc/testsuite/g++.dg/pch/line-map-4.C > new file mode 100644 > index 00000000000..cc13a467ca1 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/pch/line-map-4.C > @@ -0,0 +1,5 @@ > +/* PR preprocessor/105608 */ > +/* { dg-do compile } */ > +#define INT int /* { dg-error "-:declaration does not declare anything" } */ > +#include "line-map-4.H" > +INT; /* { dg-note "in expansion of macro" } */ > diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.Hs > b/gcc/testsuite/g++.dg/pch/line-map-4.Hs > new file mode 100644 > index 00000000000..3b6178bfae0 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/pch/line-map-4.Hs > @@ -0,0 +1 @@ > +/* This space intentionally left blank. */