https://gcc.gnu.org/g:d538348556bd05fba13e41cb8497bda2eb086e63

commit r16-5110-gd538348556bd05fba13e41cb8497bda2eb086e63
Author: Lewis Hyatt <[email protected]>
Date:   Wed Jul 30 19:20:55 2025 -0400

    libcpp: Improve locations for macros defined prior to PCH include [PR105608]
    
    It is permissible to define macros prior to including a PCH, as long as
    these definitions are disjoint from or identical to the macros in the
    PCH. The PCH loading process replaces all libcpp data structures with those
    from the PCH, so it is necessary to remember the extra macros separately and
    then restore them after loading the PCH, which all is handled by
    cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
    process consists of pushing a buffer containing the macro definition and
    then lexing it from there, similar to how a command-line -D option is
    processed. The current implementation does not attempt to set up the
    line_map for this process, and so the locations assigned to the macros are
    often not meaningful. (Similar to what happened in the past with lexing the
    tokens out of a _Pragma string, lexing out of a buffer rather than a file
    produces "sorta" reasonable locations that are often close enough, but not
    reliably correct.)
    
    Fix that up by remembering enough additional information (more or less, an
    expanded_location for each macro definition) to produce a reasonable
    location for the newly restored macros.
    
    One issue that came up is the treatment of command-line-defined macros. From
    the perspective of the generic line_map data structures, the command-line
    location is not distinguishable from other locations; it's just an ordinary
    location created by the front ends with a fake file name by convention. (At
    the moment, it is always the string `<command-line>', subject to
    translation.)  Since libcpp needs to assign macros to that location, it
    needs to know what location to use, so I added a new member
    line_maps::cmdline_location for the front ends to set, similar to how
    line_maps::builtin_location is handled.
    
    This revealed a small issue, in c-opts.cc we have:
    
        /* All command line defines must have the same location.  */
          cpp_force_token_locations (parse_in, line_table->highest_line);
    
    But contrary to the comment, all command line defines don't actually end up
    with the same location anymore. This is because libcpp/lex.cc has been
    expanded (r6-4873) to include range information on the returned
    locations. That logic has never been respecting the request of
    cpp_force_token_locations. I believe this was not intentional, and so I have
    corrected that here. Prior to this patch, the range logic has been leading
    to command-line macros all having similar locations in the same line map (or
    ad-hoc locations based from there for sufficiently long tokens); with this
    change, they all have exactly the same location and that location is
    recorded in line_maps::cmdline_location.
    
    With that change, then it works fine for pch.cc to restore macros whether
    they came from the command-line or from the main file.
    
    gcc/c-family/ChangeLog:
    
            PR preprocessor/105608
            * c-opts.cc (c_finish_options): Set new member
            line_table->cmdline_location.
            * c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
            libcpp pch.cc; it is now possible that the linemap is in a different
            file after returning from cpp_read_state().
    
    libcpp/ChangeLog:
    
            PR preprocessor/105608
            * include/line-map.h: Add new member CMDLINE_LOCATION.
            * lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
            the token location to include range information if token location
            override was requested.
            (warn_about_normalization): Likewise.
            (_cpp_lex_direct): Likewise.
            * pch.cc (struct saved_macro): New local struct.
            (struct save_macro_data): Change DEFNS vector to hold saved_macro
            rather than uchar*.
            (save_macros): Adapt to remember the location information for each
            saved macro in addition to the definition.
            (cpp_prepare_state): Likewise.
            (cpp_read_state): Use the saved location information to generate
            proper locations for the restored macros.
    
    gcc/testsuite/ChangeLog:
    
            PR preprocessor/105608
            * g++.dg/pch/line-map-3.C: Remove xfails.
            * g++.dg/pch/line-map-4.C: New test.
            * g++.dg/pch/line-map-4.Hs: New test.

Diff:
---
 gcc/c-family/c-opts.cc                 |   1 +
 gcc/c-family/c-pch.cc                  |  12 ++--
 gcc/testsuite/g++.dg/pch/line-map-3.C  |  26 ++-----
 gcc/testsuite/g++.dg/pch/line-map-4.C  |   5 ++
 gcc/testsuite/g++.dg/pch/line-map-4.Hs |   1 +
 libcpp/include/line-map.h              |   6 ++
 libcpp/lex.cc                          |   6 +-
 libcpp/pch.cc                          | 125 +++++++++++++++++++++++++++------
 8 files changed, 132 insertions(+), 50 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b7dc8ceae820..cb836b26a5ce 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1700,6 +1700,7 @@ c_finish_options (void)
       bool cxx_assert_seen_p = false;
 
       /* All command line defines must have the same location.  */
+      line_table->cmdline_location = line_table->highest_line;
       cpp_force_token_locations (parse_in, line_table->highest_line);
       for (size_t i = 0; i < deferred_count; i++)
        {
diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
index b8f075e29f70..8f0fea82cad6 100644
--- a/gcc/c-family/c-pch.cc
+++ b/gcc/c-family/c-pch.cc
@@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   rebuild_location_adhoc_htab (line_table);
   line_table->trace_includes = saved_trace_includes;
 
-  /* Set the current location to the line containing the #include (or the
-     #pragma GCC pch_preprocess) for the purpose of assigning locations to any
-     macros that are about to be restored.  */
-  linemap_add (line_table, LC_ENTER, 0, saved_loc.file,
-              saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line);
+  /* Set the line_map current location to the start of the file, so that things
+     remain in order after cpp_read_state() re-adds any macros that were 
defined
+     prior to calling gt_pch_restore().  */
+  linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0);
 
   timevar_push (TV_PCH_CPP_RESTORE);
   cpp_result = cpp_read_state (pfile, name, f, smd);
 
   /* Set the current location to the line following the #include, where we
      were prior to processing the PCH.  */
-  linemap_line_start (line_table, saved_loc.line, 0);
+  linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file,
+              saved_loc.line);
 
   timevar_pop (TV_PCH_CPP_RESTORE);
 
diff --git a/gcc/testsuite/g++.dg/pch/line-map-3.C 
b/gcc/testsuite/g++.dg/pch/line-map-3.C
index 3390d7adba21..9def08de9744 100644
--- a/gcc/testsuite/g++.dg/pch/line-map-3.C
+++ b/gcc/testsuite/g++.dg/pch/line-map-3.C
@@ -1,23 +1,5 @@
-#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */
-#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } } */
-
-/* { dg-do compile } */
-/* { dg-additional-options "-Werror=unused-macros" } */
-
 /* PR preprocessor/105608 */
-/* This test case is currently xfailed and requires work in libcpp/pch.cc to
-   resolve.  Currently, the macro location is incorrectly assigned to line 2
-   of the header file when read via PCH, because libcpp doesn't try to
-   assign locations relative to the newly loaded line map after restoring
-   the PCH.  */
-
-/* In PCH mode we also complain incorrectly about the command line macro 
-Dwith_PCH
-   added by dejagnu; that warning would get suppressed if the macro location 
were
-   correctly restored by libcpp to reflect that it was a command line macro.  
*/
-/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */
-
-/* The reason we used -Werror was to prevent pch.exp from rerunning without 
PCH;
-   in that case we would get unnecessary XPASS outputs since the test does work
-   fine without PCH.  Once the bug is fixed, remove the -Werror and switch to
-   dg-warning.  */
-/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
+/* { dg-do compile } */
+/* { dg-additional-options "-Wunused-macros" } */
+#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */
+#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */
diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.C 
b/gcc/testsuite/g++.dg/pch/line-map-4.C
new file mode 100644
index 000000000000..cc13a467ca14
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/line-map-4.C
@@ -0,0 +1,5 @@
+/* PR preprocessor/105608 */
+/* { dg-do compile } */
+#define INT int /* { dg-error "-:declaration does not declare anything" } */
+#include "line-map-4.H"
+INT; /* { dg-note "in expansion of macro" } */
diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.Hs 
b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
new file mode 100644
index 000000000000..3b6178bfae0f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/line-map-4.Hs
@@ -0,0 +1 @@
+/* This space intentionally left blank.  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 9bcf4d0c8078..2cd54d0770e4 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -878,6 +878,12 @@ public:
      built-in tokens.  */
   location_t builtin_location;
 
+  /* The special location value to be used for tokens originating on the
+     command line.  This is currently only needed by the C-family front ends
+     for PCH support; if it would be used for another purpose in the future,
+     then other libcpp-using front ends may need to set it as well.  */
+  location_t cmdline_location;
+
   /* The default value of range_bits in ordinary line maps.  */
   unsigned int default_range_bits;
 
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 2ba9d588708f..43d369108ab3 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader 
*pfile,
                                         const unsigned char *const start,
                                         size_t num_bytes)
 {
+  if (pfile->forced_token_location)
+    return pfile->forced_token_location;
   gcc_checking_assert (num_bytes > 0);
 
   /* CPP_BUF_COLUMN and linemap_position_for_column both refer
@@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile,
       /* If possible, create a location range for the token.  */
       if (loc >= RESERVED_LOCATION_COUNT
          && token->type != CPP_EOF
+         && !pfile->forced_token_location
          /* There must be no line notes to process.  */
          && (!(pfile->buffer->cur
                >= pfile->buffer->notes[pfile->buffer->cur_note].pos
@@ -4399,7 +4402,8 @@ _cpp_lex_direct (cpp_reader *pfile)
 
   /* Potentially convert the location of the token to a range.  */
   if (result->src_loc >= RESERVED_LOCATION_COUNT
-      && result->type != CPP_EOF)
+      && result->type != CPP_EOF
+      && !pfile->forced_token_location)
     {
       /* Ensure that any line notes are processed, so that we have the
         correct physical line/column for the end-point of the token even
diff --git a/libcpp/pch.cc b/libcpp/pch.cc
index fa5efc9f1025..3614ed22a289 100644
--- a/libcpp/pch.cc
+++ b/libcpp/pch.cc
@@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd)
 
 /* Save all the existing macros.  */
 
+namespace {
+
+struct saved_macro
+{
+  uchar *defn;
+  location_t loc;
+  expanded_location xloc;
+};
+
+} /* unnamed namespace */
+
 struct save_macro_data
 {
-  uchar **defns;
+  saved_macro *defns;
   size_t count;
   size_t array_size;
   char **saved_pragmas;
@@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void *data_p)
       if (data->count == data->array_size)
        {
          data->array_size *= 2;
-         data->defns = XRESIZEVEC (uchar *, data->defns, (data->array_size));
+         data->defns = XRESIZEVEC (saved_macro, data->defns, data->array_size);
        }
 
       const uchar * defn = cpp_macro_definition (r, h);
       size_t defnlen = ustrlen (defn);
 
-      data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen + 
2);
-      data->defns[data->count][defnlen] = '\n';
+      const auto d = data->defns + data->count;
+      d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2);
+      d->defn[defnlen] = '\n';
+      d->loc = h->value.macro->line;
+      d->xloc = {};
+      if (d->loc == r->line_table->cmdline_location)
+       {
+         /* The cmdline_location may be different when it comes time to
+            restore the macros, so use 0 to indicate this.  */
+         d->loc = 0;
+       }
+      else if (d->loc != r->line_table->builtin_location)
+       {
+         const auto map = linemap_lookup (r->line_table, d->loc);
+         gcc_assert (map);
+         d->xloc = linemap_expand_location (r->line_table, map, d->loc);
+       }
       data->count++;
     }
 
@@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct save_macro_data 
**data)
   struct save_macro_data *d = XNEW (struct save_macro_data);
 
   d->array_size = 512;
-  d->defns = XNEWVEC (uchar *, d->array_size);
+  d->defns = XNEWVEC (saved_macro, d->array_size);
   d->count = 0;
   cpp_forall_identifiers (r, save_macros, d);
+
+  /* Sort the saved macros in order, so the line map gets built up
+     in chronological order as expected when we load them back in.  */
+  static line_maps *line_table;
+  line_table = r->line_table;
+  qsort (d->defns, d->count, sizeof (saved_macro),
+        [] (const void *x1, const void *x2)
+        {
+          const auto d1 = static_cast<const saved_macro *> (x1);
+          const auto d2 = static_cast<const saved_macro *> (x2);
+          return linemap_compare_locations (line_table, d2->loc, d1->loc);
+        });
+
   d->saved_pragmas = _cpp_save_pragma_names (r);
   *data = d;
 }
@@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f,
   r->state.prevent_expansion = 1;
   r->state.angled_headers = 0;
 
+  const auto old_directive_line = r->directive_line;
+
   /* Run through the carefully-saved macros, insert them.  */
+  const char *cur_fname = nullptr;
   for (i = 0; i < data->count; i++)
     {
-      cpp_hashnode *h;
-      size_t namelen;
-      uchar *defn;
-
-      namelen = ustrcspn (data->defns[i], "( \n");
-      h = cpp_lookup (r, data->defns[i], namelen);
-      defn = data->defns[i] + namelen;
+      const auto d = data->defns + i;
+      const size_t namelen = ustrcspn (d->defn, "( \n");
+      cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen);
+      const auto defn = d->defn + namelen;
 
       /* The PCH file is valid, so we know that if there is a definition
         from the PCH file it must be the same as the one we had
@@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f,
          if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true)
              != NULL)
            {
-             _cpp_clean_line (r);
+             if (d->loc == r->line_table->builtin_location)
+               r->directive_line = r->line_table->builtin_location;
+             else if (d->loc == 0)
+               /* This was a command-line-defined macro, preserve that
+                  aspect now.  */
+               r->directive_line = r->line_table->cmdline_location;
+             else
+               {
+                 /* In practice, almost all of the locations ought to be in the
+                    main file, since a PCH cannot be loaded after including
+                    other headers.  The (probably sole) exception is that
+                    implicit preincludes can be loaded, so if an implicit
+                    preinclude such as /usr/include/stdc-predef.h has changed
+                    since the PCH was created (in a way not invalidating the
+                    PCH), we can end up here with locations that are not in the
+                    main file.  In that case, we may be adding them to the
+                    line_map out of order (the implicit preinclude already
+                    being represented in the line_map we have loaded from the
+                    PCH), but it is probably not going to cause any observable
+                    issue given the constraints on what can appear in a header
+                    before loading a PCH.  */
+                 if (!cur_fname || strcmp (cur_fname, d->xloc.file))
+                   {
+                     linemap_add (r->line_table, LC_RENAME, d->xloc.sysp,
+                                  d->xloc.file, d->xloc.line);
+                     cur_fname = d->xloc.file;
+                   }
+                 r->directive_line = linemap_line_start (r->line_table,
+                                                         d->xloc.line, 127);
+               }
+
+             /* Even when the macro was defined in the main file, we have to
+                use cpp_force_token_locations here, so all tokens in the macro
+                will have location assigned to the directive line with no
+                column information.  We no longer know even which line the
+                tokens appeared on originally now, much less the column, since
+                we only stored the macro definition string after lexing it.  */
+             cpp_force_token_locations (r, r->directive_line);
 
-             /* ??? Using r->line_table->highest_line is not ideal here, but we
-                do need to use some location that is relative to the new line
-                map just loaded, not the old one that was in effect when these
-                macros were lexed.  The proper fix is to remember the file name
-                and line number where each macro was defined, and then add
-                these locations into the new line map.  See PR105608.  */
-             if (!_cpp_create_definition (r, h, r->line_table->highest_line))
+             _cpp_clean_line (r);
+             if (!_cpp_create_definition (r, h, r->directive_line))
                abort ();
+             cpp_stop_forcing_token_locations (r);
              _cpp_pop_buffer (r);
+
+             if (d->loc < RESERVED_LOCATION_COUNT)
+               /* Macros defined on the command line or as builtins are always
+                  implicitly used.  It is necessary to note this here or else
+                  -Wunused-macros will complain, because these locations do
+                  return TRUE for MAIN_FILE_P().  For a command line
+                  definition, d->loc was set to 0 when we saved it, so it does
+                  satisfy the check performed here.  */
+               h->value.macro->used = true;
            }
          else
            abort ();
        }
 
-      free (data->defns[i]);
+      free (data->defns[i].defn);
     }
   r->state = old_state;
+  r->directive_line = old_directive_line;
 
   _cpp_restore_pragma_names (r, data->saved_pragmas);
 
+  XDELETEVEC (data->defns);
   free (data);
 
   if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : NULL)

Reply via email to