[PATCH] libcpp: Fix _Pragma("GCC system_header") [PR114436]

2024-03-23 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114436

This is a small fix for the issue mentioned in the PR that _Pragma("GCC
system_header") does not work completely. I believe this has been the case
ever since _Pragma() support was first added. Bootstrapped and regtested all
languages on x86-64 Linux. Is it OK now or in stage 1 please? Thanks!

-Lewis

-- >8 --

_Pragma("GCC system_header") currently takes effect only partially. It does
succeed in updating the line_map, so that checks like in_system_header_at()
return correctly, but it does not update pfile->buffer->sysp.  One result is
that a subsequent #include does not set up the system header state properly
for the newly included file, as pointed out in the PR. Fix by propagating
the new system header state back to the buffer after processing the pragma.

libcpp/ChangeLog:

PR preprocessor/114436
* directives.cc (destringize_and_run): If the _Pragma changed the
buffer system header state (e.g. because it was _Pragma("GCC
system_header")), propagate that change back to the actual buffer too.

gcc/testsuite/ChangeLog:

PR preprocessor/114436
* c-c++-common/cpp/pragma-system-header-1.h: New test.
* c-c++-common/cpp/pragma-system-header-2.h: New test.
* c-c++-common/cpp/pragma-system-header.c: New test.
---
 libcpp/directives.cc                                  | 11 ++++++++---
 .../c-c++-common/cpp/pragma-system-header-1.h         |  1 +
 .../c-c++-common/cpp/pragma-system-header-2.h         |  5 +++++
 gcc/testsuite/c-c++-common/cpp/pragma-system-header.c |  3 +++
 4 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-system-header-1.h
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-system-header-2.h
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-system-header.c

diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 479f8c716e8..bbaf9db6af0 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -1919,9 +1919,11 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in,
      until we've read all of the tokens that we want.  */
   cpp_push_buffer (pfile, (const uchar *) result, dest - result,
                    /* from_stage3 */ true);
-  /* ??? Antique Disgusting Hack.  What does this do?  */
-  if (pfile->buffer->prev)
-    pfile->buffer->file = pfile->buffer->prev->file;
+
+  /* This is needed for _Pragma("once") and _Pragma("GCC system_header") to work
+     properly.  */
+  pfile->buffer->file = pfile->buffer->prev->file;
+  pfile->buffer->sysp = pfile->buffer->prev->sysp;
 
   start_directive (pfile);
   _cpp_clean_line (pfile);
@@ -1986,6 +1988,9 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in,
 
   /* Finish inlining run_directive.  */
   pfile->buffer->file = NULL;
+  /* If the system header state changed due to #pragma GCC system_header, then
+     make that applicable to the real buffer too.  */
+  pfile->buffer->prev->sysp = pfile->buffer->sysp;
   _cpp_pop_buffer (pfile);
 
   /* Reset the old macro state before ...  */
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-system-header-1.h b/gcc/testsuite/c-c++-common/cpp/pragma-system-header-1.h
new file mode 100644
index 000..bd9ff0cb138
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-system-header-1.h
@@ -0,0 +1 @@
+#pragma GCC warning "this warning should not be output (1)"
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-system-header-2.h b/gcc/testsuite/c-c++-common/cpp/pragma-system-header-2.h
new file mode 100644
index 000..a62d9e2685a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-system-header-2.h
@@ -0,0 +1,5 @@
+_Pragma("GCC system_header")
+#include "pragma-system-header-1.h"
+#pragma GCC warning "this warning should not be output (2)"
+_Pragma("unknown")
+#include "pragma-system-header-1.h"
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-system-header.c b/gcc/testsuite/c-c++-common/cpp/pragma-system-header.c
new file mode 100644
index 000..fdea12009e1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-system-header.c
@@ -0,0 +1,3 @@
+#include "pragma-system-header-2.h" /* { dg-bogus "this warning should not be output" } */
+/* { dg-do preprocess } */
+/* PR preprocessor/114436 */
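
For readers who do not have directives.cc open, the state handling that the
patch description refers to can be sketched with a toy model. This is only an
illustration under stated assumptions: toy_buffer, toy_reader, toy_run_pragma
and toy_sysp_after_pragma are hypothetical names, not libcpp types or
functions, and only the prev/sysp relationship between the temporary _Pragma
buffer and the real file buffer is modelled.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for libcpp's cpp_buffer and cpp_reader.
// Only the fields relevant to the system header state are modelled.
struct toy_buffer {
  toy_buffer *prev;  // enclosing buffer, or nullptr for the outermost one
  int sysp;          // nonzero while lexing a system header
};

struct toy_reader {
  toy_buffer *buffer;  // innermost buffer currently being lexed
};

// Models running the destringized _Pragma text in a temporary buffer.
// SETS_SYSTEM_HEADER says whether the pragma is "GCC system_header";
// PROPAGATE_BACK selects whether the state is copied back before the pop.
inline void toy_run_pragma (toy_reader &r, bool sets_system_header,
                            bool propagate_back)
{
  toy_buffer tmp = { r.buffer, r.buffer->sysp };  // inherit state on push
  r.buffer = &tmp;
  if (sets_system_header)
    r.buffer->sysp = 1;  // the pragma only sees the temporary buffer
  if (propagate_back)
    r.buffer->prev->sysp = r.buffer->sysp;  // the step described in the patch
  r.buffer = tmp.prev;  // pop the temporary buffer
}

// Returns the state that a subsequent #include would inherit from the
// real file buffer after _Pragma("GCC system_header") has been processed.
inline int toy_sysp_after_pragma (bool propagate_back)
{
  toy_buffer file_buf = { nullptr, 0 };
  toy_reader r = { &file_buf };
  toy_run_pragma (r, true, propagate_back);
  return r.buffer->sysp;
}
```

In this model, toy_sysp_after_pragma (false) leaves the file buffer's state
untouched, which corresponds to the behavior reported in the PR, while
toy_sysp_after_pragma (true) carries the new state back to the file buffer.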


ping: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2024-03-19 Thread Lewis Hyatt
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

Thanks!

On Fri, Feb 16, 2024 at 7:02 PM Lewis Hyatt  wrote:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
>
> On Thu, Jan 25, 2024 at 4:57 PM Lewis Hyatt  wrote:
> >
> > May I please ask again about this one? It's just a couple lines, and I
> > think it fixes an important gap in the logic for #pragma GCC
> > diagnostic. The PR was not reported by me so I think at least one
> > other person does care about it :). Thanks!
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> >
> > -Lewis
> >
> > On Mon, Jan 8, 2024 at 6:53 PM Lewis Hyatt  wrote:
> > >
> > > > Can I please ping this one again? It's 3 lines or so to fix the PR. Thanks!
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> > >
> > > On Tue, Dec 19, 2023 at 6:20 PM Lewis Hyatt  wrote:
> > > >
> > > > Hello-
> > > >
> > > > May I please ping this one? Thanks...
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> > > >
> > > > -Lewis
> > > >
> > > > On Wed, Nov 29, 2023 at 7:05 PM Lewis Hyatt  wrote:
> > > > >
> > > > > On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> > > > > >
> > > > > > This patch fixes the behavior of `#pragma GCC diagnostic pop' for
> > > > > > permissive error diagnostics such as -Wnarrowing (in C++11). Those
> > > > > > currently do not return to the correct state after the last pop; they
> > > > > > become effectively simple warnings instead. Bootstrap + regtest all
> > > > > > languages on x86-64, does it look OK please? Thanks!
> > > > >
> > > > > Hello-
> > > > >
> > > > > May I please ping this bug fix?
> > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html
> > > > >
> > > > > > Please note, it requires a trivial rebase on top of recent changes to
> > > > > > the class diagnostic_context public interface. I attached the rebased
> > > > > > patch here as well. Thanks!
> > > > >
> > > > > -Lewis


Re: ping: [PATCH] libcpp: Fix __has_include_next ICE in the last directory of the path [PR80755]

2024-02-27 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html

There was a request on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755#c5) for me to ping
this again, so I am complying :). Might anyone have a minute to take a
look please? Thanks...


-Lewis


On Thu, Jan 11, 2024 at 7:34 AM Lewis Hyatt  wrote:
>
> Can I please ping this one? Thanks...
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
>
> -Lewis
>
> On Thu, Dec 21, 2023 at 7:37 AM Lewis Hyatt  wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755
> >
> > Here is a short fix for the ICE in libcpp noted in the PR. Bootstrap +
> > regtest all languages on x86-64 Linux. Is it OK please? Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > In libcpp/files.cc, the function _cpp_has_header(), which implements
> > __has_include and __has_include_next, does not check for a NULL return value
> > from search_path_head(), leading to an ICE tripping an assert when
> > _cpp_find_file() tries to use it. Fix it by checking for that case and
> > silently returning false instead.
> >
> > As suggested by the PR author, it is easiest to make a testcase by using
> > the -idirafter option. To enable that, also modify the dg-additional-options
> > testsuite procedure to make the global $srcdir available, since -idirafter
> > requires the full path.
> >
> > libcpp/ChangeLog:
> >
> > PR preprocessor/80755
> > * files.cc (search_path_head): Add SUPPRESS_DIAGNOSTIC argument
> > defaulting to false.
> > (_cpp_has_header): Silently return false if the search path has been
> > exhausted, rather than issuing a diagnostic and then hitting an
> > assert.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * lib/gcc-defs.exp (dg-additional-options): Make $srcdir usable in a
> > dg-additional-options directive.
> > * c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h: New test.
> > * c-c++-common/cpp/has-include-next-2.c: New test.
> > ---
> >  libcpp/files.cc                                      | 12 ++++++++----
> >  .../cpp/has-include-next-2-dir/has-include-next-2.h  |  3 +++
> >  gcc/testsuite/c-c++-common/cpp/has-include-next-2.c  |  4 ++++
> >  gcc/testsuite/lib/gcc-defs.exp                       |  1 +
> >  4 files changed, 16 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
> >
> > diff --git a/libcpp/files.cc b/libcpp/files.cc
> > index 27301d79fa4..aaab4b13a6a 100644
> > --- a/libcpp/files.cc
> > +++ b/libcpp/files.cc
> > @@ -181,7 +181,8 @@ static bool read_file_guts (cpp_reader *pfile, _cpp_file *file,
> >  static bool read_file (cpp_reader *pfile, _cpp_file *file,
> >                         location_t loc);
> >  static struct cpp_dir *search_path_head (cpp_reader *, const char *fname,
> > -                                         int angle_brackets, enum include_type);
> > +                                         int angle_brackets, enum include_type,
> > +                                         bool suppress_diagnostic = false);
> >  static const char *dir_name_of_file (_cpp_file *file);
> >  static void open_file_failed (cpp_reader *pfile, _cpp_file *file, int,
> >                                location_t);
> > @@ -1041,7 +1042,7 @@ _cpp_mark_file_once_only (cpp_reader *pfile, _cpp_file *file)
> >     nothing left in the path, returns NULL.  */
> >  static struct cpp_dir *
> >  search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
> > -                  enum include_type type)
> > +                  enum include_type type, bool suppress_diagnostic)
> >  {
> >    cpp_dir *dir;
> >    _cpp_file *file;
> > @@ -1070,7 +1071,7 @@ search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
> >      return make_cpp_dir (pfile, dir_name_of_file (file),
> >                           pfile->buffer ? pfile->buffer->sysp : 0);
> >
> > -  if (dir == NULL)
> > +  if (dir == NULL && !suppress_diagnostic)
> >      cpp_error (pfile, CPP_DL_ERROR,
> >                 "no include path in which to search for %s", fname);
> >
> > @@ -2164,7 +2165,10 @@ bool
> >  _cpp_has_header (cpp_reader *pfile, const char *fname, int angle_brackets,
> >  
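
The quoted diff is cut off above, so as a rough illustration of the check
described in the commit message (treat an exhausted search path as "header not
found" rather than asserting), here is a minimal sketch. It assumes a toy
include chain; toy_dir, toy_search_path_head and toy_has_include_next are
hypothetical names and do not reproduce the real libcpp signatures.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for libcpp's cpp_dir: one entry in the include chain.
struct toy_dir {
  const char *header;  // the single header this toy directory provides
  toy_dir *next;       // next directory in the search path, or nullptr
};

// Toy analogue of search_path_head() for an include_next-style lookup:
// the search starts after CURRENT, which yields nullptr when CURRENT is
// the last directory of the path.
inline toy_dir *toy_search_path_head (toy_dir *current)
{
  return current ? current->next : nullptr;
}

// Toy analogue of _cpp_has_header() with the check described above: an
// exhausted path means "not found", with no diagnostic and no assert.
inline bool toy_has_include_next (toy_dir *current, const char *fname)
{
  toy_dir *start = toy_search_path_head (current);
  if (!start)
    return false;
  for (toy_dir *d = start; d; d = d->next)
    if (std::strcmp (d->header, fname) == 0)
      return true;
  return false;
}
```

Asking from the last directory of the chain simply yields false here, which is
the situation the -idirafter testcase is meant to exercise.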

Re: [PATCH] libcpp: Improve location for macro names [PR66290]

2024-02-22 Thread Lewis Hyatt
On Thu, Feb 22, 2024 at 3:56 AM Richard Biener wrote:
>
> On Tue, Feb 20, 2024 at 3:33 PM Lewis Hyatt  wrote:
> >
> > On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva  wrote:
> > >
> > > This backport for gcc-13 is the first of two required for the
> > > g++.dg/pch/line-map-3.C test to stop hitting a variant of the known
> > > problem mentioned in that testcase: on riscv64-elf and riscv32-elf,
> > > after restoring the PCH, the location of the macros is mentioned as if
> > > they were on line 3 rather than 2, so even the existing xfails fail.  I
> > > think this might be too much to backport, and I'm ready to use an xfail
> > > instead, but since this would bring more predictability, I thought I'd
> > > ask whether you'd find this backport acceptable.
> > >
> > > Regstrapped on x86_64-linux-gnu, along with other backports, and tested
> > > manually on riscv64-elf.  Ok to install?
> >
> > Sorry that test is causing a problem, I hadn't realized at first that
> > the wrong output was target-dependent. I feel like simply deleting
> > this test g++.dg/pch/line-map-3.C from GCC 13 branch should be fine
> > too, as a safer alternative, if release managers prefer?
>
> Yes please.
>
> Richard.

Committed that removal as r13-8353.

-Lewis


Re: [PATCH] libcpp: Improve location for macro names [PR66290]

2024-02-20 Thread Lewis Hyatt
On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva  wrote:
>
> This backport for gcc-13 is the first of two required for the
> g++.dg/pch/line-map-3.C test to stop hitting a variant of the known
> problem mentioned in that testcase: on riscv64-elf and riscv32-elf,
> after restoring the PCH, the location of the macros is mentioned as if
> they were on line 3 rather than 2, so even the existing xfails fail.  I
> think this might be too much to backport, and I'm ready to use an xfail
> instead, but since this would bring more predictability, I thought I'd
> ask whether you'd find this backport acceptable.
>
> Regstrapped on x86_64-linux-gnu, along with other backports, and tested
> manually on riscv64-elf.  Ok to install?

Sorry that test is causing a problem, I hadn't realized at first that
the wrong output was target-dependent. I feel like simply deleting
this test g++.dg/pch/line-map-3.C from GCC 13 branch should be fine
too, as a safer alternative, if release managers prefer? It doesn't
really need to be on the branch; its only purpose is to remind me to
fix the underlying issue for GCC 15...

-Lewis


ping: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2024-02-16 Thread Lewis Hyatt
CCing some global reviewers as well, in case anyone has a minute to
take a look please? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

On Thu, Jan 25, 2024 at 4:57 PM Lewis Hyatt  wrote:
>
> May I please ask again about this one? It's just a couple lines, and I
> think it fixes an important gap in the logic for #pragma GCC
> diagnostic. The PR was not reported by me so I think at least one
> other person does care about it :). Thanks!
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
>
> -Lewis
>
> On Mon, Jan 8, 2024 at 6:53 PM Lewis Hyatt  wrote:
> >
> > Can I please ping this one again? It's 3 lines or so to fix the PR. Thanks!
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> >
> > On Tue, Dec 19, 2023 at 6:20 PM Lewis Hyatt  wrote:
> > >
> > > Hello-
> > >
> > > May I please ping this one? Thanks...
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> > >
> > > -Lewis
> > >
> > > On Wed, Nov 29, 2023 at 7:05 PM Lewis Hyatt  wrote:
> > > >
> > > > On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> > > > >
> > > > > This patch fixes the behavior of `#pragma GCC diagnostic pop' for
> > > > > permissive error diagnostics such as -Wnarrowing (in C++11). Those
> > > > > currently do not return to the correct state after the last pop; they
> > > > > become effectively simple warnings instead. Bootstrap + regtest all
> > > > > languages on x86-64, does it look OK please? Thanks!
> > > >
> > > > Hello-
> > > >
> > > > May I please ping this bug fix?
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html
> > > >
> > > > > Please note, it requires a trivial rebase on top of recent changes to
> > > > > the class diagnostic_context public interface. I attached the rebased
> > > > > patch here as well. Thanks!
> > > >
> > > > -Lewis


ping: [PATCH] libcpp: Support extended characters for #pragma {push,pop}_macro [PR109704]

2024-02-10 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html

May I please ping this one? Thanks!

On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704
>
> The below patch fixes the issue noted in the PR that extended characters
> cannot appear in the identifier passed to a #pragma push_macro or #pragma
> pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
> GCC 13 please?
>
> I know we just entered stage 4, however I feel this is kinda like an old
> regression, given that the issue was not apparent until support for UCNs and
> UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
> GCC 13, because AFAIK all other UTF-8-related bugs are fixed in this
> release. (The other major one was for extended characters in a user-defined
> literal, that was fixed by r14-2629).
>
> Speaking of just entering stage 4. I do have 4 really short patches sent
> over the past several months that never got any response. Is there any
> chance someone may have a few minutes to look at them please? They are
> really just like 1-3 line fixes for PRs.
>
> libcpp (pinged once recently):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
>
> diagnostics (pinged for 3rd time last week):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

> -- >8 --
>
> The implementation of #pragma push_macro and #pragma pop_macro has to date
> made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
> identifier out of a string. When support was added for extended characters
> in identifiers ($, UCNs, or UTF-8), that support was added only for the
> "normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
> not for the ad-hoc way. Consequently, extended identifiers are not usable
> with these pragmas.
>
> The logic for lexing identifiers has become more complicated than it was
> when _cpp_lex_identifier() was written -- it now handles things like \N{}
> escapes in C++, for instance -- and it no longer seems practical to maintain
> a redundant code path for lexing identifiers. Address the issue by changing
> the implementation of #pragma {push,pop}_macro to lex identifiers in the
> expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
> there.
>
> The existing implementation has some quirks because of the ad-hoc parsing
> logic. For example:
>
>  #pragma push_macro("X ")
>  ...
>  #pragma pop_macro("X")
>
> will not restore macro X (note the extra space in the first string). However:
>
>  #pragma push_macro("X ")
>  ...
>  #pragma pop_macro("X ")
>
> actually does successfully restore "X". This is because the key for looking
> up the saved macro on the push stack is the original string passed, so the
> string passed to pop_macro needs to match it exactly. It is not that easy to
> reproduce this logic in the world of extended characters, given that for
> example it should be valid to pass a UCN to push_macro, and the
> corresponding UTF-8 to pop_macro. Given that this aspect of the existing
> behavior seems unintentional and has no tests (and does not match other
> implementations), I opted to make the new logic more straightforward. The
> string passed needs to lex to one token, which must be a valid identifier,
> or else no action is taken and no error is generated. Any diagnostics
> encountered during lexing (e.g., due to a UTF-8 character not permitted to
> appear in an identifier) are also suppressed.
>
> It could be nice (for GCC 15) to also add a warning if a pop_macro does not
> match a previous push_macro.
>
> libcpp/ChangeLog:
>
> PR preprocessor/109704
> * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
> * errors.cc
> (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
> function.
> (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
> function.
> * charset.cc (noop_diagnostic_cb): Remove.
> (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
> into new class cpp_auto_suppress_diagnostics.
> (count_source_chars): Likewise.
> * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
> (lex_identifier_from_string): New static helper function.
> (push_pop_macro_common): Refactor common logic from
> do_pragma_push_macro and do_pragma_pop_macro; use
> lex_identifier_from_string instead of _cpp_lex_identifier.
> (d
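
As background for the pragmas discussed in this message, the basic save and
restore behavior of #pragma push_macro and #pragma pop_macro looks roughly
like the sketch below. The macro TOY_VALUE and the two helper functions are
hypothetical names chosen for illustration; the example uses a plain ASCII
identifier and does not exercise the extended-character case from the PR or
the trailing-space quirk described above.

```cpp
#include <cassert>

#define TOY_VALUE 1
#pragma push_macro("TOY_VALUE")  // save the current definition
#undef TOY_VALUE
#define TOY_VALUE 2

// Expands TOY_VALUE while the temporary definition is in effect.
inline int toy_value_while_redefined () { return TOY_VALUE; }

#pragma pop_macro("TOY_VALUE")   // restore the saved definition

// Expands TOY_VALUE after the original definition has been restored.
inline int toy_value_after_pop () { return TOY_VALUE; }
```

The string given to pop_macro names the same identifier that was given to
push_macro, which is the pairing the patch makes independent of exact string
spelling.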

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-02-01 Thread Lewis Hyatt
On Thu, Feb 1, 2024 at 7:24 AM Rainer Orth wrote:
>
> Hi Lewis,
>
> > On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:
> >> On 12/5/23 20:52, Lewis Hyatt wrote:
> >> > Hello-
> >> >
> >> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> >> >
> >> > There are two related issues here really, a regression since GCC 11 where we
> >> > can ICE after restoring a PCH, and a deeper issue with bogus locations
> >> > assigned to macros that were defined prior to restoring a PCH.  This patch
> >> > fixes the ICE regression with a simple change, and I think it's appropriate
> >> > for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> >> > not generally causing an ICE, and mostly affecting only the output of
> >> > -Wunused-macros) are not as problematic, and will be harder to fix. I could
> >> > take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> >> > tests for the wrong locations (as well as passing tests for the regression
> >> > fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> >> > Linux. Thanks!
> >>
> >> OK for trunk and branches, thanks!
> >>
> >
> > Thanks for the review! That is all taken care of. I have one more request if
> > you don't mind please... There have been some further comments on the PR
> > indicating that the new xfailed testcase I added is failing in an unexpected
> > way on at least one architecture. To recap, the idea here was that
> >
> > 1) libcpp needs new logic to be able to output correct locations for this
> > case. That will be some new code that is suitable for stage 1, not now.
> >
> > 2) In the meantime, we fixed things up enough to avoid an ICE that showed up
> > in GCC 11, and added an xfailed testcase to remind about #1.
> >
> > The problem is that, the reason that libcpp outputs the wrong locations, is
> > that it has always used a location from the old line_map instance to index
> > into the new line_map instance, and so the exact details of the wrong
> > locations it outputs depend on the state of those two line maps, which may
> > differ depending on system includes and things like that. So I was hoping to
> > make one further one-line change to libcpp, not yet to output correct
> > locations, but at least to output one which is the same always and doesn't
> > depend on random things. This would assign all restored macros to a
> > consistent location, one line following the #include that triggered the PCH
> > process. I think this probably shouldn't be backported but it would be nice
> > to get into GCC 14, while nothing critical, at least it would avoid the new
> > test failure that's being reported. But more generally, I think using a
> > location from a totally different line map is dangerous and could have worse
> > consequences that haven't been seen yet. Does it look OK please? Thanks!
>
> FWIW, I've tested this (the initial) version of this patch on
> sparc-sun-solaris2.11 (PASSes as before) and i386-pc-solaris2.11 (PASSes
> now unlike before).
>
> Thanks.
> Rainer

Thanks a lot! And sorry for that issue. I will push the updated
version of that patch shortly.


Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-31 Thread Lewis Hyatt
On Wed, Jan 31, 2024 at 03:18:01PM -0500, Jason Merrill wrote:
> On 1/30/24 21:49, Lewis Hyatt wrote:
> > On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:
> > > On 12/5/23 20:52, Lewis Hyatt wrote:
> > > > Hello-
> > > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> > > > 
> > > > There are two related issues here really, a regression since GCC 11 where we
> > > > can ICE after restoring a PCH, and a deeper issue with bogus locations
> > > > assigned to macros that were defined prior to restoring a PCH.  This patch
> > > > fixes the ICE regression with a simple change, and I think it's appropriate
> > > > for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> > > > not generally causing an ICE, and mostly affecting only the output of
> > > > -Wunused-macros) are not as problematic, and will be harder to fix. I could
> > > > take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> > > > tests for the wrong locations (as well as passing tests for the regression
> > > > fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> > > > Linux. Thanks!
> > > 
> > > OK for trunk and branches, thanks!
> > > 
> > 
> > Thanks for the review! That is all taken care of. I have one more request if
> > you don't mind please... There have been some further comments on the PR
> > indicating that the new xfailed testcase I added is failing in an unexpected
> > way on at least one architecture. To recap, the idea here was that
> > 
> > 1) libcpp needs new logic to be able to output correct locations for this
> > case. That will be some new code that is suitable for stage 1, not now.
> > 
> > 2) In the meantime, we fixed things up enough to avoid an ICE that showed up
> > in GCC 11, and added an xfailed testcase to remind about #1.
> > 
> > The problem is that, the reason that libcpp outputs the wrong locations, is
> > that it has always used a location from the old line_map instance to index
> > into the new line_map instance, and so the exact details of the wrong
> > locations it outputs depend on the state of those two line maps, which may
> > differ depending on system includes and things like that. So I was hoping to
> > make one further one-line change to libcpp, not yet to output correct
> > locations, but at least to output one which is the same always and doesn't
> > depend on random things. This would assign all restored macros to a
> > consistent location, one line following the #include that triggered the PCH
> > process. I think this probably shouldn't be backported but it would be nice
> > to get into GCC 14, while nothing critical, at least it would avoid the new
> > test failure that's being reported. But more generally, I think using a
> > location from a totally different line map is dangerous and could have worse
> > consequences that haven't been seen yet. Does it look OK please? Thanks!
> 
> Can we use the line of the #include, as the test expects, rather than the
> following line?

Thanks, yes, that will work too, it just needs a few changes to
c-family/c-pch.cc to set the location there and then increment it
after. Patch which does that is attached. (This is a new one based on
master, not incremental to the prior patch.) The testcase does not require
any changes this way, and bootstrap + regtest looks good.

-Lewis
[PATCH] c-family: Stabilize the location for macros restored after PCH load [PR105608]

libcpp currently lacks the infrastructure to assign correct locations to
macros that were defined prior to loading a PCH and then restored
afterwards. While I plan to address that fully for GCC 15, this patch
improves things by using at least a valid location, even if it's not the
best one. Without this change, libcpp uses pfile->directive_line as the
location for the restored macros, but this location_t applies to the old
line map, not the one that was just restored from the PCH, so the resulting
location is unpredictable and depends on what was stored in the line maps
before. With this change, all restored macros get assigned locations at the
line of the #include that triggered the PCH restore. A future patch will
store the actual file name and line number of each definition and then
synthesize locations in the new line map pointing to the right place.

gcc/c-family/ChangeLog:

PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Adjust line map so that libcpp

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-30 Thread Lewis Hyatt
On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:
> On 12/5/23 20:52, Lewis Hyatt wrote:
> > Hello-
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> > 
> > There are two related issues here really, a regression since GCC 11 where we
> > can ICE after restoring a PCH, and a deeper issue with bogus locations
> > assigned to macros that were defined prior to restoring a PCH.  This patch
> > fixes the ICE regression with a simple change, and I think it's appropriate
> > for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> > not generally causing an ICE, and mostly affecting only the output of
> > -Wunused-macros) are not as problematic, and will be harder to fix. I could
> > take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> > tests for the wrong locations (as well as passing tests for the regression
> > fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> > Linux. Thanks!
> 
> OK for trunk and branches, thanks!
>

Thanks for the review! That is all taken care of. I have one more request if
you don't mind please... There have been some further comments on the PR
indicating that the new xfailed testcase I added is failing in an unexpected
way on at least one architecture. To recap, the idea here was that

1) libcpp needs new logic to be able to output correct locations for this
case. That will be some new code that is suitable for stage 1, not now.

2) In the meantime, we fixed things up enough to avoid an ICE that showed up
in GCC 11, and added an xfailed testcase to remind about #1.

The problem is that, the reason that libcpp outputs the wrong locations, is
that it has always used a location from the old line_map instance to index
into the new line_map instance, and so the exact details of the wrong
locations it outputs depend on the state of those two line maps, which may
differ depending on system includes and things like that. So I was hoping to
make one further one-line change to libcpp, not yet to output correct
locations, but at least to output one which is the same always and doesn't
depend on random things. This would assign all restored macros to a
consistent location, one line following the #include that triggered the PCH
process. I think this probably shouldn't be backported but it would be nice
to get into GCC 14, while nothing critical, at least it would avoid the new
test failure that's being reported. But more generally, I think using a
location from a totally different line map is dangerous and could have worse
consequences that haven't been seen yet. Does it look OK please? Thanks!

-Lewis
[PATCH] libcpp: Stabilize the location for macros restored after PCH [PR105608]

libcpp currently lacks the infrastructure to assign correct locations to
macros that were defined prior to loading a PCH and then restored
afterwards. While I plan to address that for GCC 15, this one-line patch
improves things by using at least a valid location, even if it's not the
best one. Without this change, libcpp uses pfile->directive_line as the
location for the restored macros, but this location_t applies to the old
line map, not the one that was just restored from the PCH, so the resulting
location is unpredictable and depends on what was stored in the line maps
before. With this change, all restored macros get assigned locations at
line_table->highest_line, which is the first line after the #include that
triggered the PCH restore. A future patch will store the actual file name
and line number of the definition and then synthesize locations in the new
line map pointing to the right place.

libcpp/ChangeLog:

PR preprocessor/105608
* pch.cc (cpp_read_state): Set a valid location for restored
macros.

gcc/testsuite/ChangeLog:

PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Adjust to expect the new location.

diff --git a/libcpp/pch.cc b/libcpp/pch.cc
index e156fe257b3..f2f74ed6ea9 100644
--- a/libcpp/pch.cc
+++ b/libcpp/pch.cc
@@ -838,7 +838,14 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f,
               != NULL)
             {
               _cpp_clean_line (r);
-              if (!_cpp_create_definition (r, h, 0))
+
+              /* ??? Using r->line_table->highest_line is not ideal here, but we
+                 do need to use some location that is relative to the new line
+                 map just loaded, not the old one that was in effect when these
+                 macros were lexed.  The proper fix is to remember the file name
+                 and line number where each macro was defined, and then add
+                 these locations into the new line map.  See PR105608.  */
+              if (!_cpp_create_definition (r, h, r->line_table->highest_line))
                 abort ();
               _cpp_pop_buffer (r);
             }
diff --git a/gcc/te

ping: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-25 Thread Lewis Hyatt
Hello-

May I please ping this small patch? Thanks
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

-Lewis

On Wed, Dec 20, 2023 at 8:02 PM Lewis Hyatt  wrote:
>
> Hello-
>
> May I please ping this PCH patch? Thanks!
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html
>
> -Lewis
>
> On Tue, Dec 5, 2023 at 8:52 PM Lewis Hyatt  wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> >
> > There are two related issues here really, a regression since GCC 11 where we
> > can ICE after restoring a PCH, and a deeper issue with bogus locations
> > assigned to macros that were defined prior to restoring a PCH.  This patch
> > fixes the ICE regression with a simple change, and I think it's appropriate
> > for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> > not generally causing an ICE, and mostly affecting only the output of
> > -Wunused-macros) are not as problematic, and will be harder to fix. I could
> > take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> > tests for the wrong locations (as well as passing tests for the regression
> > fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> > Linux. Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > Users are allowed to define macros prior to restoring a precompiled header
> > file, as long as those macros are not defined (or are defined identically)
> > in the PCH.  However, the PCH restoration process destroys all the macro
> > definitions, so libcpp has to record them before restoring the PCH and then
> > redefine them afterward.
> >
> > This process does not currently assign great locations to the macros after
> > redefining them. Some work is needed to also remember the original locations
> > and get the line_maps instance in the right state (since, like all other
> > data structures, the line_maps instance is also reset after restoring a PCH).
> > The new testcase line-map-3.C contains XFAILed examples where the locations
> > are wrong.
> >
> > This patch addresses a more pressing issue, which is that we ICE in some
> > cases since GCC 11, hitting an assert in line-maps.cc. It happens if the
> > first line encountered after the PCH restore requires an LC_RENAME map, such
> > as will happen if the line is sufficiently long.  This is much easier to
> > fix, since we just need to call linemap_line_start before asking libcpp to
> > redefine the stored macros, instead of afterward, to avoid the unexpected
> > need for an LC_RENAME before an LC_ENTER has been seen.
> >
> > gcc/c-family/ChangeLog:
> >
> > PR preprocessor/105608
> > * c-pch.cc (c_common_read_pch): Start a new line map before asking
> > libcpp to restore macros defined prior to reading the PCH, instead
> > of afterward.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR preprocessor/105608
> > * g++.dg/pch/line-map-1.C: New test.
> > * g++.dg/pch/line-map-1.Hs: New test.
> > * g++.dg/pch/line-map-2.C: New test.
> > * g++.dg/pch/line-map-2.Hs: New test.
> > * g++.dg/pch/line-map-3.C: New test.
> > * g++.dg/pch/line-map-3.Hs: New test.
> > ---
> >  gcc/c-family/c-pch.cc  |  5 ++---
> >  gcc/testsuite/g++.dg/pch/line-map-1.C  |  4 
> >  gcc/testsuite/g++.dg/pch/line-map-1.Hs |  1 +
> >  gcc/testsuite/g++.dg/pch/line-map-2.C  |  6 ++
> >  gcc/testsuite/g++.dg/pch/line-map-2.Hs |  1 +
> >  gcc/testsuite/g++.dg/pch/line-map-3.C  | 23 +++
> >  gcc/testsuite/g++.dg/pch/line-map-3.Hs |  1 +
> >  7 files changed, 38 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.Hs
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.Hs
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.Hs
> >
> > diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
> > index 2f014fca210..9ee6f179002 100644
> > --- a/gcc/c-family/c-pch.cc
> > +++ b/gcc/c-family/c-pch.cc
> > @@ -342,6 +342,8 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
> >gt_pch_restore (f);
> >cpp_set_line_map (pfile, line_table);
> >rebuild_location_adhoc_htab (line_table);
> > +  line_table->trace_includes = 

ping: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2024-01-25 Thread Lewis Hyatt
May I please ask again about this one? It's just a couple lines, and I
think it fixes an important gap in the logic for #pragma GCC
diagnostic. The PR was not reported by me so I think at least one
other person does care about it :). Thanks!

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

-Lewis

On Mon, Jan 8, 2024 at 6:53 PM Lewis Hyatt  wrote:
>
> Can I please ping this one again? It's 3 lines or so to fix the PR. Thanks!
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
>
> On Tue, Dec 19, 2023 at 6:20 PM Lewis Hyatt  wrote:
> >
> > Hello-
> >
> > May I please ping this one? Thanks...
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> >
> > -Lewis
> >
> > On Wed, Nov 29, 2023 at 7:05 PM Lewis Hyatt  wrote:
> > >
> > > On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> > > >
> > > > This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
> > > > error diagnostics such as -Wnarrowing (in C++11). Those currently do not
> > > > return to the correct state after the last pop; they become effectively
> > > > simple warnings instead. Bootstrap + regtest all languages on x86-64, does
> > > > it look OK please? Thanks!
> > >
> > > Hello-
> > >
> > > May I please ping this bug fix?
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html
> > >
> > > Please note, it requires a trivial rebase on top of recent changes to
> > > the class diagnostic_context public interface. I attached the rebased patch
> > > here as well. Thanks!
> > >
> > > -Lewis


[PATCH] libcpp: Support extended characters for #pragma {push, pop}_macro [PR109704]

2024-01-13 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704

The below patch fixes the issue noted in the PR that extended characters
cannot appear in the identifier passed to a #pragma push_macro or #pragma
pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
GCC 14 please?

I know we just entered stage 4, however I feel this is kinda like an old
regression, given that the issue was not apparent until support for UCNs and
UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
GCC 14, because AFAIK all other UTF-8-related bugs are fixed in this
release. (The other major one was for extended characters in a user-defined
literal, that was fixed by r14-2629).

Speaking of just entering stage 4, I do have 4 really short patches sent
over the past several months that never got any response. Is there any
chance someone may have a few minutes to look at them please? They are
really just like 1-3 line fixes for PRs.

libcpp (pinged once recently):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html

diagnostics (pinged for 3rd time last week):
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

c-family/PCH (pinged a month ago):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

Thanks!

-Lewis

-- >8 --

The implementation of #pragma push_macro and #pragma pop_macro has to date
made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
identifier out of a string. When support was added for extended characters
in identifiers ($, UCNs, or UTF-8), that support was added only for the
"normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
not for the ad-hoc way. Consequently, extended identifiers are not usable
with these pragmas.

The logic for lexing identifiers has become more complicated than it was
when _cpp_lex_identifier() was written -- it now handles things like \N{}
escapes in C++, for instance -- and it no longer seems practical to maintain
a redundant code path for lexing identifiers. Address the issue by changing
the implementation of #pragma {push,pop}_macro to lex identifiers in the
expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
there.

The existing implementation has some quirks because of the ad-hoc parsing
logic. For example:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X")

will not restore macro X (note the extra space in the first string). However:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X ")

actually does successfully restore "X". This is because the key for looking
up the saved macro on the push stack is the original string passed, so the
string passed to pop_macro needs to match it exactly. It is not that easy to
reproduce this logic in the world of extended characters, given that for
example it should be valid to pass a UCN to push_macro, and the
corresponding UTF-8 to pop_macro. Given that this aspect of the existing
behavior seems unintentional and has no tests (and does not match other
implementations), I opted to make the new logic more straightforward. The
string passed needs to lex to one token, which must be a valid identifier,
or else no action is taken and no error is generated. Any diagnostics
encountered during lexing (e.g., due to a UTF-8 character not permitted to
appear in an identifier) are also suppressed.
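
As a rough illustration (not part of the patch), the basic push/pop behavior
being discussed can be sketched as below for a plain ASCII identifier, where
the string passed to both pragmas is exactly the macro name. The function
names value_while_pushed and value_after_pop are made up for the example.

```c
/* Sketch only: shows #pragma push_macro / pop_macro saving and restoring
   a macro definition.  The function names are hypothetical.  */
#define X 1

#pragma push_macro("X")   /* save the current definition of X (1) */
#undef X
#define X 2               /* temporary redefinition */
int value_while_pushed (void) { return X; }   /* expands X as 2 here */

#pragma pop_macro("X")    /* restore the saved definition of X */
int value_after_pop (void) { return X; }      /* expands X as 1 here */
```

With the patch, the same pattern is intended to work when the identifier
contains extended characters as well.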

It could be nice (for GCC 15) to also add a warning if a pop_macro does not
match a previous push_macro.

libcpp/ChangeLog:

PR preprocessor/109704
* include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
* errors.cc
(cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
function.
(cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
function.
* charset.cc (noop_diagnostic_cb): Remove.
(cpp_interpret_string_ranges): Refactor diagnostic suppression logic
into new class cpp_auto_suppress_diagnostics.
(count_source_chars): Likewise.
* directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
(lex_identifier_from_string): New static helper function.
(push_pop_macro_common): Refactor common logic from
do_pragma_push_macro and do_pragma_pop_macro; use
lex_identifier_from_string instead of _cpp_lex_identifier.
(do_pragma_push_macro): Reimplement using push_pop_macro_common.
(do_pragma_pop_macro): Likewise.
* internal.h (_cpp_lex_identifier): Remove.
* lex.cc (lex_identifier_intern): Remove.
(_cpp_lex_identifier): Remove.

gcc/testsuite/ChangeLog:

PR preprocessor/109704
* c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
* g++.dg/pch/pushpop-2.C: New test.
* g++.dg/pch/pushpop-2.Hs: New test.
* gcc.dg/pch/pushpop-2.c: New test.
* gcc.dg/pch/pushpop-2.hs: New test.
---
 libcpp/charset.cc   

ping: [PATCH] libcpp: Fix __has_include_next ICE in the last directory of the path [PR80755]

2024-01-11 Thread Lewis Hyatt
Can I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html

-Lewis

On Thu, Dec 21, 2023 at 7:37 AM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755
>
> Here is a short fix for the ICE in libcpp noted in the PR. Bootstrap +
> regtest all languages on x86-64 Linux. Is it OK please? Thanks!
>
> -Lewis
>
> -- >8 --
>
> In libcpp/files.cc, the function _cpp_has_header(), which implements
> __has_include and __has_include_next, does not check for a NULL return value
> from search_path_head(), leading to an ICE tripping an assert when
> _cpp_find_file() tries to use it. Fix it by checking for that case and
> silently returning false instead.
>
> As suggested by the PR author, it is easiest to make a testcase by using
> the -idirafter option. To enable that, also modify the dg-additional-options
> testsuite procedure to make the global $srcdir available, since -idirafter
> requires the full path.
>
> libcpp/ChangeLog:
>
> PR preprocessor/80755
> * files.cc (search_path_head): Add SUPPRESS_DIAGNOSTIC argument
> defaulting to false.
> (_cpp_has_header): Silently return false if the search path has been
> exhausted, rather than issuing a diagnostic and then hitting an
> assert.
>
> gcc/testsuite/ChangeLog:
>
> * lib/gcc-defs.exp (dg-additional-options): Make $srcdir usable in a
> dg-additional-options directive.
> * c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h: New test.
> * c-c++-common/cpp/has-include-next-2.c: New test.
> ---
>  libcpp/files.cc  | 12 
>  .../cpp/has-include-next-2-dir/has-include-next-2.h  |  3 +++
>  gcc/testsuite/c-c++-common/cpp/has-include-next-2.c  |  4 
>  gcc/testsuite/lib/gcc-defs.exp   |  1 +
>  4 files changed, 16 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
>
> diff --git a/libcpp/files.cc b/libcpp/files.cc
> index 27301d79fa4..aaab4b13a6a 100644
> --- a/libcpp/files.cc
> +++ b/libcpp/files.cc
> @@ -181,7 +181,8 @@ static bool read_file_guts (cpp_reader *pfile, _cpp_file *file,
>  static bool read_file (cpp_reader *pfile, _cpp_file *file,
>location_t loc);
>  static struct cpp_dir *search_path_head (cpp_reader *, const char *fname,
> -int angle_brackets, enum include_type);
> +int angle_brackets, enum include_type,
> +bool suppress_diagnostic = false);
>  static const char *dir_name_of_file (_cpp_file *file);
>  static void open_file_failed (cpp_reader *pfile, _cpp_file *file, int,
>   location_t);
> @@ -1041,7 +1042,7 @@ _cpp_mark_file_once_only (cpp_reader *pfile, _cpp_file *file)
> nothing left in the path, returns NULL.  */
>  static struct cpp_dir *
>  search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
> - enum include_type type)
> + enum include_type type, bool suppress_diagnostic)
>  {
>cpp_dir *dir;
>_cpp_file *file;
> @@ -1070,7 +1071,7 @@ search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
>  return make_cpp_dir (pfile, dir_name_of_file (file),
>  pfile->buffer ? pfile->buffer->sysp : 0);
>
> -  if (dir == NULL)
> +  if (dir == NULL && !suppress_diagnostic)
>  cpp_error (pfile, CPP_DL_ERROR,
>"no include path in which to search for %s", fname);
>
> @@ -2164,7 +2165,10 @@ bool
>  _cpp_has_header (cpp_reader *pfile, const char *fname, int angle_brackets,
>  enum include_type type)
>  {
> -  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type);
> +  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type,
> +/* suppress_diagnostic = */ true);
> +  if (!start_dir)
> +return false;
>_cpp_file *file = _cpp_find_file (pfile, fname, start_dir, angle_brackets,
> _cpp_FFK_HAS_INCLUDE, 0);
>return file->err_no != ENOENT;
> diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h b/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
> new file mode 100644
> index 000..1e4be6ce7a3
> --- /dev/null
> +++ 
> b/gcc/testsu

ping^3: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2024-01-08 Thread Lewis Hyatt
Can I please ping this one again? It's 3 lines or so to fix the PR. Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

On Tue, Dec 19, 2023 at 6:20 PM Lewis Hyatt  wrote:
>
> Hello-
>
> May I please ping this one? Thanks...
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
>
> -Lewis
>
> On Wed, Nov 29, 2023 at 7:05 PM Lewis Hyatt  wrote:
> >
> > On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> > >
> > > This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
> > > error diagnostics such as -Wnarrowing (in C++11). Those currently do not
> > > return to the correct state after the last pop; they become effectively
> > > simple warnings instead. Bootstrap + regtest all languages on x86-64, does
> > > it look OK please? Thanks!
> >
> > Hello-
> >
> > May I please ping this bug fix?
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html
> >
> > Please note, it requires a trivial rebase on top of recent changes to
> > the class diagnostic_context public interface. I attached the rebased patch
> > here as well. Thanks!
> >
> > -Lewis


Re: [PATCH v2] libstdc++: Add Unicode-aware width estimation for std::format

2024-01-06 Thread Lewis Hyatt
On Sat, Jan 6, 2024 at 11:40 AM Jonathan Wakely  wrote:
>
> Here's a V2 patch which addresses the two things I mentioned: the new
> Python script now generates a complete file that can just be included by
> , and the full Unicode 15.1.0 grapheme cluster break
> rules are supported (I think ... more testing needed for some of the
> complex rules).
>
> -- >8 --

Thanks, by the way, for fixing the typo in gen_wcwidth.py.
One thing I wanted to point out, the file contrib/unicode/README
contains a list of steps to follow in order to update to a new Unicode
version. There are 10 or so steps to generate everything libcpp and
diagnostics care about. Do you think it's worth adding something for
the new libstdc++ parts there too? I guess it may not be desirable to
update them always at the same time though.

-Lewis


ping: [PATCH] libcpp: Fix macro expansion for argument of __has_include [PR110558]

2024-01-03 Thread Lewis Hyatt
Hello-

May I please ping this one? Thanks...

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html

-Lewis

On Tue, Dec 12, 2023 at 6:18 PM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558
>
> This is a small fix for the libcpp issue noted in the PR. Bootstrap +
> regtest all languages on x86-64 Linux. Is it ok for trunk please?
>
> Also, it's not a regression, having never worked since __has_include was
> introduced in GCC 5, but FWIW the fix would backport fine to all branches
> since then... so I think backport to 11,12,13 would make sense assuming the
> patch is OK. Thanks!
>
> -Lewis
>
> -- >8 --
>
> When the file name for a #include directive is the result of stringifying a
> macro argument, libcpp needs to take some care to get the whitespace
> correct; in particular stringify_arg() needs to see a CPP_PADDING token
> between macro tokens so that it can figure out when to output space between
> tokens. The CPP_PADDING tokens are not normally generated when handling a
> preprocessor directive, but for #include-like directives, libcpp sets the
> state variable pfile->state.directive_wants_padding to TRUE so that the
> CPP_PADDING tokens will be output, and then everything works fine for
> computed includes.
>
> As the PR points out, things do not work fine for __has_include. Fix that by
> setting the state variable the same as is done for #include.
>
> libcpp/ChangeLog:
>
> PR preprocessor/110558
> * macro.cc (builtin_has_include): Set
> pfile->state.directive_wants_padding prior to lexing the
> file name, in case it comes from macro expansion.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/110558
> * c-c++-common/cpp/has-include-2.c: New test.
> * c-c++-common/cpp/has-include-2.h: New test.
> ---
>  libcpp/macro.cc|  3 +++
>  gcc/testsuite/c-c++-common/cpp/has-include-2.c | 12 
>  gcc/testsuite/c-c++-common/cpp/has-include-2.h |  1 +
>  3 files changed, 16 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-2.c
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-2.h
>
> diff --git a/libcpp/macro.cc b/libcpp/macro.cc
> index 6f24a9d6f3a..15140c60023 100644
> --- a/libcpp/macro.cc
> +++ b/libcpp/macro.cc
> @@ -398,6 +398,8 @@ builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next)
>NODE_NAME (op));
>
>pfile->state.angled_headers = true;
> +  const auto sav_padding = pfile->state.directive_wants_padding;
> +  pfile->state.directive_wants_padding = true;
>const cpp_token *token = cpp_get_token_no_padding (pfile);
>bool paren = token->type == CPP_OPEN_PAREN;
>if (paren)
> @@ -406,6 +408,7 @@ builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next)
>  cpp_error (pfile, CPP_DL_ERROR,
>"missing '(' before \"%s\" operand", NODE_NAME (op));
>pfile->state.angled_headers = false;
> +  pfile->state.directive_wants_padding = sav_padding;
>
>bool bracket = token->type != CPP_STRING;
>char *fname = NULL;
> diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-2.c b/gcc/testsuite/c-c++-common/cpp/has-include-2.c
> new file mode 100644
> index 000..5cd00cb3fb5
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/has-include-2.c
> @@ -0,0 +1,12 @@
> +/* PR preprocessor/110558 */
> +/* { dg-do preprocess } */
> +#define STRINGIZE(x) #x
> +#define GET_INCLUDE(i) STRINGIZE(has-include-i.h)
> +/* Spaces surrounding the macro args previously caused a problem for __has_include().  */
> +#if __has_include(GET_INCLUDE(2)) && __has_include(GET_INCLUDE( 2)) && __has_include(GET_INCLUDE( 2 ))
> +#include GET_INCLUDE(2)
> +#include GET_INCLUDE( 2)
> +#include GET_INCLUDE( 2 )
> +#else
> +#error "__has_include did not handle padding properly" /* { dg-bogus "__has_include" } */
> +#endif
> diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-2.h b/gcc/testsuite/c-c++-common/cpp/has-include-2.h
> new file mode 100644
> index 000..57c402b32a8
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/has-include-2.h
> @@ -0,0 +1 @@
> +/* PR preprocessor/110558 */


[PATCH] libcpp: Fix __has_include_next ICE in the last directory of the path [PR80755]

2023-12-21 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755

Here is a short fix for the ICE in libcpp noted in the PR. Bootstrap +
regtest all languages on x86-64 Linux. Is it OK please? Thanks!

-Lewis

-- >8 --

In libcpp/files.cc, the function _cpp_has_header(), which implements
__has_include and __has_include_next, does not check for a NULL return value
from search_path_head(), leading to an ICE tripping an assert when
_cpp_find_file() tries to use it. Fix it by checking for that case and
silently returning false instead.

As suggested by the PR author, it is easiest to make a testcase by using
the -idirafter option. To enable that, also modify the dg-additional-options
testsuite procedure to make the global $srcdir available, since -idirafter
requires the full path.
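
To make the failure mode concrete, here is a simplified, self-contained model
of the check being added. It is only a sketch under stated assumptions: the
struct dir_entry type and the functions next_search_dir, find_from and
has_include_next are hypothetical stand-ins, not the real libcpp cpp_dir,
search_path_head, _cpp_find_file or _cpp_has_header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the include search path.
   Each directory "contains" exactly one file, to keep the model small.  */
struct dir_entry
{
  const char *name;
  const char *only_file;
  struct dir_entry *next;
};

/* Model of starting the search after the current directory.  Returns NULL
   when the current directory is the last one in the path.  */
const struct dir_entry *
next_search_dir (const struct dir_entry *current)
{
  return current ? current->next : NULL;
}

/* Model of the file lookup.  Like the code that asserted in the PR, it
   requires a non-NULL starting directory.  */
bool
find_from (const struct dir_entry *start, const char *fname)
{
  assert (start != NULL);
  for (const struct dir_entry *d = start; d; d = d->next)
    if (strcmp (d->only_file, fname) == 0)
      return true;
  return false;
}

/* Model of the fixed behavior: if the path is exhausted, silently answer
   "no" instead of passing NULL on to the lookup.  */
bool
has_include_next (const struct dir_entry *current, const char *fname)
{
  const struct dir_entry *start = next_search_dir (current);
  if (!start)
    return false;
  return find_from (start, fname);
}
```

In this model, asking from the last directory in the path simply yields
false, which mirrors what the patch aims for in _cpp_has_header().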

libcpp/ChangeLog:

PR preprocessor/80755
* files.cc (search_path_head): Add SUPPRESS_DIAGNOSTIC argument
defaulting to false.
(_cpp_has_header): Silently return false if the search path has been
exhausted, rather than issuing a diagnostic and then hitting an
assert.

gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp (dg-additional-options): Make $srcdir usable in a
dg-additional-options directive.
* c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h: New test.
* c-c++-common/cpp/has-include-next-2.c: New test.
---
 libcpp/files.cc  | 12 
 .../cpp/has-include-next-2-dir/has-include-next-2.h  |  3 +++
 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c  |  4 
 gcc/testsuite/lib/gcc-defs.exp   |  1 +
 4 files changed, 16 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
 create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c

diff --git a/libcpp/files.cc b/libcpp/files.cc
index 27301d79fa4..aaab4b13a6a 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -181,7 +181,8 @@ static bool read_file_guts (cpp_reader *pfile, _cpp_file *file,
 static bool read_file (cpp_reader *pfile, _cpp_file *file,
   location_t loc);
 static struct cpp_dir *search_path_head (cpp_reader *, const char *fname,
-int angle_brackets, enum include_type);
+int angle_brackets, enum include_type,
+bool suppress_diagnostic = false);
 static const char *dir_name_of_file (_cpp_file *file);
 static void open_file_failed (cpp_reader *pfile, _cpp_file *file, int,
  location_t);
@@ -1041,7 +1042,7 @@ _cpp_mark_file_once_only (cpp_reader *pfile, _cpp_file *file)
nothing left in the path, returns NULL.  */
 static struct cpp_dir *
 search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
- enum include_type type)
+ enum include_type type, bool suppress_diagnostic)
 {
   cpp_dir *dir;
   _cpp_file *file;
@@ -1070,7 +1071,7 @@ search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
 return make_cpp_dir (pfile, dir_name_of_file (file),
 pfile->buffer ? pfile->buffer->sysp : 0);
 
-  if (dir == NULL)
+  if (dir == NULL && !suppress_diagnostic)
 cpp_error (pfile, CPP_DL_ERROR,
   "no include path in which to search for %s", fname);
 
@@ -2164,7 +2165,10 @@ bool
 _cpp_has_header (cpp_reader *pfile, const char *fname, int angle_brackets,
 enum include_type type)
 {
-  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type);
+  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type,
+/* suppress_diagnostic = */ true);
+  if (!start_dir)
+return false;
   _cpp_file *file = _cpp_find_file (pfile, fname, start_dir, angle_brackets,
_cpp_FFK_HAS_INCLUDE, 0);
   return file->err_no != ENOENT;
diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h b/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
new file mode 100644
index 000..1e4be6ce7a3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
@@ -0,0 +1,3 @@
+#if __has_include_next()
+/* This formerly led to an ICE when the current directory was the last one in the path.  */
+#endif
diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c b/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
new file mode 100644
index 000..4928d3e992c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
@@ -0,0 +1,4 @@
+/* PR preprocessor/80755 */
+/* { dg-do preprocess } */
+/* { dg-additional-options "-idirafter $srcdir/c-c++-common/cpp/has-include-next-2-dir" } */
+#include 
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index 

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2023-12-20 Thread Lewis Hyatt
Hello-

May I please ping this PCH patch? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

-Lewis

On Tue, Dec 5, 2023 at 8:52 PM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
>
> There are two related issues here really, a regression since GCC 11 where we
> can ICE after restoring a PCH, and a deeper issue with bogus locations
> assigned to macros that were defined prior to restoring a PCH.  This patch
> fixes the ICE regression with a simple change, and I think it's appropriate
> for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> not generally causing an ICE, and mostly affecting only the output of
> -Wunused-macros) are not as problematic, and will be harder to fix. I could
> take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> tests for the wrong locations (as well as passing tests for the regression
> fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> Linux. Thanks!
>
> -Lewis
>
> -- >8 --
>
> Users are allowed to define macros prior to restoring a precompiled header
> file, as long as those macros are not defined (or are defined identically)
> in the PCH.  However, the PCH restoration process destroys all the macro
> definitions, so libcpp has to record them before restoring the PCH and then
> redefine them afterward.
>
> This process does not currently assign great locations to the macros after
> redefining them. Some work is needed to also remember the original locations
> and get the line_maps instance in the right state (since, like all other
> data structures, the line_maps instance is also reset after restoring a PCH).
> The new testcase line-map-3.C contains XFAILed examples where the locations
> are wrong.
>
> This patch addresses a more pressing issue, which is that we ICE in some
> cases since GCC 11, hitting an assert in line-maps.cc. It happens if the
> first line encountered after the PCH restore requires an LC_RENAME map, such
> as will happen if the line is sufficiently long.  This is much easier to
> fix, since we just need to call linemap_line_start before asking libcpp to
> redefine the stored macros, instead of afterward, to avoid the unexpected
> need for an LC_RENAME before an LC_ENTER has been seen.
>
> gcc/c-family/ChangeLog:
>
> PR preprocessor/105608
> * c-pch.cc (c_common_read_pch): Start a new line map before asking
> libcpp to restore macros defined prior to reading the PCH, instead
> of afterward.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/105608
> * g++.dg/pch/line-map-1.C: New test.
> * g++.dg/pch/line-map-1.Hs: New test.
> * g++.dg/pch/line-map-2.C: New test.
> * g++.dg/pch/line-map-2.Hs: New test.
> * g++.dg/pch/line-map-3.C: New test.
> * g++.dg/pch/line-map-3.Hs: New test.
> ---
>  gcc/c-family/c-pch.cc  |  5 ++---
>  gcc/testsuite/g++.dg/pch/line-map-1.C  |  4 
>  gcc/testsuite/g++.dg/pch/line-map-1.Hs |  1 +
>  gcc/testsuite/g++.dg/pch/line-map-2.C  |  6 ++
>  gcc/testsuite/g++.dg/pch/line-map-2.Hs |  1 +
>  gcc/testsuite/g++.dg/pch/line-map-3.C  | 23 +++
>  gcc/testsuite/g++.dg/pch/line-map-3.Hs |  1 +
>  7 files changed, 38 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.Hs
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.Hs
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.Hs
>
> diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
> index 2f014fca210..9ee6f179002 100644
> --- a/gcc/c-family/c-pch.cc
> +++ b/gcc/c-family/c-pch.cc
> @@ -342,6 +342,8 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
>gt_pch_restore (f);
>cpp_set_line_map (pfile, line_table);
>rebuild_location_adhoc_htab (line_table);
> +  line_table->trace_includes = saved_trace_includes;
> +  linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
>
>timevar_push (TV_PCH_CPP_RESTORE);
>if (cpp_read_state (pfile, name, f, smd) != 0)
> @@ -355,9 +357,6 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
>
>fclose (f);
>
> -  line_table->trace_includes = saved_trace_includes;
> -  linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
> -
>/* Give the front end a chance to take action after a PCH file has
>   been loaded.  */
>if (lang_post_pch_load)
> diff --git a/gcc/testsuite/g++.dg/pch/line-

ping^2: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2023-12-19 Thread Lewis Hyatt
Hello-

May I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

-Lewis

On Wed, Nov 29, 2023 at 7:05 PM Lewis Hyatt  wrote:
>
> On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> >
> > This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
> > error diagnostics such as -Wnarrowing (in C++11). Those currently do not
> > return to the correct state after the last pop; they become effectively
> > simple warnings instead. Bootstrap + regtest all languages on x86-64, does
> > it look OK please? Thanks!
>
> Hello-
>
> May I please ping this bug fix?
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html
>
> Please note, it requires a trivial rebase on top of recent changes to
> the class diagnostic_context public interface. I attached the rebased patch
> here as well. Thanks!
>
> -Lewis


[PATCH] libcpp: Fix macro expansion for argument of __has_include [PR110558]

2023-12-12 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558

This is a small fix for the libcpp issue noted in the PR. Bootstrap +
regtest all languages on x86-64 Linux. Is it ok for trunk please?

Also, it's not a regression, having never worked since __has_include was
introduced in GCC 5, but FWIW the fix would backport fine to all branches
since then... so I think backport to 11,12,13 would make sense assuming the
patch is OK. Thanks!

-Lewis

-- >8 --

When the file name for a #include directive is the result of stringifying a
macro argument, libcpp needs to take some care to get the whitespace
correct; in particular stringify_arg() needs to see a CPP_PADDING token
between macro tokens so that it can figure out when to output space between
tokens. The CPP_PADDING tokens are not normally generated when handling a
preprocessor directive, but for #include-like directives, libcpp sets the
state variable pfile->state.directive_wants_padding to TRUE so that the
CPP_PADDING tokens will be output, and then everything works fine for
computed includes.

As the PR points out, things do not work fine for __has_include. Fix that by
setting the state variable the same as is done for #include.
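
For illustration only (not part of the patch): a minimal C sketch of the
stringification at issue, reusing the STRINGIZE and GET_INCLUDE macros from
the new test. The helper spellings_agree() is a hypothetical name. Outside a
directive the padding tokens are always generated, so with GCC all three
spellings should yield the same string; the patch aims to make
__has_include see the same result.

```c
#include <string.h>

/* Same helper macros as the has-include-2.c test in this patch.  */
#define STRINGIZE(x) #x
#define GET_INCLUDE(i) STRINGIZE(has-include-i.h)

/* Hypothetical helper: returns 1 if the three spellings of the macro
   argument (with and without surrounding spaces) all stringify to the
   same file name, and 0 otherwise.  */
static int spellings_agree(void)
{
  const char *expected = "has-include-2.h";
  return strcmp(GET_INCLUDE(2), expected) == 0
         && strcmp(GET_INCLUDE( 2), expected) == 0
         && strcmp(GET_INCLUDE( 2 ), expected) == 0;
}
```

The point of the sketch is only that whitespace around the macro argument
must not leak into the stringified file name.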

libcpp/ChangeLog:

PR preprocessor/110558
* macro.cc (builtin_has_include): Set
pfile->state.directive_wants_padding prior to lexing the
file name, in case it comes from macro expansion.

gcc/testsuite/ChangeLog:

PR preprocessor/110558
* c-c++-common/cpp/has-include-2.c: New test.
* c-c++-common/cpp/has-include-2.h: New test.
---
 libcpp/macro.cc|  3 +++
 gcc/testsuite/c-c++-common/cpp/has-include-2.c | 12 
 gcc/testsuite/c-c++-common/cpp/has-include-2.h |  1 +
 3 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-2.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-2.h

diff --git a/libcpp/macro.cc b/libcpp/macro.cc
index 6f24a9d6f3a..15140c60023 100644
--- a/libcpp/macro.cc
+++ b/libcpp/macro.cc
@@ -398,6 +398,8 @@ builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next)
   NODE_NAME (op));
 
   pfile->state.angled_headers = true;
+  const auto sav_padding = pfile->state.directive_wants_padding;
+  pfile->state.directive_wants_padding = true;
   const cpp_token *token = cpp_get_token_no_padding (pfile);
   bool paren = token->type == CPP_OPEN_PAREN;
   if (paren)
@@ -406,6 +408,7 @@ builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next)
 cpp_error (pfile, CPP_DL_ERROR,
   "missing '(' before \"%s\" operand", NODE_NAME (op));
   pfile->state.angled_headers = false;
+  pfile->state.directive_wants_padding = sav_padding;
 
   bool bracket = token->type != CPP_STRING;
   char *fname = NULL;
diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-2.c b/gcc/testsuite/c-c++-common/cpp/has-include-2.c
new file mode 100644
index 000..5cd00cb3fb5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-2.c
@@ -0,0 +1,12 @@
+/* PR preprocessor/110558 */
+/* { dg-do preprocess } */
+#define STRINGIZE(x) #x
+#define GET_INCLUDE(i) STRINGIZE(has-include-i.h)
+/* Spaces surrounding the macro args previously caused a problem for __has_include().  */
+#if __has_include(GET_INCLUDE(2)) && __has_include(GET_INCLUDE( 2)) && __has_include(GET_INCLUDE( 2 ))
+#include GET_INCLUDE(2)
+#include GET_INCLUDE( 2)
+#include GET_INCLUDE( 2 )
+#else
+#error "__has_include did not handle padding properly" /* { dg-bogus "__has_include" } */
+#endif
diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-2.h b/gcc/testsuite/c-c++-common/cpp/has-include-2.h
new file mode 100644
index 000..57c402b32a8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-2.h
@@ -0,0 +1 @@
+/* PR preprocessor/110558 */


[PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2023-12-05 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

There are two related issues here really, a regression since GCC 11 where we
can ICE after restoring a PCH, and a deeper issue with bogus locations
assigned to macros that were defined prior to restoring a PCH.  This patch
fixes the ICE regression with a simple change, and I think it's appropriate
for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
not generally causing an ICE, and mostly affecting only the output of
-Wunused-macros) are not as problematic, and will be harder to fix. I could
take a stab at that for GCC 15. In the meantime the patch adds XFAILed
tests for the wrong locations (as well as passing tests for the regression
fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
Linux. Thanks!

-Lewis

-- >8 --

Users are allowed to define macros prior to restoring a precompiled header
file, as long as those macros are not defined (or are defined identically)
in the PCH.  However, the PCH restoration process destroys all the macro
definitions, so libcpp has to record them before restoring the PCH and then
redefine them afterward.

This process does not currently assign great locations to the macros after
redefining them. Some work is needed to also remember the original locations
and get the line_maps instance in the right state (since, like all other
data structures, the line_maps instance is also reset after restoring a PCH).
The new testcase line-map-3.C contains XFAILed examples where the locations
are wrong.

This patch addresses a more pressing issue, which is that we ICE in some
cases since GCC 11, hitting an assert in line-maps.cc. It happens if the
first line encountered after the PCH restore requires an LC_RENAME map, such
as will happen if the line is sufficiently long.  This is much easier to
fix, since we just need to call linemap_line_start before asking libcpp to
redefine the stored macros, instead of afterward, to avoid the unexpected
need for an LC_RENAME before an LC_ENTER has been seen.

gcc/c-family/ChangeLog:

PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Start a new line map before asking
libcpp to restore macros defined prior to reading the PCH, instead
of afterward.

gcc/testsuite/ChangeLog:

PR preprocessor/105608
* g++.dg/pch/line-map-1.C: New test.
* g++.dg/pch/line-map-1.Hs: New test.
* g++.dg/pch/line-map-2.C: New test.
* g++.dg/pch/line-map-2.Hs: New test.
* g++.dg/pch/line-map-3.C: New test.
* g++.dg/pch/line-map-3.Hs: New test.
---
 gcc/c-family/c-pch.cc  |  5 ++---
 gcc/testsuite/g++.dg/pch/line-map-1.C  |  4 
 gcc/testsuite/g++.dg/pch/line-map-1.Hs |  1 +
 gcc/testsuite/g++.dg/pch/line-map-2.C  |  6 ++
 gcc/testsuite/g++.dg/pch/line-map-2.Hs |  1 +
 gcc/testsuite/g++.dg/pch/line-map-3.C  | 23 +++
 gcc/testsuite/g++.dg/pch/line-map-3.Hs |  1 +
 7 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.C
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-1.Hs
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.C
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-2.Hs
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.C
 create mode 100644 gcc/testsuite/g++.dg/pch/line-map-3.Hs

diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
index 2f014fca210..9ee6f179002 100644
--- a/gcc/c-family/c-pch.cc
+++ b/gcc/c-family/c-pch.cc
@@ -342,6 +342,8 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   gt_pch_restore (f);
   cpp_set_line_map (pfile, line_table);
   rebuild_location_adhoc_htab (line_table);
+  line_table->trace_includes = saved_trace_includes;
+  linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
 
   timevar_push (TV_PCH_CPP_RESTORE);
   if (cpp_read_state (pfile, name, f, smd) != 0)
@@ -355,9 +357,6 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
 
   fclose (f);
 
-  line_table->trace_includes = saved_trace_includes;
-  linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
-
   /* Give the front end a chance to take action after a PCH file has
  been loaded.  */
   if (lang_post_pch_load)
diff --git a/gcc/testsuite/g++.dg/pch/line-map-1.C b/gcc/testsuite/g++.dg/pch/line-map-1.C
new file mode 100644
index 000..9d1ac6d1683
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/line-map-1.C
@@ -0,0 +1,4 @@
+/* PR preprocessor/105608 */
+/* { dg-do compile } */
+#define MACRO_ON_A_LONG_LINE "this line is long enough that it forces the line table to create an LC_RENAME map, which formerly triggered an ICE after PCH restore"
+#include "line-map-1.H"
diff --git a/gcc/testsuite/g++.dg/pch/line-map-1.Hs b/gcc/testsuite/g++.dg/pch/line-map-1.Hs
new file mode 100644
index 000..3b6178bfae0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/line-map-1.Hs
@@ -0,0 +1 @@
+/* This space intentionally 

Re: [PATCH] preprocessor: Reinitialize frontend parser after loading a PCH [PR112319]

2023-11-30 Thread Lewis Hyatt
On Thu, Nov 30, 2023 at 4:19 PM Marek Polacek  wrote:
>
> On Wed, Nov 01, 2023 at 05:54:57PM -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112319
> >
> > This is a one-line patch to fix the GCC 14 regression noted in the
> > PR. Bootstrap + regtest all languages on x86-64 looks good. Is it OK please?
> > Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > Since r14-2893, the frontend parser object needs to exist when running in
> > preprocess-only mode, because pragma_lex() is now called in that mode and
> > needs to make use of it. This is handled by calling c_init_preprocess() at
> > startup. If -fpch-preprocess is in effect (commonly, because of
> > -save-temps), a PCH file may be loaded during preprocessing, in which
> > case the parser will be destroyed, causing the issue noted in the
> > PR. Resolve it by reinitializing the frontend parser after loading the PCH.
> >
> > gcc/c-family/ChangeLog:
> >
> >   PR pch/112319
> >   * c-ppoutput.cc (cb_read_pch): Reinitialize the frontend parser
>
> "front-end"
>
> >   after loading a PCH.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR pch/112319
> >   * g++.dg/pch/pr112319.C: New test.
> >   * g++.dg/pch/pr112319.Hs: New test.
> >   * gcc.dg/pch/pr112319.c: New test.
> >   * gcc.dg/pch/pr112319.hs: New test.
> > ---
> >  gcc/c-family/c-ppoutput.cc   | 5 +
> >  gcc/testsuite/g++.dg/pch/pr112319.C  | 5 +
> >  gcc/testsuite/g++.dg/pch/pr112319.Hs | 1 +
> >  gcc/testsuite/gcc.dg/pch/pr112319.c  | 5 +
> >  gcc/testsuite/gcc.dg/pch/pr112319.hs | 1 +
> >  5 files changed, 17 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.Hs
> >  create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.c
> >  create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.hs
> >
> > diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
> > index 4aa2bef2c0f..4f973767976 100644
> > --- a/gcc/c-family/c-ppoutput.cc
> > +++ b/gcc/c-family/c-ppoutput.cc
> > @@ -862,4 +862,9 @@ cb_read_pch (cpp_reader *pfile, const char *name,
> >
> >fprintf (print.outf, "#pragma GCC pch_preprocess \"%s\"\n", name);
> >print.src_line++;
> > +
> > +  /* The process of reading the PCH has destroyed the frontend parser,
>
> "front-end"
>
> > + so ask the frontend to reinitialize it, in case we need it to
>
> "front end"
>
> (sorry to be overly pedantic...)
>
> Patch looks fine to me; please go ahead if you haven't pushed it already.
>

Thanks for the review! I did push it a few days ago as Jakub approved
it... I will spell front end correctly next time :).

-Lewis


ping: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2023-11-29 Thread Lewis Hyatt
On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> 
> This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
> error diagnostics such as -Wnarrowing (in C++11). Those currently do not
> return to the correct state after the last pop; they become effectively
> simple warnings instead. Bootstrap + regtest all languages on x86-64, does
> it look OK please? Thanks!

Hello-

May I please ping this bug fix?
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html

Please note, it requires a trivial rebase on top of recent changes to
the class diagnostic_context public interface. I attached the rebased patch
here as well. Thanks!

-Lewis

When a diagnostic pragma changes the classification of a given diagnostic,
the global options flags (such as warn_narrowing, etc.) may get changed too.
Specifically, if a warning was not enabled initially and was later enabled
by a pragma, then the corresponding global flag will change from false to
true when the pragma is processed. That change is permanent and is not
undone by a subsequent `#pragma GCC diagnostic pop'; the warning flag needs
to remain enabled since a diagnostic could be generated later on for a
source location prior to the pop.

So in order to support popping to the initial classification, given that the
global options flags no longer reflect that state, the diagnostic_context
object itself remembers the way things were before it changed anything. The
current implementation works fine for diagnostics that are always errors or
always warnings, but it doesn't do the right thing for diagnostics that
could be either, such as -Wnarrowing. The classification of that diagnostic
(or any permerror diagnostic) depends on the state of -fpermissive; for the
particular case of -Wnarrowing it also matters whether a compile-time or
run-time narrowing is being diagnosed.

The problem is that the current implementation insists on recording whether
an enabled diagnostic should be a DK_WARNING or a DK_ERROR, and then, after
popping to the initial state, it overrides it always to that type only. Fix
that up by adding a new internal diagnostic type DK_ANY. This just indicates
that the diagnostic is enabled without mandating exactly what type of
diagnostic it should be. Then the diagnostic can be emitted with whatever
type the frontend asks for.
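
To make the before/after behavior concrete, here is a toy model in C. It is
a simplified sketch and not GCC code: the enum and the three functions are
hypothetical names that only mirror the idea of the patch, namely that the
saved initial state should record "enabled" rather than a fixed kind.

```c
/* Hypothetical toy model of the saved classification; not GCC code.  */
enum kind { K_IGNORED, K_WARNING, K_ERROR, K_ANY };

/* Old behavior: the saved initial state is pinned to a single kind.  */
static enum kind saved_initial_old(int enabled, int warnings_are_errors)
{
  if (!enabled)
    return K_IGNORED;
  return warnings_are_errors ? K_ERROR : K_WARNING;
}

/* New behavior: only remember that the diagnostic was enabled.  */
static enum kind saved_initial_new(int enabled)
{
  return enabled ? K_ANY : K_IGNORED;
}

/* Kind actually emitted after popping back to the saved initial state,
   given the kind the front end requested for this diagnostic.  */
static enum kind emitted_after_pop(enum kind saved, enum kind requested)
{
  if (saved == K_ANY)
    return requested;   /* Leave the front end's choice alone.  */
  return saved;         /* Override with the pinned kind.  */
}
```

In this model, a permerror requested as K_ERROR comes out as K_WARNING under
the old scheme (when warnings are not errors) and stays K_ERROR under the new
one, which matches the symptom described in the PR.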

Incidentally, while making this change, I noticed that classify_diagnostic()
spends some time computing a return value (the old classification kind) that
is not used anywhere. The computed value seems to have some problems, mainly
that it does not take into account `#pragma GCC diagnostic pop' at all, and
so the returned value doesn't seem like it could make sense in many
contexts. Given it would also not be desirable to leak the new internal-only
DK_ANY type to outside callers, I think it would make sense in a subsequent
cleanup patch to remove the return value altogether.

gcc/ChangeLog:

PR c++/111918
* diagnostic-core.h (enum diagnostic_t): Add DK_ANY special flag.
* diagnostic.cc (diagnostic_option_classifier::classify_diagnostic):
Make use of DK_ANY to indicate a diagnostic was initially enabled.
(diagnostic_context::diagnostic_enabled): Do not change the type of
a diagnostic if the saved classification is type DK_ANY.

gcc/testsuite/ChangeLog:

PR c++/111918
* g++.dg/cpp0x/Wnarrowing21a.C: New test.
* g++.dg/cpp0x/Wnarrowing21b.C: New test.
* g++.dg/cpp0x/Wnarrowing21c.C: New test.
* g++.dg/cpp0x/Wnarrowing21d.C: New test.

diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 04eba3d140e..4926c48da96 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -33,7 +33,10 @@ typedef enum
   DK_LAST_DIAGNOSTIC_KIND,
   /* This is used for tagging pragma pops in the diagnostic
  classification history chain.  */
-  DK_POP
+  DK_POP,
+  /* This is used internally to note that a diagnostic is enabled
+ without mandating any specific type.  */
+  DK_ANY,
 } diagnostic_t;
 
 /* RAII-style class for grouping related diagnostics.  */
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 4f66fa6acaa..fd40018a734 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1136,8 +1136,7 @@ classify_diagnostic (const diagnostic_context *context,
   if (old_kind == DK_UNSPECIFIED)
{
  old_kind = !context->option_enabled_p (option_index)
-   ? DK_IGNORED : (context->warning_as_error_requested_p ()
-   ? DK_ERROR : DK_WARNING);
+   ? DK_IGNORED : DK_ANY;
  m_classify_diagnostic[option_index] = old_kind;
}
 
@@ -1472,7 +1471,15 @@ diagnostic_context::diagnostic_enabled (diagnostic_info *diagnostic)
  option.  */
   if (diag_class == DK_UNSPECIFIED
   && !option_unspecified_p (diagnostic-&g

[PATCH] libcpp: Fix unsigned promotion for unevaluated divide by zero [PR112701]

2023-11-27 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701

Here is a one-line fix to an edge case in libcpp's expression evaluator
noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it OK
please? Thanks!

-Lewis

-- >8 --

When libcpp encounters a divide by zero while processing a constant
expression "x/y", it returns "x" as a fallback. The value of the fallback is
not normally important, since an error will be generated anyway, but if the
expression appears in an unevaluated context, such as "0 ? 0/0u : -1", then
there will be no error, and the fallback value will be meaningful to the
extent that it may cause promotion from signed to unsigned of an operand
encountered later. As the PR notes, libcpp does not do the unsigned
promotion correctly in this case; fix it by making the fallback return value
unsigned as necessary.
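
As a point of reference, the promotion rule can be seen without any division
at all. The sketch below is illustrative only; the macro and function names
are made up for the example. It uses 1u in the skipped arm, which is the
result the fixed libcpp is meant to match when the skipped arm is 0/0u.

```c
/* In #if arithmetic, if either the second or third operand of ?: is
   unsigned, the result is unsigned, even when that operand is skipped.
   So -1 is converted to a large unsigned value and is not < 0.  */
#if (0 ? 1u : -1) < 0
# define MIXED_TERNARY_IS_NEGATIVE 1
#else
# define MIXED_TERNARY_IS_NEGATIVE 0
#endif

/* With two signed operands, -1 stays negative.  */
#if (0 ? 1 : -1) < 0
# define SIGNED_TERNARY_IS_NEGATIVE 1
#else
# define SIGNED_TERNARY_IS_NEGATIVE 0
#endif

/* Hypothetical accessors so the results can be checked at run time.  */
static int mixed_ternary_is_negative(void)
{
  return MIXED_TERNARY_IS_NEGATIVE;
}

static int signed_ternary_is_negative(void)
{
  return SIGNED_TERNARY_IS_NEGATIVE;
}
```

With -Wall, GCC may additionally warn that the right operand of ":" changes
sign in the first #if, as the updated expr.c test expects.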

libcpp/ChangeLog:

PR preprocessor/112701
* expr.cc (num_div_op): Set unsignedp appropriately when returning a
stub value for divide by 0.

gcc/testsuite/ChangeLog:

PR preprocessor/112701
* gcc.dg/cpp/expr.c: Add additional tests to cover divide by 0 in an
unevaluated context, where the unsignedness still matters.
---
 libcpp/expr.cc  |  1 +
 gcc/testsuite/gcc.dg/cpp/expr.c | 22 --
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/libcpp/expr.cc b/libcpp/expr.cc
index 825d2c2369d..4f4a9722ac7 100644
--- a/libcpp/expr.cc
+++ b/libcpp/expr.cc
@@ -2216,6 +2216,7 @@ num_div_op (cpp_reader *pfile, cpp_num lhs, cpp_num rhs, enum cpp_ttype op,
   if (!pfile->state.skip_eval)
cpp_error_with_line (pfile, CPP_DL_ERROR, location, 0,
 "division by zero in #if");
+  lhs.unsignedp = unsignedp;
   return lhs;
 }
 
diff --git a/gcc/testsuite/gcc.dg/cpp/expr.c b/gcc/testsuite/gcc.dg/cpp/expr.c
index 532bd681237..055e17ae753 100644
--- a/gcc/testsuite/gcc.dg/cpp/expr.c
+++ b/gcc/testsuite/gcc.dg/cpp/expr.c
@@ -1,6 +1,7 @@
 /* Copyright (C) 2000, 2001 Free Software Foundation, Inc.  */
 
 /* { dg-do preprocess } */
+/* { dg-additional-options "-Wall" } */
 
 /* Test we get signedness of ?: operator correct.  We would skip
evaluation of one argument, and might therefore not transfer its
@@ -8,10 +9,27 @@
 
 /* Neil Booth, 19 Jul 2002.  */
 
-#if (1 ? -2: 0 + 1U) < 0
+#if (1 ? -2: 0 + 1U) < 0 /* { dg-warning {the left operand of ":" changes sign} } */
 #error /* { dg-bogus "error" } */
 #endif
 
-#if (0 ? 0 + 1U: -2) < 0
+#if (0 ? 0 + 1U: -2) < 0 /* { dg-warning {the right operand of ":" changes sign} } */
 #error /* { dg-bogus "error" } */
 #endif
+
+/* PR preprocessor/112701 */
+#if (0 ? 0/0u : -1) < 0 /* { dg-warning {the right operand of ":" changes sign} } */
+#error /* { dg-bogus "error" } */
+#endif
+
+#if (0 ? 0u/0 : -1) < 0 /* { dg-warning {the right operand of ":" changes sign} } */
+#error /* { dg-bogus "error" } */
+#endif
+
+#if (1 ? -1 : 0/0u) < 0 /* { dg-warning {the left operand of ":" changes sign} } */
+#error /* { dg-bogus "error" } */
+#endif
+
+#if (1 ? -1 : 0u/0) < 0 /* { dg-warning {the left operand of ":" changes sign} } */
+#error /* { dg-bogus "error" } */
+#endif


[PATCH] Makefile.tpl: Avoid race condition in generating site.exp from the top level

2023-11-17 Thread Lewis Hyatt
Hello-

I often find it convenient to run a new c-c++-common test from the
main build dir like:

$ make -j 2 RUNTESTFLAGS=dg.exp=new-test.c check-gcc-{c,c++}

I noticed that sometimes this produces a corrupted site.exp and then no
tests work until it is remade manually. To avoid the issue, it is necessary
to do "cd gcc; make site.exp" before running a parallel make from the top
level directory. The below patch fixes it by just making that dependency on
site.exp explicit in the top level Makefile. Is it OK please? Thanks...

-Lewis

-- >8 --

A command like "make -j 2 check-gcc-c check-gcc-c++" run in the top level of
a fresh build directory does not work reliably. That will spawn two
independent make processes inside the "gcc" directory, and each of those
will attempt to create site.exp if it doesn't exist, so they can interfere with
each other, often producing a corrupted or empty site.exp. Resolve that by
making these targets depend on a new phony target which makes sure site.exp
is created first before starting the recursive makes.

ChangeLog:

* Makefile.in: Regenerate.
* Makefile.tpl: Add dependency on site.exp to check-gcc-* targets.
---
 Makefile.in  | 30 +++---
 Makefile.tpl | 10 +-
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/Makefile.tpl b/Makefile.tpl
index 8b7783bb4f1..6e22adecd2f 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -1639,9 +1639,17 @@ cross: all-build all-gas all-ld
 @endif gcc-no-bootstrap
 
 @if gcc
+
+.PHONY: gcc-site.exp
+gcc-site.exp:
+   r=`${PWD_COMMAND}`; export r; \
+   s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
+   $(HOST_EXPORTS) \
+   (cd gcc && $(MAKE) $(GCC_FLAGS_TO_PASS) site.exp);
+
 [+ FOR languages +]
 .PHONY: check-gcc-[+language+] check-[+language+]
-check-gcc-[+language+]:
+check-gcc-[+language+]: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
diff --git a/Makefile.in b/Makefile.in
index b65ab4953bc..da2344b3f3d 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -62200,8 +62200,16 @@ cross: all-build all-gas all-ld
 
 @if gcc
 
+.PHONY: gcc-site.exp
+gcc-site.exp:
+   r=`${PWD_COMMAND}`; export r; \
+   s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
+   $(HOST_EXPORTS) \
+   (cd gcc && $(MAKE) $(GCC_FLAGS_TO_PASS) site.exp);
+
+
 .PHONY: check-gcc-c check-c
-check-gcc-c:
+check-gcc-c: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62209,7 +62217,7 @@ check-gcc-c:
 check-c: check-gcc-c
 
 .PHONY: check-gcc-c++ check-c++
-check-gcc-c++:
+check-gcc-c++: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62217,7 +62225,7 @@ check-gcc-c++:
 check-c++: check-gcc-c++ check-target-libstdc++-v3 check-target-libitm-c++ check-target-libgomp-c++
 
 .PHONY: check-gcc-fortran check-fortran
-check-gcc-fortran:
+check-gcc-fortran: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62225,7 +62233,7 @@ check-gcc-fortran:
 check-fortran: check-gcc-fortran check-target-libquadmath check-target-libgfortran check-target-libgomp-fortran
 
 .PHONY: check-gcc-ada check-ada
-check-gcc-ada:
+check-gcc-ada: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62233,7 +62241,7 @@ check-gcc-ada:
 check-ada: check-gcc-ada check-target-libada
 
 .PHONY: check-gcc-objc check-objc
-check-gcc-objc:
+check-gcc-objc: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62241,7 +62249,7 @@ check-gcc-objc:
 check-objc: check-gcc-objc check-target-libobjc
 
 .PHONY: check-gcc-obj-c++ check-obj-c++
-check-gcc-obj-c++:
+check-gcc-obj-c++: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62249,7 +62257,7 @@ check-gcc-obj-c++:
 check-obj-c++: check-gcc-obj-c++
 
 .PHONY: check-gcc-go check-go
-check-gcc-go:
+check-gcc-go: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62257,7 +62265,7 @@ check-gcc-go:
 check-go: check-gcc-go check-target-libgo check-gotools
 
 .PHONY: check-gcc-m2 check-m2
-check-gcc-m2:
+check-gcc-m2: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62265,7 +62273,7 @@ check-gcc-m2:
 check-m2: check-gcc-m2 check-target-libgm2
 
 .PHONY: check-gcc-d check-d
-check-gcc-d:
+check-gcc-d: gcc-site.exp
r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS) \
@@ -62273,7 

ping: [PATCH] preprocessor: Reinitialize frontend parser after loading a PCH [PR112319]

2023-11-17 Thread Lewis Hyatt
May I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/634931.html

On Wed, Nov 1, 2023 at 5:55 PM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112319
>
> This is a one-line patch to fix the GCC 14 regression noted in the
> PR. Bootstrap + regtest all languages on x86-64 looks good. Is it OK please?
> Thanks!
>
> -Lewis
>
> -- >8 --
>
> Since r14-2893, the frontend parser object needs to exist when running in
> preprocess-only mode, because pragma_lex() is now called in that mode and
> needs to make use of it. This is handled by calling c_init_preprocess() at
> startup. If -fpch-preprocess is in effect (commonly, because of
> -save-temps), a PCH file may be loaded during preprocessing, in which
> case the parser will be destroyed, causing the issue noted in the
> PR. Resolve it by reinitializing the frontend parser after loading the PCH.
>
> gcc/c-family/ChangeLog:
>
> PR pch/112319
> * c-ppoutput.cc (cb_read_pch): Reinitialize the frontend parser
> after loading a PCH.
>
> gcc/testsuite/ChangeLog:
>
> PR pch/112319
> * g++.dg/pch/pr112319.C: New test.
> * g++.dg/pch/pr112319.Hs: New test.
> * gcc.dg/pch/pr112319.c: New test.
> * gcc.dg/pch/pr112319.hs: New test.
> ---
>  gcc/c-family/c-ppoutput.cc   | 5 +
>  gcc/testsuite/g++.dg/pch/pr112319.C  | 5 +
>  gcc/testsuite/g++.dg/pch/pr112319.Hs | 1 +
>  gcc/testsuite/gcc.dg/pch/pr112319.c  | 5 +
>  gcc/testsuite/gcc.dg/pch/pr112319.hs | 1 +
>  5 files changed, 17 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.Hs
>  create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.c
>  create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.hs
>
> diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
> index 4aa2bef2c0f..4f973767976 100644
> --- a/gcc/c-family/c-ppoutput.cc
> +++ b/gcc/c-family/c-ppoutput.cc
> @@ -862,4 +862,9 @@ cb_read_pch (cpp_reader *pfile, const char *name,
>
>fprintf (print.outf, "#pragma GCC pch_preprocess \"%s\"\n", name);
>print.src_line++;
> +
> +  /* The process of reading the PCH has destroyed the frontend parser,
> + so ask the frontend to reinitialize it, in case we need it to
> + process any #pragma directives encountered while preprocessing.  */
> +  c_init_preprocess ();
>  }
> diff --git a/gcc/testsuite/g++.dg/pch/pr112319.C b/gcc/testsuite/g++.dg/pch/pr112319.C
> new file mode 100644
> index 000..9e0457e8aec
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/pr112319.C
> @@ -0,0 +1,5 @@
> +/* { dg-additional-options "-Wpragmas -save-temps" } */
> +#include "pr112319.H"
> +#pragma GCC diagnostic error "-Wpragmas"
> +#pragma GCC diagnostic ignored "oops" /* { dg-error "oops" } */
> +/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
> diff --git a/gcc/testsuite/g++.dg/pch/pr112319.Hs b/gcc/testsuite/g++.dg/pch/pr112319.Hs
> new file mode 100644
> index 000..3b6178bfae0
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/pr112319.Hs
> @@ -0,0 +1 @@
> +/* This space intentionally left blank.  */
> diff --git a/gcc/testsuite/gcc.dg/pch/pr112319.c b/gcc/testsuite/gcc.dg/pch/pr112319.c
> new file mode 100644
> index 000..043881463c5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pch/pr112319.c
> @@ -0,0 +1,5 @@
> +/* { dg-additional-options "-Wpragmas -save-temps" } */
> +#include "pr112319.h"
> +#pragma GCC diagnostic error "-Wpragmas"
> +#pragma GCC diagnostic ignored "oops" /* { dg-error "oops" } */
> +/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
> diff --git a/gcc/testsuite/gcc.dg/pch/pr112319.hs b/gcc/testsuite/gcc.dg/pch/pr112319.hs
> new file mode 100644
> index 000..3b6178bfae0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pch/pr112319.hs
> @@ -0,0 +1 @@
> +/* This space intentionally left blank.  */


[PATCH] c-family: Let libcpp know when the compilation is for a PCH [PR9471]

2023-11-10 Thread Lewis Hyatt
Hello-

The PR may be 20 years old, but by now it only needs a one-line fix :). Is
it OK please? Bootstrapped + regtested all languages on x86-64 Linux.
Thanks!

-Lewis

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47857

-- >8 --

libcpp will generate diagnostics when it encounters things in the main file
that only belong in a header file, such as `#pragma once' or `#pragma GCC
system_header'. But sometimes the main file is a header file that is just
being compiled separately, e.g. to produce a C++ module or a PCH, in which
case such diagnostics should be suppressed. libcpp already has an interface
to request that, so make use of it in the C frontends to prevent libcpp from
issuing unwanted diagnostics when compiling a PCH.

gcc/c-family/ChangeLog:

PR pch/9471
PR pch/47857
* c-opts.cc (c_common_post_options): Set cpp_opts->main_search
so libcpp knows it is compiling a header file separately.

gcc/testsuite/ChangeLog:

PR pch/9471
PR pch/47857
* g++.dg/pch/main-file-warnings.C: New test.
* g++.dg/pch/main-file-warnings.Hs: New test.
* gcc.dg/pch/main-file-warnings.c: New test.
* gcc.dg/pch/main-file-warnings.hs: New test.
---
 gcc/c-family/c-opts.cc | 3 +++
 gcc/testsuite/g++.dg/pch/main-file-warnings.C  | 7 +++
 gcc/testsuite/g++.dg/pch/main-file-warnings.Hs | 3 +++
 gcc/testsuite/gcc.dg/pch/main-file-warnings.c  | 7 +++
 gcc/testsuite/gcc.dg/pch/main-file-warnings.hs | 3 +++
 5 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pch/main-file-warnings.C
 create mode 100644 gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
 create mode 100644 gcc/testsuite/gcc.dg/pch/main-file-warnings.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/main-file-warnings.hs

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index fbabd1816c1..10403c03bd6 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1174,6 +1174,9 @@ c_common_post_options (const char **pfilename)
   "the %qs debug info cannot be used with "
   "pre-compiled headers",
   debug_set_names (write_symbols & ~DWARF2_DEBUG));
+ /* Let libcpp know that the main file is a header so it won't
+complain about things like #include_next and #pragma once.  */
+ cpp_opts->main_search = CMS_header;
}
   else if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
c_common_no_more_pch ();
diff --git a/gcc/testsuite/g++.dg/pch/main-file-warnings.C b/gcc/testsuite/g++.dg/pch/main-file-warnings.C
new file mode 100644
index 000..a9e8b0ba9f2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/main-file-warnings.C
@@ -0,0 +1,7 @@
+/* PR pch/9471 */
+/* PR pch/47857 */
+/* Test will fail if any warnings get issued while compiling the header into a PCH.  */
+#include "main-file-warnings.H"
+#pragma once /* { dg-warning "in main file" } */
+#pragma GCC system_header /* { dg-warning "outside include file" } */
+#include_next  /* { dg-warning "in primary source file" } */
diff --git a/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs b/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
new file mode 100644
index 000..d1582bb8290
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
@@ -0,0 +1,3 @@
+#pragma once /* { dg-bogus "in main file" } */
+#pragma GCC system_header /* { dg-bogus "outside include file" } */
+#include_next  /* { dg-bogus "in primary source file" } */
diff --git a/gcc/testsuite/gcc.dg/pch/main-file-warnings.c b/gcc/testsuite/gcc.dg/pch/main-file-warnings.c
new file mode 100644
index 000..aedbc15f7ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/main-file-warnings.c
@@ -0,0 +1,7 @@
+/* PR pch/9471 */
+/* PR pch/47857 */
+/* Test will fail if any warnings get issued while compiling the header into a PCH.  */
+#include "main-file-warnings.h"
+#pragma once /* { dg-warning "in main file" } */
+#pragma GCC system_header /* { dg-warning "outside include file" } */
+#include_next  /* { dg-warning "in primary source file" } */
diff --git a/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs b/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs
new file mode 100644
index 000..d1582bb8290
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs
@@ -0,0 +1,3 @@
+#pragma once /* { dg-bogus "in main file" } */
+#pragma GCC system_header /* { dg-bogus "outside include file" } */
+#include_next  /* { dg-bogus "in primary source file" } */


[PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2023-11-09 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918

This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
error diagnostics such as -Wnarrowing (in C++11). Those currently do not
return to the correct state after the last pop; they become effectively
simple warnings instead. Bootstrap + regtest all languages on x86-64, does
it look OK please? Thanks!

-Lewis

-- >8 --

When a diagnostic pragma changes the classification of a given diagnostic,
the global options flags (such as warn_narrowing, etc.) may get changed too.
Specifically, if a warning was not enabled initially and was later enabled
by a pragma, then the corresponding global flag will change from false to
true when the pragma is processed. That change is permanent and is not
undone by a subsequent `#pragma GCC diagnostic pop'; the warning flag needs
to remain enabled since a diagnostic could be generated later on for a
source location prior to the pop.

So in order to support popping to the initial classification, given that the
global options flags no longer reflect that state, the diagnostic_context
object itself remembers the way things were before it changed anything. The
current implementation works fine for diagnostics that are always errors or
always warnings, but it doesn't do the right thing for diagnostics that
could be either, such as -Wnarrowing. The classification of that diagnostic
(or any permerror diagnostic) depends on the state of -fpermissive; for the
particular case of -Wnarrowing it also matters whether a compile-time or
run-time narrowing is being diagnosed.

The problem is that the current implementation insists on recording whether
an enabled diagnostic should be a DK_WARNING or a DK_ERROR, and then, after
popping to the initial state, it always forces the diagnostic to that
recorded type. Fix that up by adding a new internal diagnostic type DK_ANY. This just indicates
that the diagnostic is enabled without mandating exactly what type of
diagnostic it should be. Then the diagnostic can be emitted with whatever
type the frontend asks for.
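
To make the idea concrete, here is a minimal sketch of the decision described
above. It is a toy model and not the code in diagnostic.cc: the Kind enum and
the effective_kind() helper are hypothetical names, and the real enum
diagnostic_t has many more members.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's diagnostic kinds.
enum class Kind { Ignored, Warning, Error, Any };

// Decide the kind that is actually emitted, given the classification saved
// for the option (what a pop restores) and the kind the frontend requested.
Kind effective_kind(Kind saved, Kind requested) {
  // In this model a frontend only ever asks for a warning or an error.
  assert(requested == Kind::Warning || requested == Kind::Error);
  switch (saved) {
    case Kind::Ignored:
      return Kind::Ignored;  // option disabled: suppress the diagnostic
    case Kind::Any:
      return requested;      // enabled, type left to the frontend
    default:
      return saved;          // a specific kind was recorded: it overrides
  }
}
```

In this model, recording Kind::Warning for an initially enabled permerror
turns a requested error into a warning after the pop, which is the behavior
the PR complains about; recording Kind::Any leaves the requested kind alone.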

Incidentally, while making this change, I noticed that classify_diagnostic()
spends some time computing a return value (the old classification kind) that
is not used anywhere. The computed value seems to have some problems, mainly
that it does not take into account `#pragma GCC diagnostic pop' at all, and
so the returned value doesn't seem like it could make sense in many
contexts. Given it would also not be desirable to leak the new internal-only
DK_ANY type to outside callers, I think it would make sense in a subsequent
cleanup patch to remove the return value altogether.

gcc/ChangeLog:

PR c++/111918
* diagnostic-core.h (enum diagnostic_t): Add DK_ANY special flag.
* diagnostic.cc (diagnostic_option_classifier::classify_diagnostic):
Make use of DK_ANY to indicate a diagnostic was initially enabled.
(diagnostic_context::diagnostic_enabled): Do not change the type of
a diagnostic if the saved classification is type DK_ANY.

gcc/testsuite/ChangeLog:

PR c++/111918
* g++.dg/cpp0x/Wnarrowing21a.C: New test.
* g++.dg/cpp0x/Wnarrowing21b.C: New test.
* g++.dg/cpp0x/Wnarrowing21c.C: New test.
* g++.dg/cpp0x/Wnarrowing21d.C: New test.
---
 gcc/diagnostic-core.h  |  5 -
 gcc/diagnostic.cc  | 13 ++---
 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21a.C | 14 ++
 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21b.C |  9 +
 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21c.C |  9 +
 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21d.C |  9 +
 6 files changed, 55 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21a.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21b.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21c.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/Wnarrowing21d.C

diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 04eba3d140e..4926c48da96 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -33,7 +33,10 @@ typedef enum
   DK_LAST_DIAGNOSTIC_KIND,
   /* This is used for tagging pragma pops in the diagnostic
  classification history chain.  */
-  DK_POP
+  DK_POP,
+  /* This is used internally to note that a diagnostic is enabled
+ without mandating any specific type.  */
+  DK_ANY,
 } diagnostic_t;
 
 /* RAII-style class for grouping related diagnostics.  */
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index addd6606eaa..99921a10b7b 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1126,8 +1126,7 @@ classify_diagnostic (const diagnostic_context *context,
  old_kind = !context->m_option_enabled (option_index,
 context->m_lang_mask,
 context->m_option_state)
-   ? DK_IGNORED : 

[PATCH] diagnostics: pch: Remember diagnostic pragmas in a PCH [PR64117]

2023-11-07 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117

This fixes an old PR / enhancement request. Bootstrap + regtest all
languages on x86-64 Linux. Please let me know if it looks OK? Thanks!

-Lewis

-- >8 --

As the PR points out, we do not currently record in a PCH whether any
diagnostics were reclassified by pragmas, so a header that takes care to
adjust some diagnostics for downstream users does not work as designed when
it is precompiled.  Implement that feature by adding a new interface
diagnostic_context::save_to_pch() / restore_from_pch(), which is called on
the global diagnostic context object.

This feature also exposes the need for a small tweak to the C++ frontend.
That frontend processes `#pragma GCC diagnostic push' and `#pragma GCC
diagnostic pop' pragmas twice, once while preprocessing and once again while
compiling.  In case a translation unit contains an unbalanced set of pushes
and pops, that results in twice as many leftover states as there should be.
This does not cause any problems for a single translation unit, but if the
snapshot of the state is preserved in a PCH, then it becomes observable, for
example with a setup like in the new testcase pragma-diagnostic-3.C:

t.h:

 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored...
 //no pop at end of the file

t.c

 #include "t.h"
 #pragma GCC diagnostic pop
 //expect to be at the initial state here

If t.h has been precompiled, and if the push had been processed twice at the
time the PCH was written, then the state will not be reset as expected in
t.c. Address that by having the C++ frontend reset the push/pop history
before starting its second pass.
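
The double-processing problem can be illustrated with a small toy model. This
is only a sketch under simplifying assumptions: PushPopHistory and the two
helper functions are hypothetical names, and the real classification history
in diagnostic.cc stores more than a counter of unmatched pushes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the push/pop history: one placeholder per unmatched push.
struct PushPopHistory {
  std::vector<int> saved;

  void push() { saved.push_back(0); }
  void pop() { if (!saved.empty()) saved.pop_back(); }
  void reset() { saved.clear(); }
  std::size_t depth() const { return saved.size(); }
};

// Process a header like t.h (one push, no pop) `passes` times, optionally
// resetting the history before every pass after the first.
std::size_t depth_after_header(int passes, bool reset_between_passes) {
  assert(passes >= 1);
  PushPopHistory history;
  for (int pass = 0; pass < passes; ++pass) {
    if (pass > 0 && reset_between_passes)
      history.reset();
    history.push();  // the header's "#pragma GCC diagnostic push"
  }
  return history.depth();
}

// Restore that depth (as if read back from a PCH) and apply the single pop
// from t.c; a result of 0 means we are back at the initial state.
std::size_t depth_after_user_pop(int passes, bool reset_between_passes) {
  PushPopHistory history;
  history.saved.assign(depth_after_header(passes, reset_between_passes), 0);
  history.pop();
  return history.depth();
}
```

With two passes and no reset, one stale entry survives the pop in t.c; with
the reset before the second pass, the pop returns to depth 0 as expected.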

gcc/ChangeLog:

PR pch/64117
* Makefile.in: Add tree-diagnostic.cc to GTFILES.
* diagnostic.cc (diagnostic_option_classifier::init): Handle the
classification history members, which had been omitted here.
(diagnostic_option_classifier::fini): Likewise.
(diagnostic_option_classifier::classify_diagnostic): Refactor some
logic to...
(diagnostic_context::get_original_option_classification): ...this
new function.
* diagnostic.h (struct diagnostic_option_classifier::state): Declare.
(diagnostic_option_classifier::save_state_to): Declare.
(diagnostic_option_classifier::restore_from_pch): Declare.
(diagnostic_option_classifier::get_state): Declare.
(diagnostic_option_classifier::restore_state): Declare.
(diagnostic_option_classifier::free_state): Declare.
(diagnostic_context::get_original_option_classification): Declare.
(diagnostic_context::get_classifier): New accessor functions.
(diagnostic_context::save_to_pch): Declare.
(diagnostic_context::restore_from_pch): Declare.
* ggc-common.cc (gt_pch_save): Use the new PCH interface to handle
the global diagnostic context.
(gt_pch_restore): Likewise.
* tree-diagnostic.cc (struct diagnostic_option_classifier::state):
New struct.
(struct diagnostic_pch_data): New struct.
(pch): New GC root.
(diagnostic_context::save_to_pch): New function.
(diagnostic_context::restore_from_pch): New function.
(diagnostic_option_classifier::save_state_to): New function.
(diagnostic_option_classifier::restore_from_pch): New function.
(diagnostic_option_classifier::get_state): New function.
(diagnostic_option_classifier::restore_state): New function.
(diagnostic_option_classifier::free_state): New function.

gcc/cp/ChangeLog:

PR pch/64117
* parser.cc (cp_lexer_new_main): Restore the diagnostics
classifications to their original state after the first pass through
the input.

gcc/testsuite/ChangeLog:

PR pch/64117
* g++.dg/pch/pragma-diagnostic-1.C: New test.
* g++.dg/pch/pragma-diagnostic-1.Hs: New test.
* g++.dg/pch/pragma-diagnostic-2.C: New test.
* g++.dg/pch/pragma-diagnostic-2.Hs: New test.
* g++.dg/pch/pragma-diagnostic-3.C: New test.
* g++.dg/pch/pragma-diagnostic-3.Hs: New test.
* g++.dg/pch/pragma-diagnostic-4.C: New test.
* g++.dg/pch/pragma-diagnostic-4.Hs: New test.
* g++.dg/pch/pragma-diagnostic-5.C: New test.
* g++.dg/pch/pragma-diagnostic-5.Hs: New test.
* gcc.dg/pch/pragma-diagnostic-1.c: New test.
* gcc.dg/pch/pragma-diagnostic-1.hs: New test.
* gcc.dg/pch/pragma-diagnostic-2.c: New test.
* gcc.dg/pch/pragma-diagnostic-2.hs: New test.
* gcc.dg/pch/pragma-diagnostic-3.c: New test.
* gcc.dg/pch/pragma-diagnostic-3.hs: New test.
* gcc.dg/pch/pragma-diagnostic-4.c: New test.
* gcc.dg/pch/pragma-diagnostic-4.hs: New test.
---
 gcc/Makefile.in   |   1 +
 gcc/cp/parser.cc  |  11 ++
 gcc/diagnostic.cc |  26 ++-
 

Re: [PATCH 1/2] libdiagnostics: header and examples

2023-11-07 Thread Lewis Hyatt
On Mon, Nov 6, 2023 at 8:29 PM David Malcolm  wrote:
>
> Here's a work-in-progress patch for GCC that adds a libdiagnostics.h
> header describing the public interface, along with various testcases
> that show usage examples for the API.  Various aspects of this need
> work; posting now for early feedback on overall direction.
>
> How does the interface look?
>
...
> +typedef unsigned int diagnostic_location_t;

One comment that occurred to me... for GCC we have a lot of PRs that
are unhappy about the 32-bit location_t and the consequent issues that
arise with very large source files, or with very long lines that lose
column information.
So far GCC has been able to get by with "don't do that" advice, but a
more general libdiagnostics may need to avoid that arbitrary
limitation? I feel like it may not be that long before GCC needs to
deal with it as well, perhaps with a configure option, but even now,
it could make sense for libdiagnostics to use a 64-bit location_t
itself from the outset, so it won't need to change later, even if it's
practically restricted to 32 bits for now.

-Lewis
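
As a rough back-of-the-envelope illustration of the concern above, the sketch
below checks whether a source file could give every (line, column) position
its own 32-bit identifier. This is a deliberate simplification and an
assumption of this example: it is not how GCC's line maps actually encode
location_t, and positions_needed() and fits_in_32_bits() are hypothetical
helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of distinct (line, column) positions, if each one needed its own id.
std::uint64_t positions_needed(std::uint64_t lines, std::uint64_t columns) {
  // Guard against overflow of the 64-bit product itself.
  assert(columns == 0 ||
         lines <= std::numeric_limits<std::uint64_t>::max() / columns);
  return lines * columns;
}

// True if a 32-bit unsigned id space is large enough for all positions.
bool fits_in_32_bits(std::uint64_t lines, std::uint64_t columns) {
  return positions_needed(lines, columns) <=
         std::numeric_limits<std::uint32_t>::max();
}
```

Under this simplified model, a file of one million lines with 1000 columns
each still fits, while ten million such lines do not; a 64-bit identifier
removes the limit for any realistic input.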


[PATCH] preprocessor: Reinitialize frontend parser after loading a PCH [PR112319]

2023-11-01 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112319

This is a one-line patch to fix the GCC 14 regression noted in the
PR. Bootstrap + regtest all languages on x86-64 looks good. Is it OK please?
Thanks!

-Lewis

-- >8 --

Since r14-2893, the frontend parser object needs to exist when running in
preprocess-only mode, because pragma_lex() is now called in that mode and
needs to make use of it. This is handled by calling c_init_preprocess() at
startup. If -fpch-preprocess is in effect (commonly, because of
-save-temps), a PCH file may be loaded during preprocessing, in which
case the parser will be destroyed, causing the issue noted in the
PR. Resolve it by reinitializing the frontend parser after loading the PCH.

gcc/c-family/ChangeLog:

PR pch/112319
* c-ppoutput.cc (cb_read_pch): Reinitialize the frontend parser
after loading a PCH.

gcc/testsuite/ChangeLog:

PR pch/112319
* g++.dg/pch/pr112319.C: New test.
* g++.dg/pch/pr112319.Hs: New test.
* gcc.dg/pch/pr112319.c: New test.
* gcc.dg/pch/pr112319.hs: New test.
---
 gcc/c-family/c-ppoutput.cc   | 5 +
 gcc/testsuite/g++.dg/pch/pr112319.C  | 5 +
 gcc/testsuite/g++.dg/pch/pr112319.Hs | 1 +
 gcc/testsuite/gcc.dg/pch/pr112319.c  | 5 +
 gcc/testsuite/gcc.dg/pch/pr112319.hs | 1 +
 5 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.C
 create mode 100644 gcc/testsuite/g++.dg/pch/pr112319.Hs
 create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/pr112319.hs

diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index 4aa2bef2c0f..4f973767976 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -862,4 +862,9 @@ cb_read_pch (cpp_reader *pfile, const char *name,
 
   fprintf (print.outf, "#pragma GCC pch_preprocess \"%s\"\n", name);
   print.src_line++;
+
+  /* The process of reading the PCH has destroyed the frontend parser,
+ so ask the frontend to reinitialize it, in case we need it to
+ process any #pragma directives encountered while preprocessing.  */
+  c_init_preprocess ();
 }
diff --git a/gcc/testsuite/g++.dg/pch/pr112319.C b/gcc/testsuite/g++.dg/pch/pr112319.C
new file mode 100644
index 000..9e0457e8aec
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pr112319.C
@@ -0,0 +1,5 @@
+/* { dg-additional-options "-Wpragmas -save-temps" } */
+#include "pr112319.H"
+#pragma GCC diagnostic error "-Wpragmas"
+#pragma GCC diagnostic ignored "oops" /* { dg-error "oops" } */
+/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
diff --git a/gcc/testsuite/g++.dg/pch/pr112319.Hs b/gcc/testsuite/g++.dg/pch/pr112319.Hs
new file mode 100644
index 000..3b6178bfae0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pr112319.Hs
@@ -0,0 +1 @@
+/* This space intentionally left blank.  */
diff --git a/gcc/testsuite/gcc.dg/pch/pr112319.c b/gcc/testsuite/gcc.dg/pch/pr112319.c
new file mode 100644
index 000..043881463c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pr112319.c
@@ -0,0 +1,5 @@
+/* { dg-additional-options "-Wpragmas -save-temps" } */
+#include "pr112319.h"
+#pragma GCC diagnostic error "-Wpragmas"
+#pragma GCC diagnostic ignored "oops" /* { dg-error "oops" } */
+/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
diff --git a/gcc/testsuite/gcc.dg/pch/pr112319.hs b/gcc/testsuite/gcc.dg/pch/pr112319.hs
new file mode 100644
index 000..3b6178bfae0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pr112319.hs
@@ -0,0 +1 @@
+/* This space intentionally left blank.  */


Re: [PATCH] libcpp: Improve the diagnostic for poisoned identifiers [PR36887]

2023-10-26 Thread Lewis Hyatt
On Thu, Oct 26, 2023 at 12:48 PM Christophe Lyon
 wrote:
>
> On Thu, 26 Oct 2023 at 18:18, Lewis Hyatt  wrote:
> >
> > On Thu, Oct 26, 2023 at 4:49 AM Christophe Lyon
> >  wrote:
> > > We have noticed that the new tests fail on aarch64 with:
> > > .../aarch64-unknown-linux-gnu/libc/usr/lib/crt1.o: in function `_start':
> > > .../sysdeps/aarch64/start.S:110:(.text+0x38): undefined reference to `main'
> > >
> > > Looking at the test, I'd say it lacks a dg-do compile (to avoid
> > > linking), but how does it work on other targets?
> >
> > Thanks for pointing it out. I am definitely under the impression that
> > { dg-do compile } is the default and doesn't need to be specified, I
> > have never seen it not be the case before... Is that just not correct?
> > I tried it out on the cfarm (gcc185) for aarch64-redhat-linux and it
> > works for me there too, I tried the test individually and also as part
> > of the whole check-gcc-c++ target.
> >
> > I do see that there are target-dependent functions in
> > testsuite/lib/*.exp that will change dg-do-what-default under some
> > circumstances... but I also see in dg-pch.exp (which is the one
> > relevant for this test g++.dg/pch/pr36887.C) that dg-do-what-default
> > is set to compile explicitly.
>
> Indeed, thanks for checking.
>
> > Not sure what the best next step is: should I just add { dg-do
> > compile } since it's harmless in any case, or is there something else
> > worth looking into here? I'm not sure why I couldn't reproduce the
> > issue on the compile farm machine either. Would you mind checking
> > whether adding this line fixes it for you anyway? Thanks...
>
> Can you share the compile line for this test in g++.log?
>

Sure, here is what I got on aarch64 for

make RUNTESTFLAGS=pch.exp=pr36887.C check-gcc-c++

For making the PCH:

xg++ -B/dev/shm/lhyatt/build/gcc/testsuite/g++/../../ ./pr36887.H
-fdiagnostics-plain-output -nostdinc++
-I/dev/shm/lhyatt/build/aarch64-unknown-linux-gnu/libstdc++-v3/include/aarch64-unknown-linux-gnu
-I/dev/shm/lhyatt/build/aarch64-unknown-linux-gnu/libstdc++-v3/include
-I/dev/shm/lhyatt/src/libstdc++-v3/libsupc++
-I/dev/shm/lhyatt/src/libstdc++-v3/include/backward
-I/dev/shm/lhyatt/src/libstdc++-v3/testsuite/util -fmessage-length=0
-g -o pr36887.H.gch

For compiling the test:

xg++ -B/dev/shm/lhyatt/build/gcc/testsuite/g++/../../
/dev/shm/lhyatt/src/gcc/testsuite/g++.dg/pch/pr36887.C
-fdiagnostics-plain-output -nostdinc++
-I/dev/shm/lhyatt/build/aarch64-unknown-linux-gnu/libstdc++-v3/include/aarch64-unknown-linux-gnu
-I/dev/shm/lhyatt/build/aarch64-unknown-linux-gnu/libstdc++-v3/include
-I/dev/shm/lhyatt/src/libstdc++-v3/libsupc++
-I/dev/shm/lhyatt/src/libstdc++-v3/include/backward
-I/dev/shm/lhyatt/src/libstdc++-v3/testsuite/util -fmessage-length=0
-g -I. -Dwith_PCH -S -o pr36887.s

(and then it repeats with -O2 added, or with -g removed as well)

> Actually I'm seeing several similar errors in our g++.log, not
> reported before because they were "pre-existing" failures.
> So something is confusing the testsuite and puts it into link mode.
>
> I am currently building from scratch, without our CI scripts to get
> some additional logs in a setup that probably matches yours. Then I
> should be able to add more traces at the dejagnu level to understand what's
> happening.
>
> Thanks,
>
> Christophe


Re: [PATCH] libcpp: Improve the diagnostic for poisoned identifiers [PR36887]

2023-10-26 Thread Lewis Hyatt
On Thu, Oct 26, 2023 at 4:49 AM Christophe Lyon
 wrote:
> We have noticed that the new tests fail on aarch64 with:
> .../aarch64-unknown-linux-gnu/libc/usr/lib/crt1.o: in function `_start':
> .../sysdeps/aarch64/start.S:110:(.text+0x38): undefined reference to `main'
>
> Looking at the test, I'd say it lacks a dg-do compile (to avoid
> linking), but how does it work on other targets?

Thanks for pointing it out. I am definitely under the impression that
{ dg-do compile } is the default and doesn't need to be specified, I
have never seen it not be the case before... Is that just not correct?
I tried it out on the cfarm (gcc185) for aarch64-redhat-linux and it
works for me there too, I tried the test individually and also as part
of the whole check-gcc-c++ target.

I do see that there are target-dependent functions in
testsuite/lib/*.exp that will change dg-do-what-default under some
circumstances... but I also see in dg-pch.exp (which is the one
relevant for this test g++.dg/pch/pr36887.C) that dg-do-what-default
is set to compile explicitly.

Not sure what the best next step is: should I just add { dg-do
compile } since it's harmless in any case, or is there something else
worth looking into here? I'm not sure why I couldn't reproduce the
issue on the compile farm machine either. Would you mind checking
whether adding this line fixes it for you anyway? Thanks...

-Lewis


ping: [PATCH] libcpp: Improve the diagnostic for poisoned identifiers [PR36887]

2023-10-20 Thread Lewis Hyatt
Hello-

May I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630967.html

-Lewis

On Wed, Sep 20, 2023 at 12:12 AM Lewis Hyatt  wrote:
>
> Hello-
>
> This patch implements the PR's request to add more information to the
> diagnostic issued for using a poisoned identifier. Bootstrapped + regtested
> all languages on x86-64 Linux. Does it look OK please? Thanks!
>
> -Lewis
>
> -- >8 --
>
> The PR requests an enhancement to the diagnostic issued for the use of a
> poisoned identifier. Currently, we show the location of the usage, but not
> the location which requested the poisoning, which would be helpful for the
> user if the decision to poison an identifier was made externally, such as
> in a library header.
>
> In order to output this information, we need to remember a location_t for
> each identifier that has been poisoned, and that data needs to be preserved
> as well in a PCH. One option would be to add a field to struct cpp_hashnode,
> but there is no convenient place to add it without increasing the size of
> the struct for all identifiers. Given this facility will be needed rarely,
> it seemed better to add a second hash map, which is handled PCH-wise the
> same as the current one in gcc/stringpool.cc. This hash map associates a new
> struct cpp_hashnode_extra with each identifier that needs one. Currently
> that struct only contains the new location_t, but it could be extended in
> the future if there is other ancillary data that may be convenient to put
> there for other purposes.
>
> libcpp/ChangeLog:
>
> PR preprocessor/36887
> * directives.cc (do_pragma_poison): Store in the extra hash map the
> location from which an identifier has been poisoned.
> * lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
> for the use of a poisoned identifier, also add a note indicating the
> location from which it was poisoned.
> * identifiers.cc (alloc_node): Convert to template function.
> (_cpp_init_hashtable): Handle the new extra hash map.
> (_cpp_destroy_hashtable): Likewise.
> * include/cpplib.h (struct cpp_hashnode_extra): New struct.
> (cpp_create_reader): Update prototype to...
> * init.cc (cpp_create_reader): ...accept an argument for the extra
> hash table and pass it to _cpp_init_hashtable.
> * include/symtab.h (ht_lookup): New overload for convenience.
> * internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
> (_cpp_init_hashtable): Adjust prototype.
>
> gcc/c-family/ChangeLog:
>
> PR preprocessor/36887
> * c-opts.cc (c_common_init_options): Pass new extra hash map
> argument to cpp_create_reader().
>
> gcc/ChangeLog:
>
> PR preprocessor/36887
> * toplev.h (ident_hash_extra): Declare...
> * stringpool.cc (ident_hash_extra): ...this new global variable.
> (init_stringpool): Handle ident_hash_extra as well as ident_hash.
> (ggc_mark_stringpool): Likewise.
> (ggc_purge_stringpool): Likewise.
> (struct string_pool_data_extra): New struct.
> (spd2): New GC root variable.
> (gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
> analogous to how spd is used to handle ident_hash.
> (gt_pch_restore_stringpool): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/36887
> * c-c++-common/cpp/diagnostic-poison.c: New test.
> * g++.dg/pch/pr36887.C: New test.
> * g++.dg/pch/pr36887.Hs: New test.
> ---
>  libcpp/directives.cc  |  3 ++
>  libcpp/identifiers.cc | 42 +++--
>  libcpp/include/cpplib.h   | 21 ++---
>  libcpp/include/symtab.h   |  6 +++
>  libcpp/init.cc|  4 +-
>  libcpp/internal.h |  8 +++-
>  libcpp/lex.cc | 10 -
>  gcc/c-family/c-opts.cc|  2 +-
>  gcc/stringpool.cc | 45 +++
>  gcc/toplev.h  |  3 +-
>  .../c-c++-common/cpp/diagnostic-poison.c  | 13 ++
>  gcc/testsuite/g++.dg/pch/pr36887.C|  3 ++
>  gcc/testsuite/g++.dg/pch/pr36887.Hs   |  1 +
>  13 files changed, 134 insertions(+), 27 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/diagnostic-poison.c
>  create mode 100644 gcc/testsuite/g++.dg/pch/pr36887.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/pr36887.Hs
>
> diff --git a/libcpp/directives.cc b/libcpp/directives.cc
&
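
The side-table design described in the quoted patch can be sketched roughly
as follows. This is a minimal illustration only: it uses a standard
unordered_map rather than libcpp's own hash tables, and Location, NodeExtra
and PoisonTable are hypothetical names. Keeping the first poisoning location
when an identifier is poisoned twice is an arbitrary choice of this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for location_t; 0 means "no location" here.
using Location = unsigned int;

// Rarely needed per-identifier data, kept out of the main node structure so
// that ordinary identifiers do not pay for it.
struct NodeExtra {
  Location poison_loc;
};

class PoisonTable {
 public:
  // Remember where `name` was poisoned; the first recorded location wins.
  void poison(const std::string& name, Location where) {
    assert(where != 0);
    extra_.insert({name, NodeExtra{where}});
  }

  // Location of the poisoning, or 0 if `name` was never poisoned.
  Location poisoned_at(const std::string& name) const {
    auto it = extra_.find(name);
    return it == extra_.end() ? 0 : it->second.poison_loc;
  }

 private:
  std::unordered_map<std::string, NodeExtra> extra_;
};
```

A diagnostic for a poisoned identifier could then look up poisoned_at() and,
when it is nonzero, attach a note pointing at that location.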

Re: [PATCH] c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038]

2023-10-19 Thread Lewis Hyatt
On Thu, Oct 19, 2023 at 8:43 AM Marek Polacek  wrote:
>
> On Wed, Oct 18, 2023 at 05:15:42PM -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > The PR points out that my fix for PR53431 was incomplete and did not handle
> > -Wunknown-pragmas. This is a one-line fix to correct that, is it OK for
> > trunk and for GCC 13 backport please? bootstrap + regtest all languages on
> > x86-64 Linux. Thanks!
>
> I think I can approve this, so, OK.  Thanks.
>

Great, thank you very much. Just to be safe, was that OK for the
backport as well?

-Lewis


[PATCH] c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038]

2023-10-18 Thread Lewis Hyatt
Hello-

The PR points out that my fix for PR53431 was incomplete and did not handle
-Wunknown-pragmas. This is a one-line fix to correct that. Is it OK for
trunk and for a GCC 13 backport, please? Bootstrapped + regtested all
languages on x86-64 Linux. Thanks!

-Lewis

-- >8 --

As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
the specific case of -Wunknown-pragmas, because that warning is issued
during preprocessing, but not by libcpp directly (it comes from the
cb_def_pragma callback).  Address that by handling this pragma in
addition to libcpp pragmas during the early pragma handler.
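
The effect of the changed condition can be shown with a small stand-alone
model. It is a sketch only: the Opt enum and the helper functions are
hypothetical, and in GCC the real test is made on option indices inside
handle_pragma_diagnostic_impl.

```cpp
#include <cassert>

// Hypothetical option classes: one handled by libcpp, the unknown-pragmas
// warning itself, and one that only the frontend proper cares about.
enum class Opt { FromLibcpp, UnknownPragmas, FrontendOnly };

bool is_from_cpp_diagnostics(Opt o) { return o == Opt::FromLibcpp; }

// True if the early handler returns without acting on the pragma.
bool skipped_before_fix(bool early, Opt o) {
  return early && !is_from_cpp_diagnostics(o);
}

bool skipped_after_fix(bool early, Opt o) {
  return early && !(is_from_cpp_diagnostics(o) || o == Opt::UnknownPragmas);
}

// The fix should change the answer only for UnknownPragmas in the early pass.
void check_fix_is_narrow() {
  const bool modes[] = {false, true};
  for (bool early : modes) {
    assert(skipped_before_fix(early, Opt::FromLibcpp) ==
           skipped_after_fix(early, Opt::FromLibcpp));
    assert(skipped_before_fix(early, Opt::FrontendOnly) ==
           skipped_after_fix(early, Opt::FrontendOnly));
  }
  assert(skipped_before_fix(false, Opt::UnknownPragmas) ==
         skipped_after_fix(false, Opt::UnknownPragmas));
}
```

Before the change, a diagnostic pragma naming the unknown-pragmas option is
skipped in the early pass; after it, the option is handled there alongside
the libcpp options, and nothing else changes.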

gcc/c-family/ChangeLog:

PR c++/89038
* c-pragma.cc (handle_pragma_diagnostic_impl):  Handle
-Wunknown-pragmas during early processing.

gcc/testsuite/ChangeLog:

PR c++/89038
* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
---
 gcc/c-family/c-pragma.cc|  3 ++-
 gcc/testsuite/c-c++-common/cpp/Wunknown-pragmas-1.c | 13 +
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/Wunknown-pragmas-1.c

diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc
index 293311dd4ce..98dfb0f108b 100644
--- a/gcc/c-family/c-pragma.cc
+++ b/gcc/c-family/c-pragma.cc
@@ -963,7 +963,8 @@ handle_pragma_diagnostic_impl ()
   /* option_string + 1 to skip the initial '-' */
   unsigned int option_index = find_opt (data.option_str + 1, lang_mask);
 
-  if (early && !c_option_is_from_cpp_diagnostics (option_index))
+  if (early && !(c_option_is_from_cpp_diagnostics (option_index)
+|| option_index == OPT_Wunknown_pragmas))
 return;
 
   if (option_index == OPT_SPECIAL_unknown)
diff --git a/gcc/testsuite/c-c++-common/cpp/Wunknown-pragmas-1.c b/gcc/testsuite/c-c++-common/cpp/Wunknown-pragmas-1.c
new file mode 100644
index 000..fb58739e2bc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/Wunknown-pragmas-1.c
@@ -0,0 +1,13 @@
+/* PR c++/89038 */
+/* { dg-additional-options "-Wunknown-pragmas" } */
+
+#pragma oops /* { dg-warning "-:-Wunknown-pragmas" } */
+#pragma GGC diagnostic push /* { dg-warning "-:-Wunknown-pragmas" } */
+#pragma GCC diagnostics push /* { dg-warning "-:-Wunknown-pragmas" } */
+
+/* Test we can disable the warnings.  */
+#pragma GCC diagnostic ignored "-Wunknown-pragmas"
+
+#pragma oops /* { dg-bogus "-:-Wunknown-pragmas" } */
+#pragma GGC diagnostic push /* { dg-bogus "-:-Wunknown-pragmas" } */
+#pragma GCC diagnostics push /* { dg-bogus "-:-Wunknown-pragmas" } */


Re: [PATCH] libcpp: testsuite: Add test for fixed _Pragma bug [PR82335]

2023-10-18 Thread Lewis Hyatt
May I please ping this one, and/or, is it something straightforward
enough I can just commit it as obvious? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631814.html

-Lewis

On Mon, Oct 2, 2023 at 6:23 PM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335 is another
> _Pragma-related bug that got fixed in GCC 12 but is still open. Before
> closing it out, I thought it would be good to add the testcase from that
> PR, which we don't have exactly in the testsuite already. Is it OK please?
> Thanks!
>
> -Lewis
>
> -- >8 --
>
> This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR
> that is not represented elsewhere.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/82335
> * c-c++-common/cpp/diagnostic-pragma-3.c: New test.
> ---
>  .../c-c++-common/cpp/diagnostic-pragma-3.c| 37 +++
>  1 file changed, 37 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c
> new file mode 100644
> index 000..459dcec73b3
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c
> @@ -0,0 +1,37 @@
> +/* This is like diagnostic-pragma-2.c, but handles the case where everything
> +   is wrapped inside a macro, which previously caused additional issues tracked
> +   in PR preprocessor/82335.  */
> +
> +/* { dg-do compile } */
> +/* { dg-additional-options "-save-temps -Wattributes -Wtype-limits" } */
> +
> +#define B _Pragma("GCC diagnostic push") \
> +  _Pragma("GCC diagnostic ignored \"-Wattributes\"")
> +#define E _Pragma("GCC diagnostic pop")
> +
> +#define X() B int __attribute((unknown_attr)) x; E
> +#define Y   B int __attribute((unknown_attr)) y; E
> +#define WRAP(x) x
> +
> +void test1(void)
> +{
> +  WRAP(X())
> +  WRAP(Y)
> +}
> +
> +/* Additional test provided on the PR.  */
> +#define PRAGMA(...) _Pragma(#__VA_ARGS__)
> +#define PUSH_IGN(X) PRAGMA(GCC diagnostic push) PRAGMA(GCC diagnostic ignored X)
> +#define POP() PRAGMA(GCC diagnostic pop)
> +#define TEST(X, Y) \
> +  PUSH_IGN("-Wtype-limits") \
> +  int Y = (__typeof(X))-1 < 0; \
> +  POP()
> +
> +int test2()
> +{
> +  unsigned x;
> +  TEST(x, i1);
> +  WRAP(TEST(x, i2))
> +  return i1 + i2;
> +}


ping: [PATCH] preprocessor: c++: Support `#pragma GCC target' macros [PR87299]

2023-10-13 Thread Lewis Hyatt
On Tue, Sep 12, 2023 at 04:09:21PM -0400, Lewis Hyatt wrote:
> On Tue, Aug 8, 2023 at 5:53 PM Jason Merrill  wrote:
> >
> > On 7/31/23 22:22, Lewis Hyatt via Gcc-patches wrote:
> > > `#pragma GCC target' is not currently handled in preprocess-only mode 
> > > (e.g.,
> > > when running gcc -E or gcc -save-temps). As noted in the PR, this means 
> > > that
> > > if the target pragma defines any macros, those macros are not effective in
> > > preprocess-only mode. Similarly, such macros are not effective when
> > > compiling with C++ (even when compiling without -save-temps), because C++
> > > does not process the pragma until after all tokens have been obtained from
> > > libcpp, at which point it is too late for macro expansion to take place.
> > >
> > > Since r13-1544 and r14-2893, there is a general mechanism to handle 
> > > pragmas
> > > under these conditions as well, so resolve the PR by using the new "early
> > > pragma" support.
> > >
> > > toplev.cc required some changes because the target-specific handlers for
> > > `#pragma GCC target' may call target_reinit(), and toplev.cc was not 
> > > expecting
> > > that function to be called in preprocess-only mode.
> > >
> > > I added some additional testcases from the PR for x86. The other targets
> > > that support `#pragma GCC target' (aarch64, arm, nios2, powerpc, s390)
> > > > already had tests verifying that the pragma sets macros as expected; here I
> > > have added -save-temps to some of them, to test that it now works in
> > > preprocess-only mode as well.
> > >
> > > gcc/c-family/ChangeLog:
> > >
> > >   PR preprocessor/87299
> > >   * c-pragma.cc (init_pragma): Register `#pragma GCC target' and
> > >   related pragmas in preprocess-only mode, and enable early handling.
> > >   (c_reset_target_pragmas): New function refactoring code from...
> > >   (handle_pragma_reset_options): ...here.
> > >   * c-pragma.h (c_reset_target_pragmas): Declare.
> > >
> > > gcc/cp/ChangeLog:
> > >
> > >   PR preprocessor/87299
> > >   * parser.cc (cp_lexer_new_main): Call c_reset_target_pragmas ()
> > >   after preprocessing is complete, before starting compilation.
> > >
> > > gcc/ChangeLog:
> > >
> > >   PR preprocessor/87299
> > >   * toplev.cc (no_backend): New static global.
> > >   (finalize): Remove argument no_backend, which is now a
> > >   static global.
> > >   (process_options): Likewise.
> > >   (do_compile): Likewise.
> > >   (target_reinit): Don't do anything in preprocess-only mode.
> > >   (toplev::main): Adapt to no_backend change.
> > >   (toplev::finalize): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   PR preprocessor/87299
> > >   * c-c++-common/pragma-target-1.c: New test.
> > >   * c-c++-common/pragma-target-2.c: New test.
> > >   * g++.target/i386/pr87299-1.C: New test.
> > >   * g++.target/i386/pr87299-2.C: New test.
> > >   * gcc.target/i386/pr87299-1.c: New test.
> > >   * gcc.target/i386/pr87299-2.c: New test.
> > >   * gcc.target/s390/target-attribute/tattr-2.c: Add -save-temps to the
> > >   options, to test preprocess-only mode as well.
> > >   * gcc.target/aarch64/pragma_cpp_predefs_1.c: Likewise.
> > >   * gcc.target/arm/pragma_arch_attribute.c: Likewise.
> > >   * gcc.target/nios2/custom-fp-2.c: Likewise.
> > >   * gcc.target/powerpc/float128-3.c: Likewise.
> > > ---
> > >
> > > Notes:
> > >  Hello-
> > >
> > > >  This patch fixes the PR by enabling early pragma handling for `#pragma GCC
> > > >  target' and related pragmas such as `#pragma GCC push_options'. I did not
> > > >  need to touch any target-specific code, however I did need to make a change
> > > >  to toplev.cc, affecting all targets, to make it safe to call target_reinit()
> > > >  in preprocess-only mode. (Otherwise, it would be necessary to modify the
> > > >  implementation of target pragmas in every target, to avoid this code path.)
> > >  That was the only complication I ran into.
> > >
> > >  Regarding testing, I did: (thanks to GCC compil

[PATCH] libcpp: testsuite: Add test for fixed _Pragma bug [PR82335]

2023-10-02 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335 is another
_Pragma-related bug that got fixed in GCC 12 but is still open. Before
closing it out, I thought it would be good to add the testcase from that
PR, which we don't have exactly in the testsuite already. Is it OK please?
Thanks!

-Lewis

-- >8 --

This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR
that is not represented elsewhere.

gcc/testsuite/ChangeLog:

PR preprocessor/82335
* c-c++-common/cpp/diagnostic-pragma-3.c: New test.
---
 .../c-c++-common/cpp/diagnostic-pragma-3.c| 37 +++
 1 file changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c

diff --git a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c
new file mode 100644
index 000..459dcec73b3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-3.c
@@ -0,0 +1,37 @@
+/* This is like diagnostic-pragma-2.c, but handles the case where everything
+   is wrapped inside a macro, which previously caused additional issues tracked
+   in PR preprocessor/82335.  */
+
+/* { dg-do compile } */
+/* { dg-additional-options "-save-temps -Wattributes -Wtype-limits" } */
+
+#define B _Pragma("GCC diagnostic push") \
+  _Pragma("GCC diagnostic ignored \"-Wattributes\"")
+#define E _Pragma("GCC diagnostic pop")
+
+#define X() B int __attribute((unknown_attr)) x; E
+#define Y   B int __attribute((unknown_attr)) y; E
+#define WRAP(x) x
+
+void test1(void)
+{
+  WRAP(X())
+  WRAP(Y)
+}
+
+/* Additional test provided on the PR.  */
+#define PRAGMA(...) _Pragma(#__VA_ARGS__)
+#define PUSH_IGN(X) PRAGMA(GCC diagnostic push) PRAGMA(GCC diagnostic ignored X)
+#define POP() PRAGMA(GCC diagnostic pop)
+#define TEST(X, Y) \
+  PUSH_IGN("-Wtype-limits") \
+  int Y = (__typeof(X))-1 < 0; \
+  POP()
+
+int test2()
+{
+  unsigned x;
+  TEST(x, i1);
+  WRAP(TEST(x, i2))
+  return i1 + i2;
+}


[PATCH] libcpp: Improve the diagnostic for poisoned identifiers [PR36887]

2023-09-19 Thread Lewis Hyatt
Hello-

This patch implements the PR's request to add more information to the
diagnostic issued for using a poisoned identifier. Bootstrapped + regtested
all languages on x86-64 Linux. Does it look OK please? Thanks!

-Lewis

-- >8 --

The PR requests an enhancement to the diagnostic issued for the use of a
poisoned identifier. Currently, we show the location of the usage, but not
the location which requested the poisoning, which would be helpful for the
user if the decision to poison an identifier was made externally, such as
in a library header.

In order to output this information, we need to remember a location_t for
each identifier that has been poisoned, and that data needs to be preserved
as well in a PCH. One option would be to add a field to struct cpp_hashnode,
but there is no convenient place to add it without increasing the size of
the struct for all identifiers. Given this facility will be needed rarely,
it seemed better to add a second hash map, which is handled PCH-wise the
same as the current one in gcc/stringpool.cc. This hash map associates a new
struct cpp_hashnode_extra with each identifier that needs one. Currently
that struct only contains the new location_t, but it could be extended in
the future if there is other ancillary data that may be convenient to put
there for other purposes.
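
The side-table idea described above can be sketched in plain C. This is only an illustration of the design choice (keep the per-identifier struct small and store rarely needed data in a second map); it is not libcpp code. The names extra_entry, extra_lookup, poison and poisoned_at are hypothetical, an unsigned line number stands in for location_t, and a fixed-size open-addressing table stands in for the real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXTRA_SLOTS 64  /* fixed capacity keeps the sketch short */

/* Hypothetical stand-in for the extra per-identifier data.  Only
   identifiers that have been poisoned ever get an entry.  */
struct extra_entry
{
  const char *name;        /* identifier spelling; NULL if slot unused */
  unsigned poisoned_line;  /* stand-in for a location_t */
};

static struct extra_entry extra_table[EXTRA_SLOTS];

static unsigned
hash_name (const char *s)
{
  unsigned h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h;
}

/* Find the entry for NAME using linear probing.  If ALLOC is nonzero,
   create the entry when it is missing.  Returns NULL if the entry is
   absent (and ALLOC is zero) or if the table is full.  */
struct extra_entry *
extra_lookup (const char *name, int alloc)
{
  assert (name != NULL);
  unsigned start = hash_name (name) % EXTRA_SLOTS;
  for (unsigned i = 0; i < EXTRA_SLOTS; i++)
    {
      struct extra_entry *e = &extra_table[(start + i) % EXTRA_SLOTS];
      if (e->name == NULL)
        {
          if (!alloc)
            return NULL;
          e->name = name;
          return e;
        }
      if (strcmp (e->name, name) == 0)
        return e;
    }
  return NULL;
}

/* Record where NAME was poisoned (what the pragma handler would do).  */
void
poison (const char *name, unsigned line)
{
  struct extra_entry *e = extra_lookup (name, 1);
  if (e)
    e->poisoned_line = line;
}

/* Return the line where NAME was poisoned, or 0 if it never was
   (what the diagnostic code would query before emitting its note).  */
unsigned
poisoned_at (const char *name)
{
  struct extra_entry *e = extra_lookup (name, 0);
  return e ? e->poisoned_line : 0;
}
```

In this sketch, identifiers that are never poisoned cost nothing beyond the unused slots, which mirrors the motivation given for not growing struct cpp_hashnode.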

libcpp/ChangeLog:

PR preprocessor/36887
* directives.cc (do_pragma_poison): Store in the extra hash map the
location from which an identifier has been poisoned.
* lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
for the use of a poisoned identifier, also add a note indicating the
location from which it was poisoned.
* identifiers.cc (alloc_node): Convert to template function.
(_cpp_init_hashtable): Handle the new extra hash map.
(_cpp_destroy_hashtable): Likewise.
* include/cpplib.h (struct cpp_hashnode_extra): New struct.
(cpp_create_reader): Update prototype to...
* init.cc (cpp_create_reader): ...accept an argument for the extra
hash table and pass it to _cpp_init_hashtable.
* include/symtab.h (ht_lookup): New overload for convenience.
* internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
(_cpp_init_hashtable): Adjust prototype.

gcc/c-family/ChangeLog:

PR preprocessor/36887
* c-opts.cc (c_common_init_options): Pass new extra hash map
argument to cpp_create_reader().

gcc/ChangeLog:

PR preprocessor/36887
* toplev.h (ident_hash_extra): Declare...
* stringpool.cc (ident_hash_extra): ...this new global variable.
(init_stringpool): Handle ident_hash_extra as well as ident_hash.
(ggc_mark_stringpool): Likewise.
(ggc_purge_stringpool): Likewise.
(struct string_pool_data_extra): New struct.
(spd2): New GC root variable.
(gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
analogous to how spd is used to handle ident_hash.
(gt_pch_restore_stringpool): Likewise.

gcc/testsuite/ChangeLog:

PR preprocessor/36887
* c-c++-common/cpp/diagnostic-poison.c: New test.
* g++.dg/pch/pr36887.C: New test.
* g++.dg/pch/pr36887.Hs: New test.
---
 libcpp/directives.cc  |  3 ++
 libcpp/identifiers.cc | 42 +++--
 libcpp/include/cpplib.h   | 21 ++---
 libcpp/include/symtab.h   |  6 +++
 libcpp/init.cc|  4 +-
 libcpp/internal.h |  8 +++-
 libcpp/lex.cc | 10 -
 gcc/c-family/c-opts.cc|  2 +-
 gcc/stringpool.cc | 45 +++
 gcc/toplev.h  |  3 +-
 .../c-c++-common/cpp/diagnostic-poison.c  | 13 ++
 gcc/testsuite/g++.dg/pch/pr36887.C|  3 ++
 gcc/testsuite/g++.dg/pch/pr36887.Hs   |  1 +
 13 files changed, 134 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/diagnostic-poison.c
 create mode 100644 gcc/testsuite/g++.dg/pch/pr36887.C
 create mode 100644 gcc/testsuite/g++.dg/pch/pr36887.Hs

diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index ee5419d1f40..c5c938fda1d 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -1737,6 +1737,9 @@ do_pragma_poison (cpp_reader *pfile)
   NODE_NAME (hp));
   _cpp_free_definition (hp);
   hp->flags |= NODE_POISONED | NODE_DIAGNOSTIC;
+  const auto data = (cpp_hashnode_extra *)
+   ht_lookup (pfile->extra_hash_table, hp->ident, HT_ALLOC);
+  data->poisoned_loc = tok->src_loc;
 }
   pfile->state.poisoned_ok = 0;
 }
diff --git a/libcpp/identifiers.cc b/libcpp/identifiers.cc
index 7eccaa9bfd3..10cbbdf703d 100644
--- a/libcpp/identifiers.cc
+++ b/libcpp/identifiers.cc
@@ -27,24 +27,22 @@ 

Re: [PATCH] libcpp: Fix ICE on #include after a line marker directive [PR61474]

2023-09-19 Thread Lewis Hyatt
On Tue, Sep 19, 2023 at 1:13 PM Marek Polacek wrote:
>
> On Tue, Sep 19, 2023 at 06:08:50PM +0100, Richard Sandiford wrote:
> > Lewis Hyatt via Gcc-patches writes:
> > > Hello-
> > >
> > > This fixes an old PR, bootstrap + regtest on x86-64 Linux. Please let me know if it's ok? Thanks!
> > >
> > > -Lewis
> > >
> > > -- >8 --
> > >
> > > As noted in the PR, GCC will segfault if a file name is first seen in a
> > > linemarker directive, and then later seen in a normal #include.  This is
> > > because the fake include process adds the file to the cache with a null PATH
> > > member. The normal #include finds this file in the cache and then attempts
> > > to use the null PATH.  Resolve by adding the file to the cache with a unique
> > > starting directory, so that the fake entry will only be found by a
> > > subsequent fake include, not by a real one.
> > >
> > > libcpp/ChangeLog:
> > >
> > > PR preprocessor/61474
> > > * files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
> > > include files.
> > > (_cpp_fake_include): Pass a unique cpp_dir* address so
> > > the fake file will not be found when looked up for real.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR preprocessor/61474
> > > * c-c++-common/cpp/pr61474-2.h: New test.
> > > * c-c++-common/cpp/pr61474.c: New test.
> > > * c-c++-common/cpp/pr61474.h: New test.
> >
> > Neat fix!  I don't know this code very well, but I agree it looks
> > correct.  OK if no-one objects in 24 hours.
>
> Looks fine to me too, thanks Lewis.

Thank you both, much appreciated. I will push it tomorrow evening then.

-Lewis


[PATCH] libcpp: Fix ICE on #include after a line marker directive [PR61474]

2023-09-15 Thread Lewis Hyatt via Gcc-patches
Hello-

This fixes an old PR, bootstrap + regtest on x86-64 Linux. Please let me know if it's ok? Thanks!

-Lewis

-- >8 --

As noted in the PR, GCC will segfault if a file name is first seen in a
linemarker directive, and then later seen in a normal #include.  This is
because the fake include process adds the file to the cache with a null PATH
member. The normal #include finds this file in the cache and then attempts
to use the null PATH.  Resolve by adding the file to the cache with a unique
starting directory, so that the fake entry will only be found by a
subsequent fake include, not by a real one.
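
The approach can be illustrated with a small stand-alone C sketch. It assumes a cache whose lookup key is the pair (file name, starting directory), and uses the address of a dedicated static object as the starting directory for fake includes. The types and functions here (struct dir, cache_find, fake_include, real_include) are hypothetical simplifications, not the actual libcpp data structures.

```c
#include <stddef.h>
#include <string.h>

struct dir { const char *name; };  /* simplified stand-in for cpp_dir */

struct cache_entry
{
  const char *fname;
  const struct dir *start_dir;  /* part of the lookup key */
  const char *path;             /* NULL for a fake entry */
};

#define MAX_ENTRIES 16
static struct cache_entry cache[MAX_ENTRIES];
static size_t n_entries;

/* Only the address of this object matters; its contents are never read.  */
static const struct dir fake_dir = { "<fake>" };

/* Look up FNAME as seen from START_DIR; both must match.  */
const struct cache_entry *
cache_find (const char *fname, const struct dir *start_dir)
{
  for (size_t i = 0; i < n_entries; i++)
    if (cache[i].start_dir == start_dir
        && strcmp (cache[i].fname, fname) == 0)
      return &cache[i];
  return NULL;
}

const struct cache_entry *
cache_add (const char *fname, const struct dir *start_dir, const char *path)
{
  if (n_entries == MAX_ENTRIES)
    return NULL;
  cache[n_entries].fname = fname;
  cache[n_entries].start_dir = start_dir;
  cache[n_entries].path = path;
  return &cache[n_entries++];
}

/* A linemarker mentions FNAME: record it under the unique fake key.  */
const struct cache_entry *
fake_include (const char *fname)
{
  const struct cache_entry *e = cache_find (fname, &fake_dir);
  return e ? e : cache_add (fname, &fake_dir, NULL);
}

/* A real #include: never matches a fake entry, because no real lookup
   passes &fake_dir as its starting directory.  The sketch pretends the
   file is found directly in START_DIR instead of searching a chain.  */
const struct cache_entry *
real_include (const char *fname, const struct dir *start_dir)
{
  const struct cache_entry *e = cache_find (fname, start_dir);
  return e ? e : cache_add (fname, start_dir, start_dir->name);
}
```

With this keying, a fake include followed by a real include of the same name produces two distinct entries, and the real one always has a non-null path.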

libcpp/ChangeLog:

PR preprocessor/61474
* files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
include files.
(_cpp_fake_include): Pass a unique cpp_dir* address so
the fake file will not be found when looked up for real.

gcc/testsuite/ChangeLog:

PR preprocessor/61474
* c-c++-common/cpp/pr61474-2.h: New test.
* c-c++-common/cpp/pr61474.c: New test.
* c-c++-common/cpp/pr61474.h: New test.
---
 libcpp/files.cc| 11 +--
 gcc/testsuite/c-c++-common/cpp/pr61474-2.h |  1 +
 gcc/testsuite/c-c++-common/cpp/pr61474.c   |  5 +
 gcc/testsuite/c-c++-common/cpp/pr61474.h   |  6 ++
 4 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474-2.h
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.h

diff --git a/libcpp/files.cc b/libcpp/files.cc
index 43a8894b7de..27301d79fa4 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -541,7 +541,9 @@ _cpp_find_file (cpp_reader *pfile, const char *fname, cpp_dir *start_dir,
 = (kind == _cpp_FFK_PRE_INCLUDE
|| (pfile->buffer && pfile->buffer->file->implicit_preinclude));
 
-  if (kind != _cpp_FFK_FAKE)
+  if (kind == _cpp_FFK_FAKE)
+file->dont_read = true;
+  else
 /* Try each path in the include chain.  */
 for (;;)
   {
@@ -1490,7 +1492,12 @@ cpp_clear_file_cache (cpp_reader *pfile)
 void
 _cpp_fake_include (cpp_reader *pfile, const char *fname)
 {
-  _cpp_find_file (pfile, fname, pfile->buffer->file->dir, 0, _cpp_FFK_FAKE, 0);
+  /* It does not matter what are the contents of fake_source_dir, it will never
+ be inspected; we just use its address to uniquely signify that this file
+ was added as a fake include, so a later call to _cpp_find_file (to include
+ the file for real) won't find the fake one in the hash table.  */
+  static cpp_dir fake_source_dir;
+  _cpp_find_file (pfile, fname, &fake_source_dir, 0, _cpp_FFK_FAKE, 0);
 }
 
 /* Not everyone who wants to set system-header-ness on a buffer can
diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474-2.h b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
new file mode 100644
index 000..6f70f09beec
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
@@ -0,0 +1 @@
+#pragma once
diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.c b/gcc/testsuite/c-c++-common/cpp/pr61474.c
new file mode 100644
index 000..f835a40fc7a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr61474.c
@@ -0,0 +1,5 @@
+/* { dg-do preprocess } */
+#include "pr61474.h"
+/* Make sure that the file can be included for real, after it was
+   fake-included by the linemarker directives in pr61474.h.  */
+#include "pr61474-2.h"
diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.h b/gcc/testsuite/c-c++-common/cpp/pr61474.h
new file mode 100644
index 000..d9e8c3a1fec
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr61474.h
@@ -0,0 +1,6 @@
+/* Create a fake include for pr61474-2.h and exercise looking it up.  */
+/* Use #pragma once to check also that the fake-include entry in the file
+   cache does not cause a problem in libcpp/files.cc:has_unique_contents().  */
+#pragma once
+# 1 "pr61474-2.h" 1
+# 2 "pr61474-2.h" 1


Re: [PATCH] preprocessor: c++: Support `#pragma GCC target' macros [PR87299]

2023-09-12 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 8, 2023 at 5:53 PM Jason Merrill wrote:
>
> On 7/31/23 22:22, Lewis Hyatt via Gcc-patches wrote:
> > `#pragma GCC target' is not currently handled in preprocess-only mode (e.g.,
> > when running gcc -E or gcc -save-temps). As noted in the PR, this means that
> > if the target pragma defines any macros, those macros are not effective in
> > preprocess-only mode. Similarly, such macros are not effective when
> > compiling with C++ (even when compiling without -save-temps), because C++
> > does not process the pragma until after all tokens have been obtained from
> > libcpp, at which point it is too late for macro expansion to take place.
> >
> > Since r13-1544 and r14-2893, there is a general mechanism to handle pragmas
> > under these conditions as well, so resolve the PR by using the new "early
> > pragma" support.
> >
> > toplev.cc required some changes because the target-specific handlers for
> > `#pragma GCC target' may call target_reinit(), and toplev.cc was not expecting
> > that function to be called in preprocess-only mode.
> >
> > I added some additional testcases from the PR for x86. The other targets
> > that support `#pragma GCC target' (aarch64, arm, nios2, powerpc, s390)
> > already had tests verifying that the pragma sets macros as expected; here I
> > have added -save-temps to some of them, to test that it now works in
> > preprocess-only mode as well.
> >
> > gcc/c-family/ChangeLog:
> >
> >   PR preprocessor/87299
> >   * c-pragma.cc (init_pragma): Register `#pragma GCC target' and
> >   related pragmas in preprocess-only mode, and enable early handling.
> >   (c_reset_target_pragmas): New function refactoring code from...
> >   (handle_pragma_reset_options): ...here.
> >   * c-pragma.h (c_reset_target_pragmas): Declare.
> >
> > gcc/cp/ChangeLog:
> >
> >   PR preprocessor/87299
> >   * parser.cc (cp_lexer_new_main): Call c_reset_target_pragmas ()
> >   after preprocessing is complete, before starting compilation.
> >
> > gcc/ChangeLog:
> >
> >   PR preprocessor/87299
> >   * toplev.cc (no_backend): New static global.
> >   (finalize): Remove argument no_backend, which is now a
> >   static global.
> >   (process_options): Likewise.
> >   (do_compile): Likewise.
> >   (target_reinit): Don't do anything in preprocess-only mode.
> >   (toplev::main): Adapt to no_backend change.
> >   (toplev::finalize): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR preprocessor/87299
> >   * c-c++-common/pragma-target-1.c: New test.
> >   * c-c++-common/pragma-target-2.c: New test.
> >   * g++.target/i386/pr87299-1.C: New test.
> >   * g++.target/i386/pr87299-2.C: New test.
> >   * gcc.target/i386/pr87299-1.c: New test.
> >   * gcc.target/i386/pr87299-2.c: New test.
> >   * gcc.target/s390/target-attribute/tattr-2.c: Add -save-temps to the
> >   options, to test preprocess-only mode as well.
> >   * gcc.target/aarch64/pragma_cpp_predefs_1.c: Likewise.
> >   * gcc.target/arm/pragma_arch_attribute.c: Likewise.
> >   * gcc.target/nios2/custom-fp-2.c: Likewise.
> >   * gcc.target/powerpc/float128-3.c: Likewise.
> > ---
> >
> > Notes:
> >  Hello-
> >
> >  This patch fixes the PR by enabling early pragma handling for `#pragma GCC
> >  target' and related pragmas such as `#pragma GCC push_options'. I did not
> >  need to touch any target-specific code, however I did need to make a change
> >  to toplev.cc, affecting all targets, to make it safe to call target_reinit()
> >  in preprocess-only mode. (Otherwise, it would be necessary to modify the
> >  implementation of target pragmas in every target, to avoid this code path.)
> >  That was the only complication I ran into.
> >
> >  Regarding testing, I did: (thanks to GCC compile farm for the non-x86
> >  targets)
> >
> >  bootstrap + regtest all languages - x86_64-pc-linux-gnu
> >  bootstrap + regtest c/c++ - powerpc64le-unknown-linux-gnu,
> >  aarch64-unknown-linux-gnu
> >
> >  The following backends also implement this pragma so ought to be tested:
> >  arm
> >  nios2
> >  s390
> >
> >  I am not able to test those directly. I did add coverage to their testsuites
> &

Ping: [PATCH] testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]

2023-09-08 Thread Lewis Hyatt via Gcc-patches
Hello-

May I please ping this one? It's adding a testcase prior to closing
the PR. Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628488.html

-Lewis

On Fri, Aug 25, 2023 at 4:46 PM Lewis Hyatt wrote:
>
> Hello-
>
> This is adding a testcase for a PR that was already incidentally fixed. OK
> to commit please? Thanks...
>
> -Lewis
>
> -- >8 --
>
> The PR was fixed by r12-5454. Since the fix was somewhat incidental,
> although related, add a testcase from PR90400 too before closing it out.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/90400
> * c-c++-common/cpp/pr90400.c: New test.
> ---
>  gcc/testsuite/c-c++-common/cpp/pr90400.c | 14 ++
>  1 file changed, 14 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr90400.c
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr90400.c b/gcc/testsuite/c-c++-common/cpp/pr90400.c
> new file mode 100644
> index 000..4f2cab8d6ab
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr90400.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-save-temps" } */
> +/* PR preprocessor/90400 */
> +
> +#define OUTER(x) x
> +#define FOR(x) _Pragma ("GCC unroll 0") for (x)
> +void f ()
> +{
> +/* If the pragma were to be seen prior to the expansion of FOR, as was
> +   the case before r12-5454, then the unroll pragma would complain
> +   because the immediately following statement would be ";" rather than
> +   a loop.  */
> +OUTER (; FOR (int i = 0; i != 1; ++i);) /* { dg-bogus {statement expected before ';' token} } */
> +}


[PATCH] testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]

2023-08-25 Thread Lewis Hyatt via Gcc-patches
Hello-

This is adding a testcase for a PR that was already incidentally fixed. OK
to commit please? Thanks...

-Lewis

-- >8 --

The PR was fixed by r12-5454. Since the fix was somewhat incidental,
although related, add a testcase from PR90400 too before closing it out.

gcc/testsuite/ChangeLog:

PR preprocessor/90400
* c-c++-common/cpp/pr90400.c: New test.
---
 gcc/testsuite/c-c++-common/cpp/pr90400.c | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr90400.c

diff --git a/gcc/testsuite/c-c++-common/cpp/pr90400.c b/gcc/testsuite/c-c++-common/cpp/pr90400.c
new file mode 100644
index 000..4f2cab8d6ab
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr90400.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-save-temps" } */
+/* PR preprocessor/90400 */
+
+#define OUTER(x) x
+#define FOR(x) _Pragma ("GCC unroll 0") for (x)
+void f ()
+{
+/* If the pragma were to be seen prior to the expansion of FOR, as was
+   the case before r12-5454, then the unroll pragma would complain
+   because the immediately following statement would be ";" rather than
+   a loop.  */
+OUTER (; FOR (int i = 0; i != 1; ++i);) /* { dg-bogus {statement expected before ';' token} } */
+}


Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-23 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 03:39:40PM -0400, David Malcolm wrote:
> On Tue, 2023-08-15 at 13:58 -0400, Lewis Hyatt wrote:
> > On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote:
> > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > > Class file_cache_slot in input.cc is used to query specific lines
> > > > of source
> > > > code from a file when needed by diagnostics infrastructure. This
> > > > will be
> > > > extended in a subsequent patch to support obtaining the source
> > > > code from
> > > > in-memory generated buffers rather than from a file. The present
> > > > patch
> > > > refactors class file_cache_slot, putting most of the logic into a
> > > > new base
> > > > class cache_data_source, in preparation for reusing that code in
> > > > the next
> > > > patch. There is no change in functionality yet.
> > > > 
> 
> [...snip...]
> 
> > > 
> > > I confess I had to reread both this and patch 4/8 to make sense of
> > > this; this is probably one of those cases where it's harder to read
> > > in
> > > patch form than as source, but I think I now understand the new
> > > implementation.
> > 
> > Yes, sorry about that. I hope at least splitting into two patches
> > here made it
> > a little easier.
> > 
> > > 
> > > Did you try testing this with valgrind (e.g. "make selftest-
> > > valgrind")?
> > > 
> > 
> > Oh interesting, was not aware of this. I think it shows that new
> > leaks were
> > not introduced with the patch series.
> > 
> 
> [...snip...]
> 
> > 
> > 
> > > I don't think we have any selftest coverage for "\r" in the line-
> > > break
> > > handling; that would be good to add.
> > > 
> > > This patch is OK for trunk once the rest of the kit is approved.
> > 
> > Thank you. To be clear, were you suggesting to add selftest coverage
> > for \r
> > endings now, or in a follow up?
> 
> The former, please, so that we can sure that the patch doesn't
> introduce any buffer overreads etc.
> 
> Thanks
> Dave
>

The following (incremental to patch 5/8 or after) adds selftest coverage for
alternate line endings. I hope things aren't too unclear this way; I can
resend updated versions of some or all of the patches from scratch, if useful.
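
To illustrate what such a test exercises, here is a minimal stand-alone C sketch. It builds a three-line buffer with caller-chosen line endings, much as the selftest below does with snprintf, and then reads lines back by number. The helper names make_content and get_line are hypothetical, and the splitting policy (split on "\n", drop one trailing "\r") is an assumption made for the sketch rather than a description of what input.cc does; in particular, a lone "\r" is not treated as a line break here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write three fixed lines into OUT, separated by
   the line endings E1 and E2.  Returns the content length, or -1 if
   it did not fit in OUT_SIZE bytes.  */
int
make_content (char *out, size_t out_size, const char *e1, const char *e2)
{
  int n = snprintf (out, out_size, "%s%s%s%s%s",
                    "01234567890123456789", e1,
                    "This is the test text", e2,
                    "This is the 3rd line");
  return (n >= 0 && (size_t) n < out_size) ? n : -1;
}

/* Hypothetical helper: locate 1-based line LINE_NUM in BUF[0..LEN).
   On success, store the start of the line in *START and its length
   (excluding the line ending) in *OUT_LEN, and return 1.  Return 0 if
   there is no such line.  */
int
get_line (const char *buf, size_t len, size_t line_num,
          const char **start, size_t *out_len)
{
  const char *p = buf;
  const char *end = buf + len;

  if (line_num == 0)
    return 0;

  for (size_t cur = 1; p < end; cur++)
    {
      const char *nl = memchr (p, '\n', (size_t) (end - p));
      const char *line_end = nl ? nl : end;

      if (cur == line_num)
        {
          /* Treat "\r\n" as a single line ending.  */
          if (line_end > p && line_end[-1] == '\r')
            line_end--;
          *start = p;
          *out_len = (size_t) (line_end - p);
          return 1;
        }
      if (!nl)
        break;
      p = nl + 1;
    }
  return 0;
}
```

Passing the buffer length explicitly, instead of relying on a terminating NUL, keeps the reader from running past the end of the data, which is the kind of overread the requested coverage is meant to catch.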

AFAIK this is the current status of things:

Patch 1/8: Reviewed, updated version incorporating feedback has not been acked
yet, at: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627250.html

Patch 2/8: OKed, pending tweak to reject fixit hints in generated data, which
was sent incrementally here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627405.html

Patch 3/8: OKed, pending new selftest attached to this email.

Patch 4/8: OKed, pending tweak to assert on non-NULL buffers which was sent
incrementally here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628283.html

Patch 5/8: OKed

Patch 6/8: OKed

Patch 7/8: Not reviewed yet

Patch 8/8: Waiting additional feedback from you, perhaps SARIF need not worry
about this for now and should just ignore generated data locations.

Thanks again for taking the time to go through this, I hope it will prove
worth it.

-Lewis

-- >8 --

gcc/ChangeLog:

* input.cc (test_reading_source_line): Test additional cases,
including generated data and alternate line endings.
(input_cc_tests): Adapt to test_reading_source_line() changes.

diff --git a/gcc/input.cc b/gcc/input.cc
index 4c99df7a205..72274732c6c 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -2392,30 +2392,51 @@ test_make_location_nonpure_range_endpoints (const line_table_case &case_)
 /* Verify reading of input files (e.g. for caret-based diagnostics).  */
 
 static void
-test_reading_source_line ()
+test_reading_source_line (bool generated, const char *e1, const char *e2)
 {
   /* Create a tempfile and write some text to it.  */
+  const char *line1 = "01234567890123456789";
+  const char *line2 = "This is the test text";
+  const char *line3 = "This is the 3rd line";
+  char content[72];
+  const int content_len = snprintf (content, sizeof (content),
+   "%s%s%s%s%s",
+   line1, e1, line2, e2, line3);
+  ASSERT_LT (content_len, (int)sizeof (content));
   temp_source_file tmp (SELFTEST_LOCATION, ".txt",
-   "01234567890123456789\n"
-   "This is the test text\n"
-   "This is the 3rd line");
+   content, content_len, gen

Re: [PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

2023-08-23 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 04:08:47PM -0400, Lewis Hyatt wrote:
> On Tue, Aug 15, 2023 at 3:46 PM David Malcolm wrote:
> >
> > On Tue, 2023-08-15 at 14:15 -0400, Lewis Hyatt wrote:
> > > On Tue, Aug 15, 2023 at 12:15:15PM -0400, David Malcolm wrote:
> > > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > > > This patch enhances location_get_source_line(), which is the
> > > > > primary
> > > > > interface provided by the diagnostics infrastructure to obtain
> > > > > the line of
> > > > > source code corresponding to a given location, so that it
> > > > > understands
> > > > > generated data locations in addition to normal file-based
> > > > > locations. This
> > > > > involves changing the argument to location_get_source_line() from
> > > > > a plain
> > > > > file name, to a source_id object that can represent either type
> > > > > of location.
> > > > >
> >
> > [...]
> >
> > > > >
> > > > >
> > > > > diff --git a/gcc/input.cc b/gcc/input.cc
> > > > > index 9377020b460..790279d4273 100644
> > > > > --- a/gcc/input.cc
> > > > > +++ b/gcc/input.cc
> > > > > @@ -207,6 +207,28 @@ private:
> > > > >void maybe_grow ();
> > > > >  };
> > > > >
> > > > > +/* This is the implementation of cache_data_source for generated
> > > > > +   data that is already in memory.  */
> > > > > +class data_cache_slot final : public cache_data_source
> > > >
> > > > It occurred to me: why are we caching accessing a buffer that's
> > > > already
> > > > in memory - but we're also caching the line-splitting information,
> > > > and
> > > > providing the line-splitting algorithm with a consistent interface
> > > > to
> > > > the data, right?
> > > >
> > >
> > > Yeah, for the current _Pragma use case, multi-line buffers are not
> > > going to
> > > be common, but they can occur. I was mainly motivated by the
> > > consistent
> > > interface, and by the assumption that the overhead is not critical
> > > given a
> > > diagnostic is being issued.
> >
> > (nods)
> >
> > >
> > > > [...snip...]
> > > >
> > > > > @@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file
> > > > > (const char *file_path)
> > > > >global_dc->m_file_cache->forcibly_evict_file (file_path);
> > > > >  }
> > > > >
> > > > > +void
> > > > > +diagnostics_file_cache_forcibly_evict_data (const char *data,
> > > > > +   unsigned int
> > > > > data_len)
> > > > > +{
> > > > > +  if (!global_dc->m_file_cache)
> > > > > +return;
> > > > > +  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
> > > >
> > > > Maybe we should rename diagnostic_context's m_file_cache to
> > > > m_source_cache?  (and class file_cache for that matter?)  But if
> > > > so,
> > > > that can/should be a followup/separate patch.
> > > >
> > >
> > > Yes, we should. Believe it or not, I was trying to minimize the size
> > > of the
> > > patch :)
> >
> > :)
> >
> > Thanks for splitting it up, BTW.
> >
> > [...]
> >
> >
> > > >
> > > > > @@ -912,26 +1000,22 @@ cache_data_source::read_line_num (size_t
> > > > > line_num,
> > > > > If the function fails, a NULL char_span is returned.  */
> > > > >
> > > > >  char_span
> > > > > -location_get_source_line (const char *file_path, int line)
> > > > > +location_get_source_line (source_id src, int line)
> > > > >  {
> > > > > -  const char *buffer = NULL;
> > > > > -  ssize_t len;
> > > > > -
> > > > > -  if (line == 0)
> > > > > -return char_span (NULL, 0);
> > > > > -
> > > > > -  if (file_path == NULL)
> > > > > -return char_span (NULL, 0);
> > > > > +  const char_span fail (nullptr, 0);
> > > > > +  if (!src || line <= 0)
> > > > > +re

Re: [PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

2023-08-15 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 3:46 PM David Malcolm wrote:
>
> On Tue, 2023-08-15 at 14:15 -0400, Lewis Hyatt wrote:
> > On Tue, Aug 15, 2023 at 12:15:15PM -0400, David Malcolm wrote:
> > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > > This patch enhances location_get_source_line(), which is the
> > > > primary
> > > > interface provided by the diagnostics infrastructure to obtain
> > > > the line of
> > > > source code corresponding to a given location, so that it
> > > > understands
> > > > generated data locations in addition to normal file-based
> > > > locations. This
> > > > involves changing the argument to location_get_source_line() from
> > > > a plain
> > > > file name, to a source_id object that can represent either type
> > > > of location.
> > > >
>
> [...]
>
> > > >
> > > >
> > > > diff --git a/gcc/input.cc b/gcc/input.cc
> > > > index 9377020b460..790279d4273 100644
> > > > --- a/gcc/input.cc
> > > > +++ b/gcc/input.cc
> > > > @@ -207,6 +207,28 @@ private:
> > > >void maybe_grow ();
> > > >  };
> > > >
> > > > +/* This is the implementation of cache_data_source for generated
> > > > +   data that is already in memory.  */
> > > > +class data_cache_slot final : public cache_data_source
> > >
> > > It occurred to me: why are we caching accessing a buffer that's
> > > already
> > > in memory - but we're also caching the line-splitting information,
> > > and
> > > providing the line-splitting algorithm with a consistent interface
> > > to
> > > the data, right?
> > >
> >
> > Yeah, for the current _Pragma use case, multi-line buffers are not
> > going to
> > be common, but they can occur. I was mainly motivated by the
> > consistent
> > interface, and by the assumption that the overhead is not critical
> > given a
> > diagnostic is being issued.
>
> (nods)
>
> >
> > > [...snip...]
> > >
> > > > @@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file
> > > > (const char *file_path)
> > > >global_dc->m_file_cache->forcibly_evict_file (file_path);
> > > >  }
> > > >
> > > > +void
> > > > +diagnostics_file_cache_forcibly_evict_data (const char *data,
> > > > +   unsigned int
> > > > data_len)
> > > > +{
> > > > +  if (!global_dc->m_file_cache)
> > > > +return;
> > > > +  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
> > >
> > > Maybe we should rename diagnostic_context's m_file_cache to
> > > m_source_cache?  (and class file_cache for that matter?)  But if
> > > so,
> > > that can/should be a followup/separate patch.
> > >
> >
> > Yes, we should. Believe it or not, I was trying to minimize the size
> > of the
> > patch :)
>
> :)
>
> Thanks for splitting it up, BTW.
>
> [...]
>
>
> > >
> > > > @@ -912,26 +1000,22 @@ cache_data_source::read_line_num (size_t
> > > > line_num,
> > > > If the function fails, a NULL char_span is returned.  */
> > > >
> > > >  char_span
> > > > -location_get_source_line (const char *file_path, int line)
> > > > +location_get_source_line (source_id src, int line)
> > > >  {
> > > > -  const char *buffer = NULL;
> > > > -  ssize_t len;
> > > > -
> > > > -  if (line == 0)
> > > > -return char_span (NULL, 0);
> > > > -
> > > > -  if (file_path == NULL)
> > > > -return char_span (NULL, 0);
> > > > +  const char_span fail (nullptr, 0);
> > > > +  if (!src || line <= 0)
> > > > +return fail;
> > >
> > > Looking at source_id's operator bool, are there effectively three
> > > kinds
> > > of source_id?
> > >
> > > (a) file names
> > > (b) generated buffer
> > > (c) NULL == m_filename_or_buffer
> > >
> > > What does (c) mean?  Is it a "something's gone wrong/error" state?
> > > Or
> > > is this more a special-case of (a)? (in that the m_len for such a
> > > case
> > > would be zero)
> > >
> > > Should source_id's 2-param ctor h

Re: [PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

2023-08-15 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 12:15:15PM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > This patch enhances location_get_source_line(), which is the primary
> > interface provided by the diagnostics infrastructure to obtain the line of
> > source code corresponding to a given location, so that it understands
> > generated data locations in addition to normal file-based locations. This
> > involves changing the argument to location_get_source_line() from a plain
> > file name, to a source_id object that can represent either type of location.
> > 
> > gcc/ChangeLog:
> > 
> > * input.cc (class data_cache_slot): New class.
> > (file_cache::lookup_data): New function.
> > (diagnostics_file_cache_forcibly_evict_data): New function.
> > (file_cache::forcibly_evict_data): New function.
> > (file_cache::evicted_cache_tab_entry): Generalize (via a template)
> > to work for both file_cache_slot and data_cache_slot.
> > (file_cache::add_file): Adapt for new interface to
> > evicted_cache_tab_entry.
> > (file_cache::add_data): New function.
> > (data_cache_slot::create): New function.
> > (file_cache::file_cache): Support the new m_data_slots member.
> > (file_cache::~file_cache): Likewise.
> > (file_cache::lookup_or_add_data): New function.
> > (file_cache::lookup_or_add): New function that calls either
> > lookup_or_add_data or lookup_or_add_file as appropriate.
> > (location_get_source_line): Change the FILE_PATH argument to a
> > source_id SRC, and use it to support obtaining source lines from
> > generated data as well as from files.
> > (location_compute_display_column): Support generated data using the
> > new features of location_get_source_line.
> > (dump_location_info): Likewise.
> > * input.h (location_get_source_line): Adjust prototype. Add a new
> > convenience overload taking an expanded_location.
> > (class cache_data_source): Declare.
> > (class data_cache_slot): Declare.
> > (class file_cache): Declare new members.
> > (diagnostics_file_cache_forcibly_evict_data): Declare.
> > ---
> >  gcc/input.cc | 171 ---
> >  gcc/input.h  |  23 +--
> >  2 files changed, 153 insertions(+), 41 deletions(-)
> > 
> > diff --git a/gcc/input.cc b/gcc/input.cc
> > index 9377020b460..790279d4273 100644
> > --- a/gcc/input.cc
> > +++ b/gcc/input.cc
> > @@ -207,6 +207,28 @@ private:
> >void maybe_grow ();
> >  };
> >  
> > +/* This is the implementation of cache_data_source for generated
> > +   data that is already in memory.  */
> > +class data_cache_slot final : public cache_data_source
> 
> It occurred to me: why are we caching accessing a buffer that's already
> in memory - but we're also caching the line-splitting information, and
> providing the line-splitting algorithm with a consistent interface to
> the data, right?
>

Yeah, for the current _Pragma use case, multi-line buffers are not going to
be common, but they can occur. I was mainly motivated by the consistent
interface, and by the assumption that the overhead is not critical given a
diagnostic is being issued.
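
To make the point about caching the line-splitting information concrete, here is a rough sketch, assuming a simplified interface. The class buffer_line_cache and its members are hypothetical names for illustration only and are not the data_cache_slot from the patch. It scans the in-memory buffer lazily and remembers where each line starts, so repeated lookups do not rescan the data:

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical illustration only; not the data_cache_slot from the patch.
class buffer_line_cache
{
public:
  explicit buffer_line_cache (std::string_view data) : m_data (data) {}

  // Return line LINE_NUM (1-based) without its trailing newline.  If there
  // is no such line, return a view whose data () is null.
  std::string_view get_line (std::size_t line_num)
  {
    if (line_num == 0)
      return {};
    // Scan forward only as far as needed, remembering each line start.
    while (m_starts.size () < line_num && m_scanned < m_data.size ())
      {
        m_starts.push_back (m_scanned);
        std::size_t nl = m_data.find ('\n', m_scanned);
        m_scanned = (nl == std::string_view::npos) ? m_data.size () : nl + 1;
      }
    if (m_starts.size () < line_num)
      return {};
    std::size_t begin = m_starts[line_num - 1];
    // The last indexed line ends where scanning stopped.
    std::size_t end
      = (line_num < m_starts.size ()) ? m_starts[line_num] : m_scanned;
    std::string_view line = m_data.substr (begin, end - begin);
    if (!line.empty () && line.back () == '\n')
      line.remove_suffix (1);
    return line;
  }

  // Number of line starts recorded so far.
  std::size_t lines_indexed () const { return m_starts.size (); }

private:
  std::string_view m_data;            // The generated buffer; not owned.
  std::vector<std::size_t> m_starts;  // Offsets of the lines seen so far.
  std::size_t m_scanned = 0;          // Offset where scanning will resume.
};
```

The same lookup interface could sit in front of a file-backed source, which is the consistency argument made above; the sketch does not attempt the eviction or use-count logic of the real cache.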

> [...snip...]
> 
> > @@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file (const char 
> > *file_path)
> >global_dc->m_file_cache->forcibly_evict_file (file_path);
> >  }
> >  
> > +void
> > +diagnostics_file_cache_forcibly_evict_data (const char *data,
> > +   unsigned int data_len)
> > +{
> > +  if (!global_dc->m_file_cache)
> > +return;
> > +  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
> 
> Maybe we should rename diagnostic_context's m_file_cache to
> m_source_cache?  (and class file_cache for that matter?)  But if so,
> that can/should be a followup/separate patch.
>

Yes, we should. Believe it or not, I was trying to minimize the size of the
patch :) So I didn't make such changes, but they will make things more
clear.

> [...snip...]
>  
> > @@ -525,10 +582,22 @@ file_cache_slot::create (const file_cache::input_context &in_context,
> >return true;
> >  }
> >  
> > +void
> > +data_cache_slot::create (const char *data, unsigned int data_len,
> > +unsigned int highest_use_count)
> > +{
> > +  reset ();
> > +  on_create (highest_use

Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-15 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > Class file_cache_slot in input.cc is used to query specific lines of source
> > code from a file when needed by diagnostics infrastructure. This will be
> > extended in a subsequent patch to support obtaining the source code from
> > in-memory generated buffers rather than from a file. The present patch
> > refactors class file_cache_slot, putting most of the logic into a new base
> > class cache_data_source, in preparation for reusing that code in the next
> > patch. There is no change in functionality yet.
> > 
> > gcc/ChangeLog:
> > 
> > * input.cc (class file_cache_slot): Refactor functionality into a
> > new base class...
> > (class cache_data_source): ...here.
> > (file_cache::forcibly_evict_file): Adapt for refactoring.
> > (file_cache_slot::evict): Renamed to...
> > (file_cache_slot::reset): ...this, and partially refactored into
> > base class...
> > (cache_data_source::reset): ...here.
> > (file_cache_slot::get_full_file_content): Moved into base class...
> > (cache_data_source::get_full_file_content): ...here.
> > (file_cache_slot::create): Adapt for refactoring.
> > (file_cache_slot::file_cache_slot): Refactor partially into...
> > (cache_data_source::cache_data_source): ...here.
> > (file_cache_slot::~file_cache_slot): Refactor partially into...
> > (cache_data_source::~cache_data_source): ...here.
> > (file_cache_slot::needs_read_p): Remove.
> > (file_cache_slot::needs_grow_p): Remove.
> > (file_cache_slot::maybe_grow): Adapt for refactoring.
> > (file_cache_slot::read_data): Refactored, along with...
> > (file_cache_slot::maybe_read_data): this, into...
> > (file_cache_slot::get_more_data): ...here.
> > (find_end_of_line): Change interface to take a pair of pointers,
> > rather than a pointer + length.
> > (file_cache_slot::get_next_line): Refactored into...
> > (cache_data_source::get_next_line): ...here.
> > (file_cache_slot::goto_next_line): Refactored into...
> > (cache_data_source::goto_next_line): ...here.
> > (file_cache_slot::read_line_num): Refactored into...
> > (cache_data_source::read_line_num): ...here.
> > (location_get_source_line): Fix const-correctness as necessitated by
> > new interface.
> > ---
> >  gcc/input.cc | 513 +++
> >  1 file changed, 235 insertions(+), 278 deletions(-)
> > 
> 
> I confess I had to reread both this and patch 4/8 to make sense of
> this; this is probably one of those cases where it's harder to read in
> patch form than as source, but I think I now understand the new
> implementation.

Yes, sorry about that. I hope at least splitting into two patches here made it
a little easier.

> 
> Did you try testing this with valgrind (e.g. "make selftest-valgrind")?
>

Oh interesting, was not aware of this. I think it shows that new leaks were
not introduced with the patch series.

BEFORE patch series:
==1572278==
-fself-test: 7634593 pass(es) in 22.799240 seconds
==1572278==
==1572278== HEAP SUMMARY:
==1572278== in use at exit: 1,083,255 bytes in 2,394 blocks
==1572278==   total heap usage: 2,704,869 allocs, 2,702,475 frees, 
1,257,334,536 bytes allocated
==1572278==
==1572278== 8,032 bytes in 1 blocks are possibly lost in loss record 639 of 657
==1572278==at 0x4848899: malloc (in 
/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1572278==by 0x21FE1CB: xmalloc (xmalloc.c:149)
==1572278==by 0x21B02E0: new_buff (lex.cc:4767)
==1572278==by 0x21B02E0: _cpp_get_buff (lex.cc:4800)
==1572278==by 0x21ACC80: cpp_create_reader(c_lang, ht*, line_maps*) 
(init.cc:289)
==1572278==by 0xA64282: c_common_init_options(unsigned int, 
cl_decoded_option*) (c-opts.cc:237)
==1572278==by 0x95E479: toplev::main(int, char**) (toplev.cc:2241)
==1572278==by 0x960B2D: main (main.cc:39)
==1572278==
==1572278== LEAK SUMMARY:
==1572278==definitely lost: 0 bytes in 0 blocks
==1572278==indirectly lost: 0 bytes in 0 blocks
==1572278==  possibly lost: 8,032 bytes in 1 blocks
==1572278==still reachable: 1,075,223 bytes in 2,393 blocks
==1572278== suppressed: 0 bytes in 0 blocks
==1572278== Reachable blocks (those to which a pointer was found) are not shown.
==1572278== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1572278==
==1572278== For lists of detected and suppressed errors, rerun with

Re: [PATCH v4 8/8] diagnostics: Support generated data locations in SARIF output

2023-08-15 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 01:04:04PM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > The diagnostics routines for SARIF output need to read the source code back
> > in, so that they can generate "snippet" and "content" records, so they need 
> > to
> > be able to cope with generated data locations.  Add support for that in
> > diagnostic-format-sarif.cc.
> > 
> > gcc/ChangeLog:
> > 
> > * diagnostic-format-sarif.cc (class sarif_builder): Adapt interface
> > to support generated data locations.
> > (sarif_builder::maybe_make_physical_location_object): Change the
> > m_filenames hash_set to support generated data.
> > (sarif_builder::make_artifact_location_object): Use a source_id 
> > rather
> > than a plain file name.
> > (sarif_builder::maybe_make_region_object): Adapt to
> > expanded_location interface changes.
> > (sarif_builder::maybe_make_region_object_for_context): Likewise.
> > (sarif_builder::make_artifact_object): Likewise.
> > (sarif_builder::make_run_object): Handle generated data.
> > (sarif_builder::maybe_make_artifact_content_object): Likewise.
> > (get_source_lines): Likewise.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * c-c++-common/diagnostic-format-sarif-file-5.c: New test.
> 
> I'm not sure if generated data is allowed as part of a SARIF artefact,
> or if there's a more standard-compliant way of representing this; SARIF
> says an artefact is a "sequence of bytes addressable via a URI".
> 
> Can you post a simple example of the generated .sarif JSON please? 
> e.g. from the new test, so that we can see what it looks like.
> 
> You could run it through:
> 
>   python -m json.tool 
> 
> to format it for easier reading.

For a simple example like:

_Pragma("GCC diagnostic ignored \"-Wnot-an-option\"")

for which the normal output is:

=
In buffer generated from t.cpp:1:
:1:24: warning: unknown option after ‘#pragma GCC diagnostic’ kind 
[-Wpragmas]
1 | GCC diagnostic ignored "-Wnot-an-option"
  |^
t.cpp:1:1: note: in <_Pragma directive>
1 | _Pragma("GCC diagnostic ignored \"-Wnot-an-option\"")
  | ^~~
=

The SARIF output does not end up referencing any generated data locations,
because those are logically part of the "expansion" of the _Pragma
directive, and it doesn't output macro expansions.  In order for SARIF to
currently do something with generated data, it needs to see a generated data
location in a non-macro context. The only way to get GCC to do that, right
now, is with -fdump-internal-locations, which is what the new test case
does. That just unfortunately generates a larger amount of output. I attached
it, in case that's still helpful, for the following program:

=
_Pragma("GCC diagnostic push")
=

I guess there's potentially already a problem here because 'python -m
json.tool' is unhappy with this output and refuses to process it:

=
Invalid \escape: line 1 column 3436 (char 3435)
=

The related text is:
=
{"location": {"uri": "", "uriBaseId": "PWD"},
"contents":{"text": "GCC diagnostic push\n\0"}
=

And the \0 is not allowed it seems?
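
For reference, JSON string syntax (RFC 8259) has no \0 escape: control characters below U+0020 have to be written either with one of the short escapes (\b, \f, \n, \r, \t) or as \uXXXX. A minimal sketch of an escaper that follows that rule is below. json_escape is a hypothetical helper written for this illustration and is not taken from GCC's JSON writer:

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper for illustration; not GCC's JSON output code.
// Escape TEXT so it can be placed between double quotes in a JSON document.
std::string json_escape (std::string_view text)
{
  std::string out;
  for (unsigned char ch : text)
    switch (ch)
      {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\b': out += "\\b"; break;
      case '\f': out += "\\f"; break;
      case '\n': out += "\\n"; break;
      case '\r': out += "\\r"; break;
      case '\t': out += "\\t"; break;
      default:
        if (ch < 0x20)
          {
            // Remaining control characters, including NUL, need \uXXXX.
            char buf[8];
            std::snprintf (buf, sizeof buf, "\\u%04x",
                           static_cast<unsigned> (ch));
            out += buf;
          }
        else
          out += static_cast<char> (ch);
      }
  return out;
}
```

With an escaper of this kind the trailing NUL in the buffer would come out as \u0000. Whether the NUL should be part of the artifact contents at all is a separate question.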

I also attached the output of 'python -m json.tool' anyway, after manually
removing the \0.

Is it better to just skip these locations for now?

-Lewis
{"$schema": 
"https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
 "version": "2.1.0", "runs": [{"tool": {"driver": {"name": "GNU C++17", 
"fullName": "GNU C++17 (GCC) version 14.0.0 20230811 (experimental) 
(x86_64-pc-linux-gnu)", "version": "14.0.0 20230811 (experimental)", 
"informationUri": "https://gcc.gnu.org/gcc-14/", "rules": []}}, "invocations": 
[{"executionSuccessful": true, "toolExecutionNotifications": []}], 
"originalUriBaseIds": {"PWD": {"uri": "file:///home/lewis/"}}, "artifacts": 
[{"location": {"uri": "t.cpp", "uriBaseId": "PWD"}, "contents": {"text": 
"_Pragma(\"GCC diagnostic push\")\n"}, "sourceLanguage": "cplusplus"}, 
{"location": {"uri": "/usr/include/stdc-predef.h"}, "contents": {"text": "/* 
Copyright (C) 1991-2022 Free Software 

Re: [PATCH v4 2/8] libcpp: diagnostics: Support generated data in expanded locations

2023-08-14 Thread Lewis Hyatt via Gcc-patches
On Fri, Aug 11, 2023 at 07:02:49PM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > The previous patch in this series introduced the concept of LC_GEN line
> > maps. This patch continues on the path to using them to improve _Pragma
> > diagnostics, by adding a new source_id SRC member to struct
> > expanded_location, which is populated by linemap_expand_location. This
> > member allows call sites to detect and handle when a location refers to
> > generated data rather than a plain file name.
> > 
> > The previous FILE member of expanded_location is preserved (although
> > redundant with SRC), so that call sites which do not and never will care
> > about generated data do not need to be concerned about it. Call sites that
> > will care are modified here, to use SRC rather than FILE for comparing
> > locations.
> 
> Thanks; this seems like a good approach.
> 
> 
> [...snip...]
> 
> > diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
> > index 6f5bc6b9d8f..15052aec417 100644
> > --- a/gcc/edit-context.cc
> > +++ b/gcc/edit-context.cc
> > @@ -295,7 +295,7 @@ edit_context::apply_fixit (const fixit_hint *hint)
> >  {
> >expanded_location start = expand_location (hint->get_start_loc ());
> >expanded_location next_loc = expand_location (hint->get_next_loc ());
> > -  if (start.file != next_loc.file)
> > +  if (start.src != next_loc.src || start.src.is_buffer ())
> >  return false;
> >if (start.line != next_loc.line)
> >  return false;
> 
> Thinking about fix-it hints, it makes sense to reject attempts to
> create fix-it hints within generated strings, as we can't apply them or
> visualize them.
> 
> Does anywhere in the patch kit do that?  Either of 
>   rich_location::maybe_add_fixit
> or
>   rich_location::reject_impossible_fixit
> would be good places to do that.
>

So rich_location::reject_impossible_fixit does reject them for _Pragmas now,
because what the frontend sees and passes to it is a virtual location, and it
always rejects virtual locations. But it doesn't reject arbitrary generated
data locations that may be created in an ordinary non-virtual location. I
think it's this one-line change to reject those:

-- >8 --

diff --git a/libcpp/line-map.cc b/libcpp/line-map.cc
index 835e8e1b8cd..382594637ad 100644
--- a/libcpp/line-map.cc
+++ b/libcpp/line-map.cc
@@ -2545,7 +2545,8 @@ rich_location::maybe_add_fixit (location_t start,
 = linemap_client_expand_location_to_spelling_point (next_loc,
LOCATION_ASPECT_START);
   /* They must be within the same file...  */
-  if (exploc_start.src != exploc_next_loc.src)
+  if (exploc_start.src != exploc_next_loc.src
+  || exploc_start.src.is_buffer ())
 {
   stop_supporting_fixits ();
   return;

-- >8 --
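
The condition in the hunk above can be looked at in isolation with a toy model. Everything in this sketch is an assumption made for illustration: toy_source_id only imitates the file-name-or-buffer idea of source_id (a pointer plus a length that is zero for file names), and fixit_allowed_p is a hypothetical helper, not a function in libcpp:

```cpp
#include <cstring>

// Toy stand-in for the source_id idea; not the class from the patch.
class toy_source_id
{
public:
  toy_source_id () = default;

  // A file name: the length stays zero.
  explicit toy_source_id (const char *filename) : m_ptr (filename) {}

  // A generated buffer: pointer plus nonzero length.
  toy_source_id (const char *data, unsigned len) : m_ptr (data), m_len (len) {}

  explicit operator bool () const { return m_ptr != nullptr; }
  bool is_buffer () const { return m_len != 0; }

  friend bool operator== (const toy_source_id &x, const toy_source_id &y)
  {
    if (x.m_len != y.m_len)
      return false;
    // Buffers are the same only if they are the same memory.
    if (x.is_buffer ())
      return x.m_ptr == y.m_ptr;
    if (!x.m_ptr || !y.m_ptr)
      return x.m_ptr == y.m_ptr;
    // Assumption: file names are compared by content.
    return std::strcmp (x.m_ptr, y.m_ptr) == 0;
  }

  friend bool operator!= (const toy_source_id &x, const toy_source_id &y)
  {
    return !(x == y);
  }

private:
  const char *m_ptr = nullptr;
  unsigned m_len = 0;
};

// Mirror of the check in the hunk: both ends must be in the same source,
// and that source must be a real file rather than generated data.
bool fixit_allowed_p (const toy_source_id &start, const toy_source_id &next)
{
  if (start != next || start.is_buffer ())
    return false;
  return true;
}
```

The second half of the condition is what the one-line change adds: two locations in the same generated buffer compare equal, so without it they would pass the same-source test.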

However, there are many selftests in diagnostic-show-locus.cc that actually
verify we generate the fixit hints for generated data, so I would need also to
change those to skip the test in this case as well. That looks like this:

-- >8 --

diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
index 62c60645e88..884c55e91e9 100644
--- a/gcc/diagnostic-show-locus.cc
+++ b/gcc/diagnostic-show-locus.cc
@@ -3824,6 +3824,8 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   test_one_liner_simple_caret ();
   test_one_liner_caret_and_range ();
   test_one_liner_multiple_carets_and_ranges ();
+  if (!ltt.m_generated_data)
+{
   test_one_liner_fixit_insert_before ();
   test_one_liner_fixit_insert_after ();
   test_one_liner_fixit_remove ();
@@ -3835,6 +3837,7 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   test_one_liner_many_fixits_2 ();
   test_one_liner_labels ();
 }
+}

 /* Version of all one-liner tests exercising multibyte awareness.  For
simplicity we stick to using two multibyte characters in the test, U+1F602
@@ -4419,6 +4422,8 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_simple_caret_utf8 ();
   test_one_liner_caret_and_range_utf8 ();
   test_one_liner_multiple_carets_and_ranges_utf8 ();
+  if (!ltt.m_generated_data)
+{
   test_one_liner_fixit_insert_before_utf8 ();
   test_one_liner_fixit_insert_after_utf8 ();
   test_one_liner_fixit_remove_utf8 ();
@@ -4428,6 +4433,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_fixit_validation_adhoc_locations_utf8 ();
   test_one_liner_many_fixits_1_utf8 ();
   test_one_liner_many_fixits_2_utf8 ();
+}
   test_one_liner_labels_utf8 ();
   test_one_liner_colorized_utf8 ();
 }
@@ -5726,15 +5732,15 @@ diagnostic_show_locus_cc_tests ()
   for_each_line_table_cas

Re: [PATCH v4 1/8] libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-08-13 Thread Lewis Hyatt via Gcc-patches
On Fri, Aug 11, 2023 at 06:45:31PM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> 
> Hi Lewis, thanks for the patch...
> 
> > Add a new linemap reason LC_GEN which enables encoding the location of data
> > that was generated during compilation and does not appear in any source 
> > file.
> > There could be many use cases, such as, for instance, referring to the 
> > content
> > of builtin macros (not yet implemented, but an easy lift after this one.) 
> > The
> > first intended application is to create a place to store the input to a
> > _Pragma directive, so that proper locations can be assigned to those
> > tokens. This will be done in a subsequent commit.
> > 
> > The TO_FILE member of struct line_map_ordinary has been changed to a union
> > named SRC which can be either a file name, or a pointer to a line_map_data
> > struct describing the data. There is no space overhead added to the line
> > maps data structures.
> > 
> > Outside libcpp, this patch includes only the minimal changes implied by the
> > adjustment from TO_FILE to SRC in struct line_map_ordinary. Subsequent
> > patches will implement the new functionality.
> > 
> > libcpp/ChangeLog:
> > 
> > * include/line-map.h (enum lc_reason): Add LC_GEN.
> > (struct line_map_data): New struct.
> > (struct line_map_ordinary): Change TO_FILE from a char* to a union,
> > and rename to SRC.
> > (class source_id): New class.
> > (ORDINARY_MAP_GENERATED_DATA_P): New function.
> > (ORDINARY_MAP_GENERATED_DATA): New function.
> > (ORDINARY_MAP_GENERATED_DATA_LEN): New function.
> > (ORDINARY_MAP_SOURCE_ID): New function.
> > (ORDINARY_MAPS_SAME_FILE_P): New function.
> > (ORDINARY_MAP_CONTAINING_FILE_NAME): Declare.
> > (LINEMAP_FILE): Adapt to struct line_map_ordinary change.
> > (linemap_get_file_highest_location): Likewise.
> > * line-map.cc (source_id::operator==): New function.
> > (ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
> > (linemap_add): Support creating LC_GEN maps.
> > (linemap_line_start): Support LC_GEN maps.
> > (linemap_check_files_exited): Likewise.
> > (linemap_position_for_loc_and_offset): Likewise.
> > (linemap_get_expansion_filename): Likewise.
> > (linemap_dump): Likewise.
> > (linemap_dump_location): Likewise.
> > (linemap_get_file_highest_location): Likewise.
> > * directives.cc (_cpp_do_file_change): Likewise.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c-common.cc (try_to_locate_new_include_insertion_point): Recognize
> > and ignore LC_GEN maps.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * module.cc (module_state::write_ordinary_maps): Recognize and
> > ignore LC_GEN maps, and adapt to interface change in struct
> > line_map_ordinary.
> > (module_state::read_ordinary_maps): Likewise.
> > 
> > gcc/ChangeLog:
> > 
> > * diagnostic-show-locus.cc (compatible_locations_p): Adapt to
> > interface change in struct line_map_ordinary.
> > * input.cc (special_fname_generated): New function.
> > (dump_location_info): Support LC_GEN maps.
> > (get_substring_ranges_for_loc): Adapt to interface change in struct
> > line_map_ordinary.
> > * input.h (special_fname_generated): Declare.
> > 
> > gcc/go/ChangeLog:
> > 
> > * go-linemap.cc (Gcc_linemap::to_string): Recognize and ignore
> > LC_GEN maps.
> > ---
> >  gcc/c-family/c-common.cc |  11 ++-
> >  gcc/cp/module.cc |   8 +-
> >  gcc/diagnostic-show-locus.cc |   2 +-
> >  gcc/go/go-linemap.cc |   3 +-
> >  gcc/input.cc |  27 +-
> >  gcc/input.h  |   1 +
> >  libcpp/directives.cc |   4 +-
> >  libcpp/include/line-map.h    | 144 
> >  libcpp/line-map.cc   | 181 +--
> >  9 files changed, 299 insertions(+), 82 deletions(-)
> 
> [...snip...]
> 
> > 
> > diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
> > index 0514815b51f..a2aa6b4e0b5 100644
> > --- a/gcc/diagnostic-show-locus.cc
> > +++ b/gcc/diagnostic-show-locus.cc
> > @@ -998,7 +998,7 @@ compatible_locations_p (location_t loc_a, location_t 
> > loc_b)
> >  are in the 

[PATCH v4 7/8] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Currently, the tokens obtained from a destringified _Pragma string do not get
assigned proper locations while they are being lexed.  After the tokens have
been obtained, they are reassigned the same location as the _Pragma token,
which is sufficient to make things like _Pragma("GCC diagnostic ignored...")
operate correctly, but this still results in inferior diagnostics, since the
diagnostics do not point to the problematic tokens.  Further, if a diagnostic
is issued by libcpp during the lexing of the tokens, as opposed to being
issued by the frontend during the processing of the pragma, then the
patched-up location is not yet in place, and the user rather sees an invalid
location that is near to the location of the _Pragma string in some cases, or
potentially very far away, depending on the macro expansion history.  For
example:

=
_Pragma("GCC diagnostic ignored \"oops")
=

produces the diagnostic:

file.cpp:1:24: warning: missing terminating " character
1 | _Pragma("GCC diagnostic ignored \"oops")
  |^

with the caret in a nonsensical location, while this one:

=
 #define S "GCC diagnostic ignored \"oops"
_Pragma(S)
=

produces:

file.cpp:2:24: warning: missing terminating " character
2 | _Pragma(S)
  |^

with both the caret in a nonsensical location, and the actual relevant context
completely absent.

Fix this by assigning proper locations using the new LC_GEN type of linemap.
Now the tokens are given locations inside a generated content buffer, and the
macro expansion stack is modified to be aware that these tokens logically
belong to the "expansion" of the _Pragma directive. For the above examples we
now output:

==
In buffer generated from file.cpp:1:
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:1:1: note: in <_Pragma directive>
1 | _Pragma("GCC diagnostic ignored \"oops")
  | ^~~
==

and

==
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:2:1: note: in <_Pragma directive>
2 | _Pragma(S)
  | ^~~
==

So that carets are pointing to something meaningful and all relevant context
appears in the diagnostic.  For the second example, it would be nice if the
macro expansion also output "in expansion of macro S", however doing that for
a general case of macro expansions makes the logic very complicated, since it
has to be done after the fact when the macro maps have already been
constructed.  It doesn't seem worth it for this case, given that the _Pragma
string has already been output once on the first line.
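
For readers unfamiliar with the term, destringizing is the step defined by the C standard for _Pragma: the surrounding double quotes are removed, each \" becomes a double quote, and each \\ becomes a single backslash. The sketch below shows only that transformation. destringize is a hypothetical stand-alone helper and does not reflect how destringize_and_run is written; in particular it ignores encoding prefixes and raw strings:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper for illustration; not the libcpp implementation.
// LIT is the spelling of an ordinary string literal, quotes included.
std::string destringize (std::string_view lit)
{
  // Drop the surrounding double quotes if present.
  if (lit.size () >= 2 && lit.front () == '"' && lit.back () == '"')
    lit = lit.substr (1, lit.size () - 2);

  std::string out;
  for (std::size_t i = 0; i < lit.size (); ++i)
    {
      // \" becomes " and \\ becomes \; anything else is copied as-is.
      if (lit[i] == '\\' && i + 1 < lit.size ()
          && (lit[i + 1] == '"' || lit[i + 1] == '\\'))
        ++i;
      out += lit[i];
    }
  return out;
}
```

Applied to the first example, the result is the text shown in the new diagnostic, GCC diagnostic ignored "oops, where the unterminated quote sits at column 24 of the generated buffer.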

gcc/ChangeLog:

* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Add awareness
of _Pragma directive to the macro expansion trace.

libcpp/ChangeLog:

* directives.cc (get_token_no_padding): Add argument to receive the
virtual location of the token.
(get__Pragma_string): Likewise.
(do_pragma): Set pfile->directive_result->src_loc properly, it should
not be a virtual location.
(destringize_and_run): Update to provide proper locations for the
_Pragma string tokens.  Support raw strings.
(_cpp_do__Pragma): Adapt to changes to the helper functions.
* errors.cc (cpp_diagnostic_at): Support
cpp_reader::diagnostic_rebase_loc.
(cpp_diagnostic_with_line): Likewise.
* include/line-map.h (class rich_location): Add new member
forget_cached_expanded_locations().
* internal.h (struct _cpp__Pragma_state): Define new struct.
(_cpp_rebase_diagnostic_location): Declare new function.
(struct cpp_reader): Add diagnostic_rebase_loc member.
(_cpp_push__Pragma_token_context): Declare new function.
(_cpp_do__Pragma): Adjust prototype.
* macro.cc (pragma_str): New static var.
(builtin_macro): Adapt to new implementation of _Pragma processing.
(_cpp_pop_context): Fix the logic for resetting
pfile->top_most_macro_node, which previously was never triggered,
although the error seems to have been harmless.
(_cpp_push__Pragma_token_context): New function.
(_cpp_rebase_diagnostic_location): New function.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (token_streamer::stream): Pass the virtual location of
the _Pragma token to maybe_print_line(), not the spelling location.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adjust for new
macro tracking output for _Pragma directives.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/diagnostic-pragma-1.c: Adjust for new macro
tracking output for _Pragma directives.
* c-c++-common/cpp/pr57580.c: Likewise.
* c-c++-common/gomp/pragma-3.c: Likewise.

[PATCH v4 6/8] diagnostics: Full support for generated data locations

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Previous patches in this series have laid the groundwork for supporting
source code locations in memory ("generated data") rather than ordinary
files. This patch completes the support by adding awareness of such
locations to all places that need to support them. The main changes are to
diagnostic-show-locus.cc; the others are primarily small tweaks such as
changing from the FILE to the SRC member when inspecting an
expanded_location.

gcc/c-family/ChangeLog:

* c-format.cc (get_corrected_substring): Use the new overload of
location_get_source_line() to support generated data.
* c-indentation.cc (get_visual_column): Likewise.
(get_first_nws_vis_column): Change argument from a plain file name
to a source_id.
(detect_intervening_unindent): Likewise.
(should_warn_for_misleading_indentation): Pass
detect_intervening_unindent() the SRC field rather than the FILE
field from the expanded_location.

gcc/ChangeLog:

* gcc-rich-location.cc (blank_line_before_p): Use the new overload
of location_get_source_line() to support generated data.
* input.cc (get_source_text_between): Likewise.
(get_substring_ranges_for_loc): Likewise.
(get_source_file_content): Change the argument from a plain filename
to a source_id.
(location_missing_trailing_newline): Likewise.
* input.h (get_source_file_content): Adjust prototype.
(location_missing_trailing_newline): Likewise.
* diagnostic-show-locus.cc (layout::calculate_x_offset_display): Use
the new overload of location_get_source_line() to support generated
data.
(layout::print_line): Likewise.
(class line_corrections): Change m_filename from a plain filename to
a source_id.
(source_line::source_line): Change argument from a plain filename to
a source_id.
(line_corrections::add_hint): Adapt to source_line change.
(layout::print_trailing_fixits): Adapt to line_corrections change.
(test_layout_x_offset_display_utf8): Test generated data too.
(test_layout_x_offset_display_tab): Likewise.
(test_diagnostic_show_locus_one_liner): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Likewise.
(test_add_location_if_nearby): Likewise.
(test_diagnostic_show_locus_fixit_lines): Likewise.
(test_fixit_consolidation): Likewise.
(test_overlapped_fixit_printing): Likewise.
(test_overlapped_fixit_printing_utf8): Likewise.
(test_overlapped_fixit_printing_2): Likewise.
(test_fixit_insert_containing_newline): Likewise.
(test_fixit_insert_containing_newline_2): Likewise.
(test_fixit_replace_containing_newline): Likewise.
(test_fixit_deletion_affecting_newline): Likewise.
(test_tab_expansion): Likewise.
(test_escaping_bytes_1): Likewise.
(test_escaping_bytes_2): Likewise.
(test_line_numbers_multiline_range): Likewise.
(diagnostic_show_locus_cc_tests): Likewise.
---
 gcc/c-family/c-format.cc  |   2 +-
 gcc/c-family/c-indentation.cc |   8 +-
 gcc/diagnostic-show-locus.cc  | 227 ++
 gcc/gcc-rich-location.cc  |   2 +-
 gcc/input.cc  |  21 ++--
 gcc/input.h   |   6 +-
 6 files changed, 136 insertions(+), 130 deletions(-)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index 529b1408179..929ec24622c 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4537,7 +4537,7 @@ get_corrected_substring (const substring_loc &fmt_loc,
   if (caret.column > finish.column)
 return NULL;
 
-  char_span line = location_get_source_line (start.file, start.line);
+  char_span line = location_get_source_line (start);
   if (!line)
 return NULL;
 
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index fce74991aae..27a90d9cc15 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
   unsigned int *first_nws,
   unsigned int tab_width)
 {
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
 return false;
   if ((size_t)exploc.column > line.length ())
@@ -87,7 +87,7 @@ get_visual_column (expanded_location exploc,
Otherwise, return false, leaving *FIRST_NWS untouched.  */
 
 static bool
-get_first_nws_vis_column (const char *file, int line_num,
+get_first_nws_vis_column (source_id file, int line_num,
  unsigned int *first_nws,
  unsigned int tab_width)
 {
@@ -158,7 +158,7 @@ get_first_nws_vis_column (const char *file, int line_num,
Return true if such an unindent/outdent is detected.  */
 
 static bool
-detect_intervening_unindent (const char *file,

[PATCH v4 5/8] diagnostics: Support testing generated data in input.cc selftests

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Add selftests for the new capabilities in input.cc related to source code
locations that are stored in memory rather than ordinary files.
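
For illustration only (this is not code from the patch): the idea of
optionally widening the set of selftest cases with a generated-data variant
can be sketched as below. The names case_sketch and for_each_case_sketch, and
the particular range-bit and base-location values, are hypothetical stand-ins
for line_table_case and for_each_line_table_case.

```cpp
#include <cassert>

// Hypothetical analogue of line_table_case, with the new flag.
struct case_sketch
{
  int default_range_bits;
  int base_location;
  bool generated_data;
};

// Run TESTCASE once per combination of settings.  When INCLUDE_GENERATED
// is true, each combination is run a second time with generated_data set,
// so tests that understand in-memory buffers cover both kinds of source.
template <typename Fn>
void
for_each_case_sketch (Fn testcase, bool include_generated)
{
  const int range_bits[] = { 0, 5 };          // illustrative values only
  const int bases[] = { 0, 0x50000000 };      // illustrative values only
  for (int rb : range_bits)
    for (int base : bases)
      {
        testcase (case_sketch{ rb, base, false });
        if (include_generated)
          testcase (case_sketch{ rb, base, true });
      }
}
```

A test that cannot yet handle generated data would pass false for the second
argument and see only the file-based cases.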

gcc/ChangeLog:

* input.cc (temp_source_file::do_linemap_add): New function.
(line_table_case::line_table_case): Add GENERATED_DATA argument.
(line_table_test::line_table_test): Implement new M_GENERATED_DATA
argument.
(for_each_line_table_case): Optionally include generated data
locations in the set of cases.
(test_accessing_ordinary_linemaps): Test generated data locations.
(test_make_location_nonpure_range_endpoints): Likewise.
(test_line_offset_overflow): Likewise.
(input_cc_tests): Likewise.
* selftest.cc (named_temp_file::named_temp_file): Interpret a null
SUFFIX argument as a request to use in-memory data.
(named_temp_file::~named_temp_file): Support in-memory data.
(temp_source_file::temp_source_file): Likewise.
(temp_source_file::~temp_source_file): Likewise.
* selftest.h (struct line_map_ordinary): Forward declare.
(class named_temp_file): Add missing explicit to the constructor.
(class temp_source_file): Add new members to support in-memory data.
(class line_table_test): Likewise.
(for_each_line_table_case): Adjust prototype.
---
 gcc/input.cc| 81 +
 gcc/selftest.cc | 53 +---
 gcc/selftest.h  | 19 ++--
 3 files changed, 113 insertions(+), 40 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index 790279d4273..8c4e40aaf23 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -2066,6 +2066,20 @@ get_num_source_ranges_for_substring (cpp_reader *pfile,
 
 /* Selftests of location handling.  */
 
+/* Wrapper around linemap_add to handle transparently adding either a tmp file,
+   or in-memory generated content.  */
+const line_map_ordinary *
+temp_source_file::do_linemap_add (int line)
+{
+  const line_map *map;
+  if (content_buf)
+map = linemap_add (line_table, LC_GEN, false, content_buf,
+  line, content_len);
+  else
+map = linemap_add (line_table, LC_ENTER, false, get_filename (), line);
+  return linemap_check_ordinary (map);
+}
+
 /* Verify that compare() on linenum_type handles comparisons over the full
range of the type.  */
 
@@ -2144,13 +2158,16 @@ assert_loceq (const char *exp_filename, int exp_linenum, int exp_colnum,
 class line_table_case
 {
 public:
-  line_table_case (int default_range_bits, int base_location)
+  line_table_case (int default_range_bits, int base_location,
+  bool generated_data)
   : m_default_range_bits (default_range_bits),
-m_base_location (base_location)
+m_base_location (base_location),
+m_generated_data (generated_data)
   {}
 
   int m_default_range_bits;
   int m_base_location;
+  bool m_generated_data;
 };
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2167,6 +2184,7 @@ line_table_test::line_table_test ()
   gcc_assert (saved_line_table->round_alloc_size);
   line_table->round_alloc_size = saved_line_table->round_alloc_size;
   line_table->default_range_bits = 0;
+  m_generated_data = false;
 }
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2188,6 +2206,7 @@ line_table_test::line_table_test (const line_table_case &case_)
   line_table->highest_location = case_.m_base_location;
   line_table->highest_line = case_.m_base_location;
 }
+  m_generated_data = case_.m_generated_data;
 }
 
 /* Destructor.  Restore the old value of line_table.  */
@@ -2207,7 +2226,10 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   line_table_test ltt (case_);
 
   /* Build a simple linemap describing some locations. */
-  linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
+  if (ltt.m_generated_data)
+linemap_add (line_table, LC_GEN, false, "some data", 0, 10);
+  else
+linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
 
   linemap_line_start (line_table, 1, 100);
   location_t loc_a = linemap_position_for_column (line_table, 1);
@@ -2257,21 +2279,23 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   linemap_add (line_table, LC_LEAVE, false, NULL, 0);
 
   /* Verify that we can recover the location info.  */
-  assert_loceq ("foo.c", 1, 1, loc_a);
-  assert_loceq ("foo.c", 1, 23, loc_b);
-  assert_loceq ("foo.c", 2, 1, loc_c);
-  assert_loceq ("foo.c", 2, 17, loc_d);
-  assert_loceq ("foo.c", 3, 700, loc_e);
-  assert_loceq ("foo.c", 4, 100, loc_back_to_short);
+  const auto fname
+= (ltt.m_generated_data ? special_fname_generated () : "foo.c");
+  assert_loceq (fname, 1, 1, loc_a);
+  assert_loceq (fname, 1, 23, loc_b);
+  assert_loceq (fname, 2, 1, loc_c);
+  assert_loceq (fname, 2, 17, loc_d);
+  assert_loceq (fname, 3, 700, loc_e);
+  assert_loceq (fname, 4, 100, loc_back_to_short);
 
   /* In the very wide line, the 

[PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Class file_cache_slot in input.cc is used to query specific lines of source
code from a file when needed by diagnostics infrastructure. This will be
extended in a subsequent patch to support obtaining the source code from
in-memory generated buffers rather than from a file. The present patch
refactors class file_cache_slot, putting most of the logic into a new base
class cache_data_source, in preparation for reusing that code in the next
patch. There is no change in functionality yet.
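
For illustration only (not code from the patch): a minimal sketch of the
design described above, in which line lookup lives in a base class and the
derived class is consulted through a virtual call only when the buffered data
runs out. All names ending in _sketch are hypothetical; the real classes are
cache_data_source and file_cache_slot, and they differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical, simplified analogue of cache_data_source.
class line_source_sketch
{
public:
  virtual ~line_source_sketch () = default;

  // Store the next line (without its newline) in *OUT.
  // Return false once all data has been consumed.
  bool get_next_line (std::string *out)
  {
    assert (out);
    for (;;)
      {
        const char *start = m_data_begin + m_pos;
        const size_t avail = m_data_end - start;
        const void *nl = avail ? std::memchr (start, '\n', avail) : nullptr;
        if (nl)
          {
            const size_t len = static_cast<const char *> (nl) - start;
            out->assign (start, len);
            m_pos += len + 1;
            return true;
          }
        // No complete line is buffered; only now ask the derived class.
        if (get_more_data ())
          continue;
        if (avail == 0)
          return false;
        out->assign (start, avail);  // last line, missing trailing newline
        m_pos += avail;
        return true;
      }
  }

protected:
  // Delimit the data obtained so far; maintained by derived classes.
  const char *m_data_begin = nullptr;
  const char *m_data_end = nullptr;
  virtual bool get_more_data () = 0;

private:
  size_t m_pos = 0;  // an offset stays valid if the buffer is reallocated
};

// Data that is already fully in memory: there is never more to fetch.
class memory_source_sketch final : public line_source_sketch
{
public:
  memory_source_sketch (const char *data, size_t len)
  {
    m_data_begin = data;
    m_data_end = data + len;
  }

protected:
  bool get_more_data () override { return false; }
};

// Stand-in for a file: reveals the content CHUNK bytes at a time.
class chunked_source_sketch final : public line_source_sketch
{
public:
  chunked_source_sketch (std::string all, size_t chunk)
    : m_all (std::move (all)), m_chunk (chunk ? chunk : 1)
  {
    m_data_begin = m_data_end = m_all.data ();
  }

protected:
  bool get_more_data () override
  {
    const size_t have = m_data_end - m_data_begin;
    if (have == m_all.size ())
      return false;
    m_data_end += std::min (m_chunk, m_all.size () - have);
    return true;
  }

private:
  std::string m_all;
  size_t m_chunk;
};
```

Both derived classes share get_next_line () unchanged, which is the point of
moving the logic into the base class.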

gcc/ChangeLog:

* input.cc (class file_cache_slot): Refactor functionality into a
new base class...
(class cache_data_source): ...here.
(file_cache::forcibly_evict_file): Adapt for refactoring.
(file_cache_slot::evict): Renamed to...
(file_cache_slot::reset): ...this, and partially refactored into
base class...
(cache_data_source::reset): ...here.
(file_cache_slot::get_full_file_content): Moved into base class...
(cache_data_source::get_full_file_content): ...here.
(file_cache_slot::create): Adapt for refactoring.
(file_cache_slot::file_cache_slot): Refactor partially into...
(cache_data_source::cache_data_source): ...here.
(file_cache_slot::~file_cache_slot): Refactor partially into...
(cache_data_source::~cache_data_source): ...here.
(file_cache_slot::needs_read_p): Remove.
(file_cache_slot::needs_grow_p): Remove.
(file_cache_slot::maybe_grow): Adapt for refactoring.
(file_cache_slot::read_data): Refactored, along with...
(file_cache_slot::maybe_read_data): ...this, into...
(file_cache_slot::get_more_data): ...here.
(find_end_of_line): Change interface to take a pair of pointers,
rather than a pointer + length.
(file_cache_slot::get_next_line): Refactored into...
(cache_data_source::get_next_line): ...here.
(file_cache_slot::goto_next_line): Refactored into...
(cache_data_source::goto_next_line): ...here.
(file_cache_slot::read_line_num): Refactored into...
(cache_data_source::read_line_num): ...here.
(location_get_source_line): Fix const-correctness as necessitated by
new interface.
---
 gcc/input.cc | 513 +++
 1 file changed, 235 insertions(+), 278 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index c2559614a99..9377020b460 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -55,34 +55,88 @@ file_cache::initialize_input_context (diagnostic_input_charset_callback ccb,
   in_context.should_skip_bom = should_skip_bom;
 }
 
-/* This is a cache used by get_next_line to store the content of a
-   file to be searched for file lines.  */
-class file_cache_slot
+/* This is an abstract interface for a class that provides data which we want to
+   look up by line number.  Concrete implementations will follow, which handle
+   the cases of reading the data from the input source files, or of reading it
+   from in-memory generated data buffers.  The design is driven with reading
+   from files in mind, in particular it is desirable to read only as much of a
+   file from disk as necessary.  It works like a simplified std::istream, i.e.
+   virtual function calls are only needed when we need to retrieve more data
+   from the underlying source.  */
+
+class cache_data_source
 {
-public:
-  file_cache_slot ();
-  ~file_cache_slot ();
 
-  bool read_line_num (size_t line_num,
- char ** line, ssize_t *line_len);
-
-  /* Accessors.  */
-  const char *get_file_path () const { return m_file_path; }
+public:
+  bool read_line_num (size_t line_num, const char **line, ssize_t *line_len);
   unsigned get_use_count () const { return m_use_count; }
+  void inc_use_count () { m_use_count++; }
+  bool get_next_line (const char **line, ssize_t *line_len);
+  bool goto_next_line ();
   bool missing_trailing_newline_p () const
   {
 return m_missing_trailing_newline;
   }
   char_span get_full_file_content ();
+  bool unused () const { return !m_data_begin; }
+  virtual void reset ();
+
+protected:
+  cache_data_source ();
+  virtual ~cache_data_source ();
+
+  /* These pointers delimit the data that we are processing.  They are
+ maintained by the derived classes, we only ask for more by calling
+ get_more_data().  That function should return TRUE if more data was
+ obtained.  Calling get_more_data () may invalidate these pointers
+ (i.e. reallocating them to a larger buffer).  */
+  const char *m_data_begin;
+  const char *m_data_end;
+  virtual bool get_more_data () = 0;
+
+  /* This is to be called by the derived classes when this object is
+ being activated.  */
+  void on_create (unsigned int use_count, size_t total_lines)
+  {
+m_use_count = use_count;
+m_total_lines = total_lines;
+  }
 
-  void inc_use_count () { m_use_count++; }
+private:
+  /* Non-copyable.  */
+  cache_data_source (const 

[PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

2023-08-09 Thread Lewis Hyatt via Gcc-patches
This patch enhances location_get_source_line(), which is the primary
interface provided by the diagnostics infrastructure to obtain the line of
source code corresponding to a given location, so that it understands
generated data locations in addition to normal file-based locations. This
involves changing the argument to location_get_source_line() from a plain
file name, to a source_id object that can represent either type of location.
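
For illustration only (not code from the patch): a sketch of a line lookup
that accepts either kind of source. Here src_sketch and
get_source_line_sketch are hypothetical names; the convention that a zero
length means "this is a file name" is an assumption borrowed from the
source_id proposal quoted elsewhere in this thread, and the real
location_get_source_line () goes through the file cache rather than
rescanning on every call.

```cpp
#include <cassert>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical stand-in for source_id: LEN == 0 means PTR is a file name,
// otherwise PTR points at LEN bytes of generated data.
struct src_sketch
{
  const char *ptr;
  unsigned int len;
};

// Store line LINE_NUM (1-based) of SRC in *OUT; return false if absent.
bool
get_source_line_sketch (const src_sketch &src, int line_num, std::string *out)
{
  assert (out);
  if (!src.ptr || line_num < 1)
    return false;

  if (src.len > 0)
    {
      // Generated data: scan the in-memory buffer directly.
      const char *p = src.ptr;
      const char *const end = src.ptr + src.len;
      for (int cur = 1; p < end; ++cur)
        {
          const char *nl
            = static_cast<const char *> (std::memchr (p, '\n', end - p));
          const char *line_end = nl ? nl : end;
          if (cur == line_num)
            {
              out->assign (p, line_end);
              return true;
            }
          p = nl ? nl + 1 : end;
        }
      return false;
    }

  // Ordinary file: read it line by line from disk.
  std::ifstream in (src.ptr);
  std::string text;
  for (int cur = 1; std::getline (in, text); ++cur)
    if (cur == line_num)
      {
        *out = text;
        return true;
      }
  return false;
}
```

Callers do not need to know which branch is taken, which is what lets
existing diagnostics code keep working once locations can point into
generated buffers.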

gcc/ChangeLog:

* input.cc (class data_cache_slot): New class.
(file_cache::lookup_data): New function.
(diagnostics_file_cache_forcibly_evict_data): New function.
(file_cache::forcibly_evict_data): New function.
(file_cache::evicted_cache_tab_entry): Generalize (via a template)
to work for both file_cache_slot and data_cache_slot.
(file_cache::add_file): Adapt for new interface to
evicted_cache_tab_entry.
(file_cache::add_data): New function.
(data_cache_slot::create): New function.
(file_cache::file_cache): Support the new m_data_slots member.
(file_cache::~file_cache): Likewise.
(file_cache::lookup_or_add_data): New function.
(file_cache::lookup_or_add): New function that calls either
lookup_or_add_data or lookup_or_add_file as appropriate.
(location_get_source_line): Change the FILE_PATH argument to a
source_id SRC, and use it to support obtaining source lines from
generated data as well as from files.
(location_compute_display_column): Support generated data using the
new features of location_get_source_line.
(dump_location_info): Likewise.
* input.h (location_get_source_line): Adjust prototype. Add a new
convenience overload taking an expanded_location.
(class cache_data_source): Declare.
(class data_cache_slot): Declare.
(class file_cache): Declare new members.
(diagnostics_file_cache_forcibly_evict_data): Declare.
---
 gcc/input.cc | 171 ---
 gcc/input.h  |  23 +--
 2 files changed, 153 insertions(+), 41 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index 9377020b460..790279d4273 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -207,6 +207,28 @@ private:
   void maybe_grow ();
 };
 
+/* This is the implementation of cache_data_source for generated
+   data that is already in memory.  */
+class data_cache_slot final : public cache_data_source
+{
+public:
+  void create (const char *data, unsigned int data_len,
+  unsigned int highest_use_count);
+  bool represents_data (const char *data, unsigned int) const
+  {
+/* We can just use pointer equality here since the generated data lives in
+   memory in one persistent place.  It isn't anticipated there would be
+   several generated data buffers with the same content, so we don't mind
+   that in such a case we will store it twice.  */
+return m_data_begin == data;
+  }
+
+protected:
+  /* In contrast to file_cache_slot, we do not own a buffer.  The buffer
+ passed to create() needs to outlive this object.  */
+  bool get_more_data () override { return false; }
+};
+
 /* Current position in real source file.  */
 
 location_t input_location = UNKNOWN_LOCATION;
@@ -382,6 +404,21 @@ file_cache::lookup_file (const char *file_path)
   return r;
 }
 
+data_cache_slot *
+file_cache::lookup_data (const char *data, unsigned int data_len)
+{
+  for (unsigned int i = 0; i != num_file_slots; ++i)
+{
+  const auto slot = m_data_slots + i;
+  if (slot->represents_data (data, data_len))
+   {
+ slot->inc_use_count ();
+ return slot;
+   }
+}
+  return nullptr;
+}
+
 /* Purge any mention of FILENAME from the cache of files used for
printing source code.  For use in selftests when working
with tempfiles.  */
@@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file (const char *file_path)
   global_dc->m_file_cache->forcibly_evict_file (file_path);
 }
 
+void
+diagnostics_file_cache_forcibly_evict_data (const char *data,
+   unsigned int data_len)
+{
+  if (!global_dc->m_file_cache)
+return;
+  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
+}
+
 void
 file_cache::forcibly_evict_file (const char *file_path)
 {
@@ -410,36 +456,36 @@ file_cache::forcibly_evict_file (const char *file_path)
   r->reset ();
 }
 
+void
+file_cache::forcibly_evict_data (const char *data, unsigned int data_len)
+{
+  if (auto r = lookup_data (data, data_len))
+r->reset ();
+}
+
 /* Return the cache that has been less used, recently, or the
first empty one.  If HIGHEST_USE_COUNT is non-null,
*HIGHEST_USE_COUNT is set to the highest use count of the entries
in the cache table.  */
 
-file_cache_slot*
-file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
+template 
+Slot *
+file_cache::evicted_cache_tab_entry (Slot 

[PATCH v4 8/8] diagnostics: Support generated data locations in SARIF output

2023-08-09 Thread Lewis Hyatt via Gcc-patches
The diagnostics routines for SARIF output need to read the source code back
in, so that they can generate "snippet" and "content" records, so they need to
be able to cope with generated data locations.  Add support for that in
diagnostic-format-sarif.cc.

gcc/ChangeLog:

* diagnostic-format-sarif.cc (class sarif_builder): Adapt interface
to support generated data locations.
(sarif_builder::maybe_make_physical_location_object): Change the
m_filenames hash_set to support generated data.
(sarif_builder::make_artifact_location_object): Use a source_id rather
than a plain file name.
(sarif_builder::maybe_make_region_object): Adapt to
expanded_location interface changes.
(sarif_builder::maybe_make_region_object_for_context): Likewise.
(sarif_builder::make_artifact_object): Likewise.
(sarif_builder::make_run_object): Handle generated data.
(sarif_builder::maybe_make_artifact_content_object): Likewise.
(get_source_lines): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/diagnostic-format-sarif-file-5.c: New test.
---
 gcc/diagnostic-format-sarif.cc| 88 +++
 .../diagnostic-format-sarif-file-5.c  | 31 +++
 2 files changed, 82 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 1eff71962d7..c7c0e5d4b0a 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -174,7 +174,7 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+  json::object *make_artifact_location_object (source_id src);
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -197,9 +197,9 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
+  json::object *make_artifact_object (source_id src);
+  json::object *maybe_make_artifact_content_object (source_id src) const;
+  json::object *maybe_make_artifact_content_object (source_id src,
int start_line,
int end_line) const;
   json::object *make_fix_object (const rich_location _loc);
@@ -220,7 +220,11 @@ private:
  diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set  m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+ with that length, not a filename.  */
+  hash_set ,
+  int_hash  >
+   > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set  m_rule_id_set;
   json::array *m_rules_arr;
@@ -787,7 +791,8 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  const auto src = LOCATION_SRC (loc);
+  m_filenames.add ({src.get_filename_or_buffer (), src.get_buffer_len ()});
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -811,7 +816,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (LOCATION_SRC (loc));
 }
 
 /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0 section 3.4.4)
@@ -823,10 +828,13 @@ sarif_builder::make_artifact_location_object (location_t loc)
or return NULL.  */
 
 json::object *
-sarif_builder::make_artifact_location_object (const char *filename)
+sarif_builder::make_artifact_location_object (source_id src)
 {
   json::object *artifact_loc_obj = new json::object ();
 
+  const auto filename = src.is_buffer ()
+? special_fname_generated () : src.get_filename_or_buffer ();
+
   /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
   artifact_loc_obj->set ("uri", new json::string (filename));
 
@@ -912,9 +920,9 @@ sarif_builder::maybe_make_region_object (location_t loc) 

[PATCH v4 1/8] libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, for instance referring to the content
of builtin macros (not yet implemented, but an easy lift after this one). The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The TO_FILE member of struct line_map_ordinary has been changed to a union
named SRC which can be either a file name, or a pointer to a line_map_data
struct describing the data. There is no space overhead added to the line
maps data structures.
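
For illustration only (not code from the patch): a sketch of why replacing a
file-name pointer with a union of two pointers adds no space. The types
gen_data_sketch and ordinary_map_sketch are hypothetical, simplified
stand-ins for line_map_data and line_map_ordinary, and the explicit
is_generated flag is an assumption made for the sketch; in the patch the
distinction comes from the map's LC_GEN reason.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for struct line_map_data.
struct gen_data_sketch
{
  const char *data;
  unsigned int len;
};

// Hypothetical, heavily simplified stand-in for struct line_map_ordinary.
struct ordinary_map_sketch
{
  bool is_generated;  // assumption: stands in for "reason == LC_GEN"
  union
  {
    const char *file;             // valid when !is_generated
    const gen_data_sketch *data;  // valid when is_generated
  } src;
};

// The union occupies one pointer, just as the old TO_FILE member did.
static_assert (sizeof (ordinary_map_sketch::src) == sizeof (const char *),
               "union of two object pointers is pointer-sized here");

// File name of the map, or null for generated data.
inline const char *
sketch_file_name (const ordinary_map_sketch &m)
{
  return m.is_generated ? nullptr : m.src.file;
}

// Length of the generated data, or 0 for an ordinary file.
inline unsigned int
sketch_generated_len (const ordinary_map_sketch &m)
{
  return m.is_generated ? m.src.data->len : 0;
}
```

Accessors of this shape are what let callers outside libcpp stop reading the
member directly, which is the role of the ORDINARY_MAP_* functions listed in
the ChangeLog.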

Outside libcpp, this patch includes only the minimal changes implied by the
adjustment from TO_FILE to SRC in struct line_map_ordinary. Subsequent
patches will implement the new functionality.

libcpp/ChangeLog:

* include/line-map.h (enum lc_reason): Add LC_GEN.
(struct line_map_data): New struct.
(struct line_map_ordinary): Change TO_FILE from a char* to a union,
and rename to SRC.
(class source_id): New class.
(ORDINARY_MAP_GENERATED_DATA_P): New function.
(ORDINARY_MAP_GENERATED_DATA): New function.
(ORDINARY_MAP_GENERATED_DATA_LEN): New function.
(ORDINARY_MAP_SOURCE_ID): New function.
(ORDINARY_MAPS_SAME_FILE_P): New function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): Declare.
(LINEMAP_FILE): Adapt to struct line_map_ordinary change.
(linemap_get_file_highest_location): Likewise.
* line-map.cc (source_id::operator==): New function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
(linemap_add): Support creating LC_GEN maps.
(linemap_line_start): Support LC_GEN maps.
(linemap_check_files_exited): Likewise.
(linemap_position_for_loc_and_offset): Likewise.
(linemap_get_expansion_filename): Likewise.
(linemap_dump): Likewise.
(linemap_dump_location): Likewise.
(linemap_get_file_highest_location): Likewise.
* directives.cc (_cpp_do_file_change): Likewise.

gcc/c-family/ChangeLog:

* c-common.cc (try_to_locate_new_include_insertion_point): Recognize
and ignore LC_GEN maps.

gcc/cp/ChangeLog:

* module.cc (module_state::write_ordinary_maps): Recognize and
ignore LC_GEN maps, and adapt to interface change in struct
line_map_ordinary.
(module_state::read_ordinary_maps): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (compatible_locations_p): Adapt to
interface change in struct line_map_ordinary.
* input.cc (special_fname_generated): New function.
(dump_location_info): Support LC_GEN maps.
(get_substring_ranges_for_loc): Adapt to interface change in struct
line_map_ordinary.
* input.h (special_fname_generated): Declare.

gcc/go/ChangeLog:

* go-linemap.cc (Gcc_linemap::to_string): Recognize and ignore
LC_GEN maps.
---
 gcc/c-family/c-common.cc |  11 ++-
 gcc/cp/module.cc |   8 +-
 gcc/diagnostic-show-locus.cc |   2 +-
 gcc/go/go-linemap.cc |   3 +-
 gcc/input.cc |  27 +-
 gcc/input.h  |   1 +
 libcpp/directives.cc |   4 +-
 libcpp/include/line-map.h| 144 
 libcpp/line-map.cc   | 181 +--
 9 files changed, 299 insertions(+), 82 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 9fbaeb437a1..ecfc2efc29f 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9206,19 +9206,22 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
   const line_map_ordinary *ord_map
= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
+  if (ORDINARY_MAP_GENERATED_DATA_P (ord_map))
+   continue;
+
   if (const line_map_ordinary *from
  = linemap_included_from_linemap (line_table, ord_map))
/* We cannot use pointer equality, because with preprocessed
   input all filename strings are unique.  */
-   if (0 == strcmp (from->to_file, file))
+   if (ORDINARY_MAP_SOURCE_ID (from) == file)
  {
last_include_ord_map = from;
last_ord_map_after_include = NULL;
  }
 
-  /* Likewise, use strcmp, and reject any line-zero introductory
-map.  */
-  if (ord_map->to_line && 0 == strcmp (ord_map->to_file, file))
+  /* Likewise, use strcmp (via the source_id comparison), and reject any
+line-zero introductory map.  */
+  if (ord_map->to_line && ORDINARY_MAP_SOURCE_ID (ord_map) == file)
{
  if (!first_ord_map_in_file)
first_ord_map_in_file = ord_map;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index ea362bdffa4..ff17cd57016 100644
--- 

[PATCH v4 2/8] libcpp: diagnostics: Support generated data in expanded locations

2023-08-09 Thread Lewis Hyatt via Gcc-patches
The previous patch in this series introduced the concept of LC_GEN line
maps. This patch continues on the path to using them to improve _Pragma
diagnostics, by adding a new source_id SRC member to struct
expanded_location, which is populated by linemap_expand_location. This
member allows call sites to detect and handle when a location refers to
generated data rather than a plain file name.

The previous FILE member of expanded_location is preserved (although
redundant with SRC), so that call sites which do not and never will care
about generated data do not need to be concerned about it. Call sites that
will care are modified here, to use SRC rather than FILE for comparing
locations.
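
For illustration only (not code from the patch): a sketch of comparing two
expanded locations through a source identifier rather than a raw file-name
pointer. The class source_id_sketch follows the first proposal quoted in the
cover-letter thread (a pointer plus a length, with 0 meaning "file name");
the comparison rules shown here (string comparison for file names, pointer
identity for buffers) are assumptions for the sketch, not a statement of
what source_id::operator== does in the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical analogue of class source_id.
class source_id_sketch
{
public:
  source_id_sketch () = default;

  source_id_sketch (const char *filename)
    : m_ptr (filename), m_len (0)
  {
  }

  source_id_sketch (const char *buffer, unsigned int buffer_len)
    : m_ptr (buffer), m_len (buffer_len)
  {
    assert (buffer_len > 0);
  }

  bool is_buffer () const { return m_len > 0; }

  bool operator== (const source_id_sketch &other) const
  {
    if (m_len != other.m_len)
      return false;
    if (is_buffer () || !m_ptr || !other.m_ptr)
      return m_ptr == other.m_ptr;   // buffers: identity, not content
    return std::strcmp (m_ptr, other.m_ptr) == 0;
  }

  bool operator!= (const source_id_sketch &other) const
  {
    return !(*this == other);
  }

private:
  const char *m_ptr = nullptr;
  unsigned int m_len = 0;  // 0 means "it's a file name"
};

// Hypothetical analogue of struct expanded_location with the new member.
struct exploc_sketch
{
  const char *file = nullptr;  // kept for callers that only want a name
  int line = 0;
  int column = 0;
  source_id_sketch src;
};

// "Same file" checks become "same source" checks.
inline bool
same_source_sketch (const exploc_sketch &a, const exploc_sketch &b)
{
  return a.src == b.src;
}
```

With this shape, a call site that previously compared the FILE members can
compare SRC instead and get a sensible answer for generated data as well.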

libcpp/ChangeLog:

* include/line-map.h (struct expanded_location): Add SRC member. Add
zero-initializers for all members, since source_id is not a POD
type.
(class fixit_hint): Adjust prototype.
* line-map.cc (linemap_expand_location): Populate the new SRC member
in the expanded_location.
(rich_location::maybe_add_fixit): Compare explocs with the new SRC
field instead of the FILE field.
(fixit_hint::affects_line_p): Accept a source_id instead of a file
name, and use it for the comparisons.

gcc/c-family/ChangeLog:

* c-format.cc (get_corrected_substring): Compare explocs with the
new SRC field instead of the FILE field.
* c-indentation.cc (should_warn_for_misleading_indentation): Likewise.
(assert_get_visual_column_succeeds): Initialize the SRC field in the
test expanded_location.
(assert_get_visual_column_fails): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (make_range): Adapt to the new
constructor semantics for struct expanded_location.
(layout::maybe_add_location_range): Compare explocs with the new SRC
field instead of the FILE field.
(layout::validate_fixit_hint_p): Likewise.
(layout::print_leading_fixits): Use the SRC field in struct
expanded_location to query fixit_hint::affects_line_p.
(layout::print_trailing_fixits): Likewise.
* diagnostic.cc (diagnostic_report_current_module): Use the new SRC
field in expanded_location to detect LC_GEN locations and identify
them as such.
(assert_location_text): Adapt to the new constructor semantics for
struct expanded_location.
* input.cc (expand_location_1): Likewise. And when libcpp's
linemap_expand_location returns a null FILE for generated data,
replace it with special_fname_generated ().
(total_lines_num): Handle a generic source_id argument rather than a
file name only.
(get_source_text_between): Compare explocs with the new SRC field
instead of the FILE field.
(get_substring_ranges_for_loc): Likewise.
* edit-context.cc (edit_context::apply_fixit): Ignore locations in
generated data.
* input.h (LOCATION_SRC): New accessor macro.
---
 gcc/c-family/c-format.cc  |  4 ++--
 gcc/c-family/c-indentation.cc | 10 +-
 gcc/diagnostic-show-locus.cc  | 30 +-
 gcc/diagnostic.cc | 19 ---
 gcc/edit-context.cc   |  2 +-
 gcc/input.cc  | 21 +++--
 gcc/input.h   |  1 +
 libcpp/include/line-map.h | 24 ++--
 libcpp/line-map.cc| 15 +++
 9 files changed, 70 insertions(+), 56 deletions(-)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index b4eeebcb30e..529b1408179 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4522,9 +4522,9 @@ get_corrected_substring (const substring_loc _loc,
 = expand_location_to_spelling_point (fmt_substring_range.m_start);
   expanded_location finish
 = expand_location_to_spelling_point (fmt_substring_range.m_finish);
-  if (caret.file != start.file)
+  if (caret.src != start.src)
 return NULL;
-  if (start.file != finish.file)
+  if (start.src != finish.src)
 return NULL;
   if (caret.line != start.line)
 return NULL;
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index e8d3dece770..fce74991aae 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -334,7 +334,7 @@ should_warn_for_misleading_indentation (const 
token_indent_info _tinfo,
   const unsigned int tab_width = global_dc->tabstop;
 
   /* They must be in the same file.  */
-  if (next_stmt_exploc.file != body_exploc.file)
+  if (next_stmt_exploc.src != body_exploc.src)
 return false;
 
   /* If NEXT_STMT_LOC and BODY_LOC are on the same line, consider
@@ -363,7 +363,7 @@ should_warn_for_misleading_indentation (const 
token_indent_info _tinfo,
   ^ DON'T WARN HERE.  */
   if (next_stmt_exploc.line == body_exploc.line)
 {
-  if (guard_exploc.file != 

[PATCH v4 0/8] diagnostics: libcpp: Overhaul locations for _Pragma tokens

2023-08-09 Thread Lewis Hyatt via Gcc-patches
On Mon, Jul 31, 2023 at 06:39:15PM -0400, Lewis Hyatt wrote:
> On Fri, Jul 28, 2023 at 6:58 PM David Malcolm wrote:
> >
> > On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > > Add a new linemap reason LC_GEN which enables encoding the location
> > > of data
> > > that was generated during compilation and does not appear in any
> > > source file.
> > > There could be many use cases, such as, for instance, referring to
> > > the content
> > > of builtin macros (not yet implemented, but an easy lift after this
> > > one.) The
> > > first intended application is to create a place to store the input to
> > > a
> > > _Pragma directive, so that proper locations can be assigned to those
> > > tokens. This will be done in a subsequent commit.
> > >
> > > The actual change needed to the line-maps API in libcpp is not too
> > > large and
> > > requires no space overhead in the line map data structures (on 64-bit
> > > systems
> > > that is; one newly added data member to class line_map_ordinary sits
> > > inside
> > > former padding bytes.) An LC_GEN map is just an ordinary map like any
> > > other,
> > > but the TO_FILE member that normally points to the file name points
> > > instead to
> > > the actual data.  This works automatically with PCH as well, for the
> > > same
> > > reason that the file name makes its way into a PCH.  In order to
> > > avoid
> > > confusion, the member has been renamed from TO_FILE to DATA, and
> > > associated
> > > accessors adjusted.
> > >
> > > Outside libcpp, there are many small changes but most of them are to
> > > selftests, which are necessarily more sensitive to implementation
> > > details. From the perspective of the user (the "user", here, being a
> > > frontend
> > > using line maps or else the diagnostics infrastructure), the chief
> > > visible
> > > change is that the function location_get_source_line() should be
> > > passed an
> > > expanded_location object instead of a separate filename and line
> > > number.  This
> > > is not a big change because in most cases, this information came
> > > anyway from a
> > > call to expand_location and the needed expanded_location object is
> > > readily
> > > available. The new overload of location_get_source_line() uses the
> > > extra
> > > information in the expanded_location object to obtain the data from
> > > the
> > > in-memory buffer when it originated from an LC_GEN map.
> > >
> > > Until the subsequent patch that starts using LC_GEN maps, none are
> > > yet
> > > generated within GCC, hence nothing is added to the testsuite here;
> > > but all
> > > relevant selftests have been extended to cover generated data maps in
> > > addition
> > > to normal files.
> >
> > [..snip...]
> >
> > Thanks for the updated patch.
> >
> > Reading this patch, it felt a bit unnatural to me to have an
> >   (expanded location, source line)
> > pair where the expanded location seems to be representing "which source
> > file or generated buffer", but the line/column info in that
> > expanded_location is to be ignored in favor of the 2nd source line.
> >
> > I think we're missing a class: something that identifies either a
> > specific source file, or a specific generated buffer.
> >
> > How about something like either:
> >
> > class source_id
> > {
> > public:
> >   source_id (const char *filename)
> >   : m_filename_or_buffer (filename),
> > m_len (0)
> >   {
> >   }
> >
> >   explicit source_id (const char *buffer, unsigned buffer_len)
> >   : m_filename_or_buffer (buffer),
> > m_len (buffer_len)
> >   {
> > linemap_assert (buffer_len > 0);
> >   }
> >
> > private:
> >   const char *m_filename_or_buffer;
> >   unsigned m_len;  // where 0 means "it's a filename"
> > };
> >
> > or:
> >
> > class source_id
> > {
> > public:
> >   source_id (const char *filename)
> >   : m_ptr (filename),
> > m_is_buffer (false)
> >   {
> >   }
> >
> >   explicit source_id (const linemap_ordinary *buffer_linemap)
> >   : m_ptr (buffer_linemap),
> > m_is_buffer (true)
> >   {
> >   }
> >
> > private:
> >   const void *m_ptr;
> >   b

Re: [PATCH] preprocessor: c++: Support `#pragma GCC target' macros [PR87299]

2023-08-09 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 1, 2023 at 11:01 AM Joseph Myers  wrote:
>
> On Mon, 31 Jul 2023, Lewis Hyatt via Gcc-patches wrote:
>
> > I added some additional testcases from the PR for x86. The other targets
> > that support `#pragma GCC target' (aarch64, arm, nios2, powerpc, s390)
> > already had tests verifying that the pragma sets macros as expected; here I
> > have added -save-temps to some of them, to test that it now works in
> > preprocess-only mode as well.
>
> It would seem better to have copies of the tests with and without
> -save-temps, to test in both modes, rather than changing what's tested by
> an existing test here.  Or a test variant that #includes the original test
> but uses different options, if the original test isn't doing anything that
> would fail to work with that approach.

Thank you, I will adjust this.

-Lewis


[PATCH] preprocessor: c++: Support `#pragma GCC target' macros [PR87299]

2023-07-31 Thread Lewis Hyatt via Gcc-patches
`#pragma GCC target' is not currently handled in preprocess-only mode (e.g.,
when running gcc -E or gcc -save-temps). As noted in the PR, this means that
if the target pragma defines any macros, those macros are not effective in
preprocess-only mode. Similarly, such macros are not effective when
compiling with C++ (even when compiling without -save-temps), because C++
does not process the pragma until after all tokens have been obtained from
libcpp, at which point it is too late for macro expansion to take place.

Since r13-1544 and r14-2893, there is a general mechanism to handle pragmas
under these conditions as well, so resolve the PR by using the new "early
pragma" support.
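To make the problem concrete, here is a minimal sketch, assuming an x86 target
on which the `avx2' target option controls the predefined macro __AVX2__. The
names avx2_macro_seen_in_region and avx2_status are hypothetical and exist only
for illustration. Which branch is selected depends on the compiler, its
version, and the mode (C vs. C++, full compilation vs. -E), which is exactly
the behavior the PR is about.

```cpp
#include <string>

// Guarded so the file still compiles on non-x86 targets and on compilers
// that do not implement `#pragma GCC target'.
#if defined(__GNUC__) && !defined(__clang__) \
    && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_HAVE_TARGET_PRAGMA 1
#pragma GCC push_options
#pragma GCC target("avx2")
#endif

// Records whether the preprocessor saw __AVX2__ at this point in the file.
// This only works if the pragma was handled early enough for macro
// expansion to observe its effect.
#ifdef __AVX2__
constexpr bool avx2_macro_seen_in_region = true;
#else
constexpr bool avx2_macro_seen_in_region = false;
#endif

#ifdef SKETCH_HAVE_TARGET_PRAGMA
#pragma GCC pop_options
#endif

// Hypothetical helper: a human-readable form of the constant above.
// Defined outside the pragma region so it is built with default options.
std::string avx2_status ()
{
  return avx2_macro_seen_in_region ? "defined" : "not defined";
}
```

Running the preprocessor alone (gcc -E) over such a file and looking at which
definition survives is one way to observe whether the pragma took effect in
preprocess-only mode.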

toplev.cc required some changes because the target-specific handlers for
`#pragma GCC target' may call target_reinit(), and toplev.cc was not expecting
that function to be called in preprocess-only mode.

I added some additional testcases from the PR for x86. The other targets
that support `#pragma GCC target' (aarch64, arm, nios2, powerpc, s390)
already had tests verifying that the pragma sets macros as expected; here I
have added -save-temps to some of them, to test that it now works in
preprocess-only mode as well.

gcc/c-family/ChangeLog:

PR preprocessor/87299
* c-pragma.cc (init_pragma): Register `#pragma GCC target' and
related pragmas in preprocess-only mode, and enable early handling.
(c_reset_target_pragmas): New function refactoring code from...
(handle_pragma_reset_options): ...here.
* c-pragma.h (c_reset_target_pragmas): Declare.

gcc/cp/ChangeLog:

PR preprocessor/87299
* parser.cc (cp_lexer_new_main): Call c_reset_target_pragmas ()
after preprocessing is complete, before starting compilation.

gcc/ChangeLog:

PR preprocessor/87299
* toplev.cc (no_backend): New static global.
(finalize): Remove argument no_backend, which is now a
static global.
(process_options): Likewise.
(do_compile): Likewise.
(target_reinit): Don't do anything in preprocess-only mode.
(toplev::main): Adapt to no_backend change.
(toplev::finalize): Likewise.

gcc/testsuite/ChangeLog:

PR preprocessor/87299
* c-c++-common/pragma-target-1.c: New test.
* c-c++-common/pragma-target-2.c: New test.
* g++.target/i386/pr87299-1.C: New test.
* g++.target/i386/pr87299-2.C: New test.
* gcc.target/i386/pr87299-1.c: New test.
* gcc.target/i386/pr87299-2.c: New test.
* gcc.target/s390/target-attribute/tattr-2.c: Add -save-temps to the
options, to test preprocess-only mode as well.
* gcc.target/aarch64/pragma_cpp_predefs_1.c: Likewise.
* gcc.target/arm/pragma_arch_attribute.c: Likewise.
* gcc.target/nios2/custom-fp-2.c: Likewise.
* gcc.target/powerpc/float128-3.c: Likewise.
---

Notes:
Hello-

This patch fixes the PR by enabling early pragma handling for `#pragma GCC
target' and related pragmas such as `#pragma GCC push_options'. I did not
need to touch any target-specific code, however I did need to make a change
to toplev.cc, affecting all targets, to make it safe to call target_reinit()
in preprocess-only mode. (Otherwise, it would be necessary to modify the
implementation of target pragmas in every target, to avoid this code path.)
That was the only complication I ran into.

Regarding testing, I did: (thanks to GCC compile farm for the non-x86
targets)

bootstrap + regtest all languages - x86_64-pc-linux-gnu
bootstrap + regtest c/c++ - powerpc64le-unknown-linux-gnu,
aarch64-unknown-linux-gnu

The following backends also implement this pragma so ought to be tested:
arm
nios2
s390

I am not able to test those directly. I did add coverage to their testsuites
(basically, adding -save-temps to an existing test causes it to exercise the
pragma in preprocess-only mode as well). Then I verified on x86_64, using
cross compilers, that the modified testcases fail before the patch and pass
afterwards. nios2 is an exception: it does not set any libcpp macros when
handling the pragma, so there is nothing to test, but I did verify that
processing the pragma in preprocess-only mode does not cause any problems.
The cross compilers tested targeted arm-unknown-linux-gnueabi,
nios2-unknown-linux, and s390-ibm-linux.

Please let me know if it looks OK? Thanks!

-Lewis

 gcc/c-family/c-pragma.cc  | 49 ---
 gcc/c-family/c-pragma.h   |  2 +-
 gcc/cp/parser.cc  |  6 +++
 gcc/testsuite/c-c++-common/pragma-target-1.c  | 19 +++
 gcc/testsuite/c-c++-common/pragma-target-2.c  | 27 ++
 gcc/testsuite/g++.target/i386/pr87299-1.C |  8 +++
 

Re: [PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-07-31 Thread Lewis Hyatt via Gcc-patches
On Fri, Jul 28, 2023 at 6:58 PM David Malcolm  wrote:
>
> On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > Add a new linemap reason LC_GEN which enables encoding the location of data
> > that was generated during compilation and does not appear in any source file.
> > There could be many use cases, such as, for instance, referring to the content
> > of builtin macros (not yet implemented, but an easy lift after this one.) The
> > first intended application is to create a place to store the input to a
> > _Pragma directive, so that proper locations can be assigned to those
> > tokens. This will be done in a subsequent commit.
> >
> > The actual change needed to the line-maps API in libcpp is not too large and
> > requires no space overhead in the line map data structures (on 64-bit systems
> > that is; one newly added data member to class line_map_ordinary sits inside
> > former padding bytes.) An LC_GEN map is just an ordinary map like any other,
> > but the TO_FILE member that normally points to the file name points instead to
> > the actual data.  This works automatically with PCH as well, for the same
> > reason that the file name makes its way into a PCH.  In order to avoid
> > confusion, the member has been renamed from TO_FILE to DATA, and associated
> > accessors adjusted.
> >
> > Outside libcpp, there are many small changes but most of them are to
> > selftests, which are necessarily more sensitive to implementation
> > details. From the perspective of the user (the "user", here, being a frontend
> > using line maps or else the diagnostics infrastructure), the chief visible
> > change is that the function location_get_source_line() should be passed an
> > expanded_location object instead of a separate filename and line number.  This
> > is not a big change because in most cases, this information came anyway from a
> > call to expand_location and the needed expanded_location object is readily
> > available. The new overload of location_get_source_line() uses the extra
> > information in the expanded_location object to obtain the data from the
> > in-memory buffer when it originated from an LC_GEN map.
> >
> > Until the subsequent patch that starts using LC_GEN maps, none are yet
> > generated within GCC, hence nothing is added to the testsuite here; but all
> > relevant selftests have been extended to cover generated data maps in addition
> > to normal files.
>
> [..snip...]
>
> Thanks for the updated patch.
>
> Reading this patch, it felt a bit unnatural to me to have an
>   (expanded location, source line)
> pair where the expanded location seems to be representing "which source
> file or generated buffer", but the line/column info in that
> expanded_location is to be ignored in favor of the 2nd source line.
>
> I think we're missing a class: something that identifies either a
> specific source file, or a specific generated buffer.
>
> How about something like either:
>
> class source_id
> {
> public:
>   source_id (const char *filename)
>   : m_filename_or_buffer (filename),
> m_len (0)
>   {
>   }
>
>   explicit source_id (const char *buffer, unsigned buffer_len)
>   : m_filename_or_buffer (buffer),
> m_len (buffer_len)
>   {
> linemap_assert (buffer_len > 0);
>   }
>
> private:
>   const char *m_filename_or_buffer;
>   unsigned m_len;  // where 0 means "it's a filename"
> };
>
> or:
>
> class source_id
> {
> public:
>   source_id (const char *filename)
>   : m_ptr (filename),
> m_is_buffer (false)
>   {
>   }
>
>   explicit source_id (const linemap_ordinary *buffer_linemap)
>   : m_ptr (buffer_linemap),
> m_is_buffer (true)
>   {
>   }
>
> private:
>   const void *m_ptr;
>   bool m_is_buffer;
> };
>
> and use one of these "source_id file" in place of "const char *file",
> rather than replacing such things with expanded_location?
>
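For readers following the suggestion, the first variant can be turned into a
small self-contained sketch. This is only an illustration under stated
assumptions: the accessor is_buffer () and the equality operator are
hypothetical additions that do not appear in the review, and plain assert ()
stands in for linemap_assert ().

```cpp
#include <cassert>
#include <cstring>

// Identifies either a source file (by name) or a generated in-memory
// buffer (by pointer and length), following the first variant above.
class source_id
{
public:
  // Identify a file by its name; the name must be non-null.
  source_id (const char *filename)
  : m_filename_or_buffer (filename), m_len (0)
  {
  }

  // Identify a generated buffer; a zero length is reserved for filenames.
  explicit source_id (const char *buffer, unsigned buffer_len)
  : m_filename_or_buffer (buffer), m_len (buffer_len)
  {
    assert (buffer_len > 0);  // stands in for linemap_assert
  }

  // Hypothetical accessor: true if this refers to a generated buffer.
  bool is_buffer () const { return m_len > 0; }

  // Hypothetical comparison: files compare by name, buffers by identity.
  bool operator== (const source_id &other) const
  {
    if (is_buffer () != other.is_buffer ())
      return false;
    if (is_buffer ())
      return (m_filename_or_buffer == other.m_filename_or_buffer
              && m_len == other.m_len);
    return std::strcmp (m_filename_or_buffer,
                        other.m_filename_or_buffer) == 0;
  }

private:
  const char *m_filename_or_buffer;
  unsigned m_len;  // where 0 means "it's a filename"
};
```

With something along these lines, a cache keyed on source_id could treat files
and generated buffers uniformly, which is the point of the suggestion; the
line and column would then travel separately from the identity of the source.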
> > diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
> > index e8d3dece770..4164fa0b1ba 100644
> > --- a/gcc/c-family/c-indentation.cc
> > +++ b/gcc/c-family/c-indentation.cc
> > @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
> >  unsigned int *first_nws,
> >  unsigned int tab_width)
> >  {
> > -  char_span line = location_get_source_

Re: [PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens

2023-07-29 Thread Lewis Hyatt via Gcc-patches
On Fri, Jul 28, 2023 at 6:22 PM David Malcolm  wrote:
>
> On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > This is an update to the v2 patch series last sent in January:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html
> >
> > While I did not receive any feedback on the v2 patches yet, they did
> > need some
> > rebasing on top of other recent commits to input.cc, so I thought it
> > would be
> > helpful to send them again now. The patches have not otherwise
> > changed from
> > v2, and the above-linked message explains how all the patches fit in
> > with the
> > original v1 series sent last November.
> >
> > Dave, I would appreciate it very much if you could please let me know
> > what you
> > think of this approach? I feel like the diagnostics we currently
> > output for _Pragmas are worth improving. As a reminder, say for this
> > example:
> >
> > =
> >  #define S "GCC diagnostic ignored \"oops"
> >  _Pragma(S)
> > =
> >
> > We currently output:
> >
> > =
> > file.cpp:2:24: warning: missing terminating " character
> > 2 | _Pragma(S)
> >   |^
> > =
> >
> > While after these patches, we would output:
> >
> > ==
> > :1:24: warning: missing terminating " character
> > 1 | GCC diagnostic ignored "oops
> >   |^
> > file.cpp:2:1: note: in <_Pragma directive>
> > 2 | _Pragma(S)
> >   | ^~~
> > ==
> >
> > Thanks!
>
> Hi Lewis; sorry for not responding to the v2 patches.
>
> I've started looking at the v3 patches in detail, but I have some high-
> level questions about memory usage:
>
> Am I right in thinking that the effect of this patch is that for every
> _Pragma in the source we will create a new line_map_ordinary, and a new
> buffer for the stringified content of that _Pragma, and that these
> allocations will persist for the rest of the compilation?  (plus a
> little extra allocation within the "location_t" space from 0 to
> 0x7fff).
>
> It sounds like this will probably be a rounding error that won't be
> noticeable in profiling, but did you attempt any such measurement of the
> memory usage before/after this patch on some real-world projects?
>
> Thanks
> Dave
>

Thanks for looking at the patches, I appreciate it whenever you have
time to get to them.

This is a fair point about the memory usage. Basically, it means that
each instance of a _Pragma has a memory footprint comparable to that of a
macro definition. (In addition to the overheads you mentioned, it also
creates a macro map to generate a virtual location for the tokens, so
that it's able to output the "in expansion of _Pragma" note. That part
can be disabled with -ftrack-macro-expansion=0 at least.)

I had the sense that _Pragma isn't used often enough for that to be a
problem, but agreed it is worth checking. (I really hope this memory
usage isn't an issue since there are also numerous PRs complaining
about 32-bit limitations in location tracking, that make it tempting
to explore 64-bit line maps or some other option someday too.)

I tried one thing now, wxWidgets uses a lot of diagnostic pragmas
wrapped up inside macros that use _Pragma. (See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578). The testsuite
contains a file allheaders.cpp which includes the whole library, so I
tried compiling this into a pch, which I believe measures the entire
memory footprint including the ordinary and macro line maps and the
_Pragma strings. The resulting PCH sizes were:

279000173 bytes before the changes
279491345 bytes after the changes

So about 0.18% bigger (491172 bytes). Happy to check other projects too; do
you have any standard go-tos? Maybe Firefox or something, I take it.
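For context, the idiom being measured looks roughly like the sketch below. The
macro names are hypothetical and are not taken from wxWidgets; the point is
only that every use of such a macro expands to one or more _Pragma operators,
each of which would get its own line map and string buffer under the approach
discussed here.

```cpp
// Hypothetical helper macros wrapping diagnostic pragmas in _Pragma, so
// that they can be emitted from inside other macros.
#define SKETCH_DO_PRAGMA(x) _Pragma (#x)
#define SKETCH_SUPPRESS_UNUSED_BEGIN \
  SKETCH_DO_PRAGMA (GCC diagnostic push) \
  SKETCH_DO_PRAGMA (GCC diagnostic ignored "-Wunused-variable")
#define SKETCH_SUPPRESS_END SKETCH_DO_PRAGMA (GCC diagnostic pop)

int answer ()
{
  SKETCH_SUPPRESS_UNUSED_BEGIN
  int unused_local = 0;  // would normally be reported by -Wunused-variable
  SKETCH_SUPPRESS_END
  return 42;
}
```

Each call site of the begin/end pair above expands to three _Pragma operators,
so a header-heavy library using this pattern is a reasonable stress test for
the extra allocations.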

I see your other response on patch #1, I am thinking about that and
will reply later. Thanks again!

-Lewis


Re: [PATCH v2] c-family: Implement pragma_lex () for preprocess-only mode

2023-07-28 Thread Lewis Hyatt via Gcc-patches
On Thu, Jul 27, 2023 at 06:18:33PM -0700, Jason Merrill wrote:
> On 7/27/23 18:59, Lewis Hyatt wrote:
> > In order to support processing #pragma in preprocess-only mode (-E or
> > -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from
> > libcpp. In full compilation modes, this is accomplished by calling
> > pragma_lex (), which is a symbol that must be exported by the frontend, and
> > which is currently implemented for C and C++. Neither of those frontends
> > initializes its parser machinery in preprocess-only mode, and consequently
> > pragma_lex () does not work in this case.
> > 
> > Address that by adding a new function c_init_preprocess () for the frontends
> > to implement, which arranges for pragma_lex () to work in preprocess-only
> > mode, and adjusting pragma_lex () accordingly.
> > 
> > In preprocess-only mode, the preprocessor is accustomed to controlling the
> > interaction with libcpp, and it only knows about tokens that it has called
> > into libcpp itself to obtain. Since it still needs to see the tokens
> > obtained by pragma_lex () so that they can be streamed to the output, also
> > adjust c_lex_with_flags () and related functions in c-family/c-lex.cc to
> > inform the preprocessor about any tokens it won't be aware of.
> > 
> > Currently, there is one place where we are already supporting #pragma in
> > preprocess-only mode, namely the handling of `#pragma GCC diagnostic'.  That
> > was done by directly interfacing with libcpp, rather than making use of
> > pragma_lex (). Now that pragma_lex () works, that code is no longer
> > necessary; remove it.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c-common.h (c_init_preprocess): Declare.
> > (c_lex_enable_token_streaming): Declare.
> > * c-opts.cc (c_common_init): Call c_init_preprocess ().
> > * c-lex.cc (stream_tokens_to_preprocessor): New static variable.
> > (c_lex_enable_token_streaming): New function.
> > (cb_def_pragma): Add a comment.
> > (get_token): New function wrapping cpp_get_token.
> > (c_lex_with_flags): Use the new wrapper function to support
> > obtaining tokens in preprocess_only mode.
> > (lex_string): Likewise.
> > * c-ppoutput.cc (preprocess_file): Call c_lex_enable_token_streaming
> > when needed.
> > * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to...
> > (pragma_diagnostic_lex): ...this.
> > (pragma_diagnostic_lex_pp): Remove.
> > (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in
> > all modes.
> > (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex ()
> > usage.
> > * c-pragma.h (pragma_lex_discard_to_eol): Declare.
> > 
> > gcc/c/ChangeLog:
> > 
> > * c-parser.cc (pragma_lex_discard_to_eol): New function.
> > (c_init_preprocess): New function.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * parser.cc (c_init_preprocess): New function.
> > (maybe_read_tokens_for_pragma_lex): New function.
> > (pragma_lex): Support preprocess-only mode.
> > (pragma_lex_discard_to_eol): New function.
> > ---
> > 
> > Notes:
> >  Hello-
> >  Here is version 2 of the patch, incorporating Jason's feedback from
> >  https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625591.html
> >  Thanks again, please let me know if it's OK? Bootstrap + regtest all
> >  languages on x86-64 Linux looks good.
> >  -Lewis
> > 
> >   gcc/c-family/c-common.h|  4 +++
> >   gcc/c-family/c-lex.cc  | 49 +
> >   gcc/c-family/c-opts.cc |  1 +
> >   gcc/c-family/c-ppoutput.cc | 17 +---
> >   gcc/c-family/c-pragma.cc   | 56 ++
> >   gcc/c-family/c-pragma.h|  2 ++
> >   gcc/c/c-parser.cc  | 21 ++
> >   gcc/cp/parser.cc   | 45 ++
> >   8 files changed, 138 insertions(+), 57 deletions(-)
> > 
> > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> > index b5ef5ff6b2c..2fe2f194660 100644
> > --- a/gcc/c-family/c-common.h
> > +++ b/gcc/c-family/c-common.h
> > @@ -990,6 +990,9 @@ extern void c_parse_file (void);
> >   extern void c_parse_final_cleanups (void);
> > +/* This initializes for preprocess-only mode.  */
> > +extern void c_init_preprocess (void);
> > +
> >   /* These macros provide convenient access to the various _STMT nodes.  */
> >   /* Nonzero if a given STATEMENT_LIST represents the outermost 

[PATCH v2] c-family: Implement pragma_lex () for preprocess-only mode

2023-07-27 Thread Lewis Hyatt via Gcc-patches
In order to support processing #pragma in preprocess-only mode (-E or
-save-temps for gcc/g++), we need a way to obtain the #pragma tokens from
libcpp. In full compilation modes, this is accomplished by calling
pragma_lex (), which is a symbol that must be exported by the frontend, and
which is currently implemented for C and C++. Neither of those frontends
initializes its parser machinery in preprocess-only mode, and consequently
pragma_lex () does not work in this case.

Address that by adding a new function c_init_preprocess () for the frontends
to implement, which arranges for pragma_lex () to work in preprocess-only
mode, and adjusting pragma_lex () accordingly.

In preprocess-only mode, the preprocessor is accustomed to controlling the
interaction with libcpp, and it only knows about tokens that it has called
into libcpp itself to obtain. Since it still needs to see the tokens
obtained by pragma_lex () so that they can be streamed to the output, also
adjust c_lex_with_flags () and related functions in c-family/c-lex.cc to
inform the preprocessor about any tokens it won't be aware of.
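The wrapper approach described above can be modeled in a few lines. This is a
toy sketch, not the code from the patch: toy_token, toy_reader and
streamed_output are hypothetical stand-ins for libcpp's token and reader types
and for the preprocessor's output stream. Only the names
stream_tokens_to_preprocessor, c_lex_enable_token_streaming and get_token are
taken from the ChangeLog below, and their bodies here are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a libcpp token (NOT the real cpp_token).
struct toy_token
{
  std::string spelling;
};

// Toy stand-in for the libcpp reader: hands out tokens one at a time.
class toy_reader
{
public:
  explicit toy_reader (std::vector<toy_token> tokens)
  : m_tokens (std::move (tokens)), m_pos (0)
  {
  }

  // Returns the next token, or nullptr at end of input.
  const toy_token *next ()
  {
    return m_pos < m_tokens.size () ? &m_tokens[m_pos++] : nullptr;
  }

private:
  std::vector<toy_token> m_tokens;
  std::size_t m_pos;
};

// Set when tokens lexed here must also be shown to the preprocessor.
static bool stream_tokens_to_preprocessor = false;

// Toy stand-in for the preprocessed output.
static std::vector<std::string> streamed_output;

void c_lex_enable_token_streaming (bool enabled)
{
  stream_tokens_to_preprocessor = enabled;
}

// Single choke point for obtaining tokens: every caller goes through
// here, so nothing is lexed without the preprocessor hearing about it.
const toy_token *get_token (toy_reader &reader)
{
  const toy_token *tok = reader.next ();
  if (tok && stream_tokens_to_preprocessor)
    streamed_output.push_back (tok->spelling);
  return tok;
}
```

The design choice is that streaming is a property of the call path in the
frontend rather than of libcpp itself, so no new libcpp callback is needed.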

Currently, there is one place where we are already supporting #pragma in
preprocess-only mode, namely the handling of `#pragma GCC diagnostic'.  That
was done by directly interfacing with libcpp, rather than making use of
pragma_lex (). Now that pragma_lex () works, that code is no longer
necessary; remove it.

gcc/c-family/ChangeLog:

* c-common.h (c_init_preprocess): Declare.
(c_lex_enable_token_streaming): Declare.
* c-opts.cc (c_common_init): Call c_init_preprocess ().
* c-lex.cc (stream_tokens_to_preprocessor): New static variable.
(c_lex_enable_token_streaming): New function.
(cb_def_pragma): Add a comment.
(get_token): New function wrapping cpp_get_token.
(c_lex_with_flags): Use the new wrapper function to support
obtaining tokens in preprocess_only mode.
(lex_string): Likewise.
* c-ppoutput.cc (preprocess_file): Call c_lex_enable_token_streaming
when needed.
* c-pragma.cc (pragma_diagnostic_lex_normal): Rename to...
(pragma_diagnostic_lex): ...this.
(pragma_diagnostic_lex_pp): Remove.
(handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in
all modes.
(c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex ()
usage.
* c-pragma.h (pragma_lex_discard_to_eol): Declare.

gcc/c/ChangeLog:

* c-parser.cc (pragma_lex_discard_to_eol): New function.
(c_init_preprocess): New function.

gcc/cp/ChangeLog:

* parser.cc (c_init_preprocess): New function.
(maybe_read_tokens_for_pragma_lex): New function.
(pragma_lex): Support preprocess-only mode.
(pragma_lex_discard_to_eol): New function.
---

Notes:
Hello-

Here is version 2 of the patch, incorporating Jason's feedback from
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625591.html

Thanks again, please let me know if it's OK? Bootstrap + regtest all
languages on x86-64 Linux looks good.

-Lewis

 gcc/c-family/c-common.h|  4 +++
 gcc/c-family/c-lex.cc  | 49 +
 gcc/c-family/c-opts.cc |  1 +
 gcc/c-family/c-ppoutput.cc | 17 +---
 gcc/c-family/c-pragma.cc   | 56 ++
 gcc/c-family/c-pragma.h|  2 ++
 gcc/c/c-parser.cc  | 21 ++
 gcc/cp/parser.cc   | 45 ++
 8 files changed, 138 insertions(+), 57 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b5ef5ff6b2c..2fe2f194660 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -990,6 +990,9 @@ extern void c_parse_file (void);
 
 extern void c_parse_final_cleanups (void);
 
+/* This initializes for preprocess-only mode.  */
+extern void c_init_preprocess (void);
+
 /* These macros provide convenient access to the various _STMT nodes.  */
 
 /* Nonzero if a given STATEMENT_LIST represents the outermost binding
@@ -1214,6 +1217,7 @@ extern tree c_build_bind_expr (location_t, tree, tree);
 /* In c-lex.cc.  */
 extern enum cpp_ttype
 conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind);
+extern void c_lex_enable_token_streaming (bool enabled);
 
 /* In c-pch.cc  */
 extern void pch_init (void);
diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index dcd061c7cb1..ac4c018d863 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -57,6 +57,17 @@ static void cb_ident (cpp_reader *, unsigned int, const cpp_string *);
 static void cb_def_pragma (cpp_reader *, unsigned int);
 static void cb_define (cpp_reader *, unsigned int, cpp_hashnode *);
 static void cb_undef (cpp_reader *, unsigned int, cpp_hashnode *);
+
+/* Flag to remember if we are in a mode (such as flag_preprocess_only) in which
+   tokens obtained here need to be streamed to the preprocessor.  */

Re: [PATCH] c-family: Implement pragma_lex () for preprocess-only mode

2023-07-26 Thread Lewis Hyatt via Gcc-patches
On Wed, Jul 26, 2023 at 5:36 PM Jason Merrill  wrote:
>
> On 6/30/23 18:59, Lewis Hyatt wrote:
> > In order to support processing #pragma in preprocess-only mode (-E or
> > -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from
> > libcpp. In full compilation modes, this is accomplished by calling
> > pragma_lex (), which is a symbol that must be exported by the frontend, and
> > which is currently implemented for C and C++. Neither of those frontends
> > initializes its parser machinery in preprocess-only mode, and consequently
> > pragma_lex () does not work in this case.
> >
> > Address that by adding a new function c_init_preprocess () for the frontends
> > to implement, which arranges for pragma_lex () to work in preprocess-only
> > mode, and adjusting pragma_lex () accordingly.
> >
> > In preprocess-only mode, the preprocessor is accustomed to controlling the
> > interaction with libcpp, and it only knows about tokens that it has called
> > into libcpp itself to obtain. Since it still needs to see the tokens
> > obtained by pragma_lex () so that they can be streamed to the output, also
> > add a new libcpp callback, on_token_lex (), that ensures the preprocessor
> > sees these tokens too.
> >
> > Currently, there is one place where we are already supporting #pragma in
> > preprocess-only mode, namely the handling of `#pragma GCC diagnostic'.  That
> > was done by directly interfacing with libcpp, rather than making use of
> > pragma_lex (). Now that pragma_lex () works, that code is no longer
> > necessary; remove it.
> >
> > gcc/c-family/ChangeLog:
> >
> >   * c-common.h (c_init_preprocess): Declare new function.
> >   * c-opts.cc (c_common_init): Call it.
> >   * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to...
> >   (pragma_diagnostic_lex): ...this.
> >   (pragma_diagnostic_lex_pp): Remove.
> >   (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in
> >   all modes.
> >   (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex ()
> >   usage.
> >   * c-pragma.h (pragma_lex_discard_to_eol): Declare new function.
> >
> > gcc/c/ChangeLog:
> >
> >   * c-parser.cc (pragma_lex): Support preprocess-only mode.
> >   (pragma_lex_discard_to_eol): New function.
> >   (c_init_preprocess): New function.
> >
> > gcc/cp/ChangeLog:
> >
> >   * parser.cc (c_init_preprocess): New function.
> >   (maybe_read_tokens_for_pragma_lex): New function.
> >   (pragma_lex): Support preprocess-only mode.
> >   (pragma_lex_discard_to_eol): New function.
> >
> > libcpp/ChangeLog:
> >
> >   * include/cpplib.h (struct cpp_callbacks): Add new callback
> >   on_token_lex.
> >   * macro.cc (cpp_get_token_1): Support new callback.
> > ---
> >
> > Notes:
> >  Hello-
> >
> >  In r13-1544, I added support for processing `#pragma GCC diagnostic' in
> >  preprocess-only mode. Because pragma_lex () doesn't work in that mode, in
> >  that patch I called into libcpp directly to obtain the tokens needed to
> >  process the pragma. As part of the review, Jason noted that it would
> >  probably be better to make pragma_lex () usable in preprocess-only mode, and
> >  we decided just to add a comment about that for the time being, and to go
> >  ahead and implement that in the future, if it became necessary to support
> >  other pragmas during preprocessing.
> >
> >  I think now is a good time to proceed with that plan, because I would like
> >  to fix PR87299, which is about another pragma (#pragma GCC target) not
> >  working in preprocess-only mode. This patch makes the necessary changes for
> >  pragma_lex () to work in preprocess-only mode.
> >
> >  I have also added a new callback, on_token_lex (), to libcpp. This is so the
> >  preprocessor can see and stream out all the tokens that pragma_lex () gets
> >  from libcpp, since it won't otherwise see them.  This seemed the simplest
> >  approach to me. Another possibility would be to add a wrapper function in
> >  c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then
> >  also stream the token in preprocess-only mode, and then change all calls
> >  into libcpp in that file to use the wrapper function.  The libcpp callback
> >  seemed cleaner to me FWIW.
>
> I think the other way sounds better to me; there are only three calls to
> cpp_get_... in c_lex_with_flags.
>
> The rest of the patch looks good.

Thank you very much for the feedback. I will test it this way and send
the updated version.

-Lewis


Re: [PATCH] c-family: Implement pragma_lex () for preprocess-only mode

2023-07-26 Thread Lewis Hyatt via Gcc-patches
May I please ping this?
I am just about ready with the followup patch that fixes PR87299, but
it depends on this one. Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623364.html

-Lewis

On Fri, Jun 30, 2023 at 6:59 PM Lewis Hyatt  wrote:
>
> In order to support processing #pragma in preprocess-only mode (-E or
> -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from
> libcpp. In full compilation modes, this is accomplished by calling
> pragma_lex (), which is a symbol that must be exported by the frontend, and
> which is currently implemented for C and C++. Neither of those frontends
> initializes its parser machinery in preprocess-only mode, and consequently
> pragma_lex () does not work in this case.
>
> Address that by adding a new function c_init_preprocess () for the frontends
> to implement, which arranges for pragma_lex () to work in preprocess-only
> mode, and adjusting pragma_lex () accordingly.
>
> In preprocess-only mode, the preprocessor is accustomed to controlling the
> interaction with libcpp, and it only knows about tokens that it has called
> into libcpp itself to obtain. Since it still needs to see the tokens
> obtained by pragma_lex () so that they can be streamed to the output, also
> add a new libcpp callback, on_token_lex (), that ensures the preprocessor
> sees these tokens too.
>
> Currently, there is one place where we are already supporting #pragma in
> preprocess-only mode, namely the handling of `#pragma GCC diagnostic'.  That
> was done by directly interfacing with libcpp, rather than making use of
> pragma_lex (). Now that pragma_lex () works, that code is no longer
> necessary; remove it.
>
> gcc/c-family/ChangeLog:
>
> * c-common.h (c_init_preprocess): Declare new function.
> * c-opts.cc (c_common_init): Call it.
> * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to...
> (pragma_diagnostic_lex): ...this.
> (pragma_diagnostic_lex_pp): Remove.
> (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in
> all modes.
> (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex ()
> usage.
> * c-pragma.h (pragma_lex_discard_to_eol): Declare new function.
>
> gcc/c/ChangeLog:
>
> * c-parser.cc (pragma_lex): Support preprocess-only mode.
> (pragma_lex_discard_to_eol): New function.
> (c_init_preprocess): New function.
>
> gcc/cp/ChangeLog:
>
> * parser.cc (c_init_preprocess): New function.
> (maybe_read_tokens_for_pragma_lex): New function.
> (pragma_lex): Support preprocess-only mode.
> (pragma_lex_discard_to_eol): New function.
>
> libcpp/ChangeLog:
>
> * include/cpplib.h (struct cpp_callbacks): Add new callback
> on_token_lex.
> * macro.cc (cpp_get_token_1): Support new callback.
> ---
>
> Notes:
> Hello-
>
> In r13-1544, I added support for processing `#pragma GCC diagnostic' in
> preprocess-only mode. Because pragma_lex () doesn't work in that mode, in
> that patch I called into libcpp directly to obtain the tokens needed to
> process the pragma. As part of the review, Jason noted that it would
> probably be better to make pragma_lex () usable in preprocess-only mode, 
> and
> we decided just to add a comment about that for the time being, and to go
> ahead and implement that in the future, if it became necessary to support
> other pragmas during preprocessing.
>
> I think now is a good time to proceed with that plan, because I would like
> to fix PR87299, which is about another pragma (#pragma GCC target) not
> working in preprocess-only mode. This patch makes the necessary changes for
> pragma_lex () to work in preprocess-only mode.
>
> I have also added a new callback, on_token_lex (), to libcpp. This is so the
> preprocessor can see and stream out all the tokens that pragma_lex () gets
> from libcpp, since it won't otherwise see them.  This seemed the simplest
> approach to me. Another possibility would be to add a wrapper function in
> c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then
> also stream the token in preprocess-only mode, and then change all calls
> into libcpp in that file to use the wrapper function.  The libcpp callback
> seemed cleaner to me FWIW.
>
> There are no new tests added here, since it's just a change of
> implementation covered by existing tests. Bootstrap + regtest all languages
> looks good on x86-64 Linux.
>
> Please let me know what you think? Thanks!
>
> -Lewis
>
>  gcc/c-family/c-common.h  |  3 +++
>  g

[PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-07-21 Thread Lewis Hyatt via Gcc-patches
Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, such as, for instance, referring to the content
of builtin macros (not yet implemented, but an easy lift after this one.) The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The actual change needed to the line-maps API in libcpp is not too large and
requires no space overhead in the line map data structures (on 64-bit systems
that is; one newly added data member to class line_map_ordinary sits inside
former padding bytes.) An LC_GEN map is just an ordinary map like any other,
but the TO_FILE member that normally points to the file name points instead to
the actual data.  This works automatically with PCH as well, for the same
reason that the file name makes its way into a PCH.  In order to avoid
confusion, the member has been renamed from TO_FILE to DATA, and associated
accessors adjusted.

Outside libcpp, there are many small changes but most of them are to
selftests, which are necessarily more sensitive to implementation
details. From the perspective of the user (the "user", here, being a frontend
using line maps or else the diagnostics infrastructure), the chief visible
change is that the function location_get_source_line() should be passed an
expanded_location object instead of a separate filename and line number.  This
is not a big change because in most cases, this information came anyway from a
call to expand_location and the needed expanded_location object is readily
available. The new overload of location_get_source_line() uses the extra
information in the expanded_location object to obtain the data from the
in-memory buffer when it originated from an LC_GEN map.
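
To illustrate the idea in the paragraph above, here is a minimal sketch of a
lookup that reads a source line either from a file or from an in-memory
buffer, depending on what the location carries. The type sketch_location and
the helpers nth_line () and get_source_line () are hypothetical names invented
for this example; they only imitate the shape of the change and are not the
GCC interfaces.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for an expanded location.  Field names are
// illustrative only and are not copied from GCC's headers.
struct sketch_location
{
  const char *file = nullptr;            // file name, when the text is on disk
  const char *generated_data = nullptr;  // in-memory text, when generated
  std::size_t generated_data_len = 0;    // nonzero marks generated data
  int line = 0;                          // 1-based line number
};

// Store line LINE (1-based) of TEXT, without its newline, in *OUT.
// Return false if TEXT has no such line.
static bool
nth_line (const std::string &text, int line, std::string *out)
{
  if (line < 1)
    return false;
  std::size_t start = 0;
  for (int i = 1; i < line; ++i)
    {
      std::size_t nl = text.find ('\n', start);
      if (nl == std::string::npos)
        return false;
      start = nl + 1;
    }
  if (start >= text.size ())
    return false;
  std::size_t end = text.find ('\n', start);
  *out = text.substr (start,
                      end == std::string::npos ? end : end - start);
  return true;
}

// Fetch the line from the buffer when the location refers to generated
// data, and from the named file otherwise.
bool
get_source_line (const sketch_location &loc, std::string *out)
{
  if (loc.generated_data_len > 0)
    return nth_line (std::string (loc.generated_data,
                                  loc.generated_data_len),
                     loc.line, out);
  if (!loc.file)
    return false;
  std::ifstream in (loc.file, std::ios::binary);
  if (!in)
    return false;
  std::string text ((std::istreambuf_iterator<char> (in)),
                    std::istreambuf_iterator<char> ());
  return nth_line (text, loc.line, out);
}
```

The point of passing one location object rather than a separate file name and
line number is that the caller does not need to know which kind of source it
is dealing with.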

Until the subsequent patch that starts using LC_GEN maps, none are yet
generated within GCC, hence nothing is added to the testsuite here; but all
relevant selftests have been extended to cover generated data maps in addition
to normal files.

libcpp/ChangeLog:

* include/line-map.h (enum lc_reason): Add LC_GEN.
(struct line_map_ordinary): Add new members to support LC_GEN concept.
(ORDINARY_MAP_FILE_NAME): Assert that map really does encode a file
and not generated data.
(ORDINARY_MAP_GENERATED_DATA_P): New function.
(ORDINARY_MAP_GENERATED_DATA): New function.
(ORDINARY_MAP_GENERATED_DATA_LEN): New function.
(ORDINARY_MAP_FILE_NAME_OR_DATA): New function.
(ORDINARY_MAPS_SAME_FILE_P): Declare new function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): Declare new function.
(LINEMAP_FILE): This was always a synonym for ORDINARY_MAP_FILE_NAME;
make this explicit.
(linemap_get_file_highest_location): Adjust prototype.
(linemap_add): Adjust prototype.
(class expanded_location): Add new members to store generated content.
* line-map.cc (ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
(ORDINARY_MAPS_SAME_FILE_P): New function.
(linemap_add): Add new argument DATA_LEN. Support generated data in
LC_GEN maps.
(linemap_check_files_exited): Adapt to API changes supporting LC_GEN.
(linemap_line_start): Likewise.
(linemap_position_for_loc_and_offset): Likewise.
(linemap_get_expansion_filename): Likewise.
(linemap_expand_location): Likewise.
(linemap_dump): Likewise.
(linemap_dump_location): Likewise.
(linemap_get_file_highest_location): Likewise.
* directives.cc (_cpp_do_file_change): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (make_range): Initialize new fields in
expanded_location.
(compatible_locations_p): Use new ORDINARY_MAPS_SAME_FILE_P ()
function.
(layout::calculate_x_offset_display): Use the new expanded_location
overload of location_get_source_line(), so as to support LC_GEN maps.
(layout::print_line): Likewise.
(source_line::source_line): Likewise.
(line_corrections::add_hint): Likewise.
(class line_corrections): Store the location as an exploc rather than
individual filename, so as to support LC_GEN maps.
(layout::print_trailing_fixits): Use the new exploc constructor for
class line_corrections.
(test_layout_x_offset_display_utf8): Test LC_GEN maps as well as normal.
(test_layout_x_offset_display_tab): Likewise.
(test_diagnostic_show_locus_one_liner): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Likewise.
(test_add_location_if_nearby): Likewise.
(test_diagnostic_show_locus_fixit_lines): Likewise.
(test_fixit_consolidation): Likewise.
(test_overlapped_fixit_printing): Likewise.

[PATCH v3 3/4] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings

2023-07-21 Thread Lewis Hyatt via Gcc-patches
Currently, the tokens obtained from a destringified _Pragma string do not get
assigned proper locations while they are being lexed.  After the tokens have
been obtained, they are reassigned the same location as the _Pragma token,
which is sufficient to make things like _Pragma("GCC diagnostic ignored...")
operate correctly, but this still results in inferior diagnostics, since the
diagnostics do not point to the problematic tokens.  Further, if a diagnostic
is issued by libcpp during the lexing of the tokens, as opposed to being
issued by the frontend during the processing of the pragma, then the
patched-up location is not yet in place, and the user rather sees an invalid
location that is near to the location of the _Pragma string in some cases, or
potentially very far away, depending on the macro expansion history.  For
example:

=
_Pragma("GCC diagnostic ignored \"oops")
=

produces the diagnostic:

file.cpp:1:24: warning: missing terminating " character
1 | _Pragma("GCC diagnostic ignored \"oops")
  |^

with the caret in a nonsensical location, while this one:

=
 #define S "GCC diagnostic ignored \"oops"
_Pragma(S)
=

produces:

file.cpp:2:24: warning: missing terminating " character
2 | _Pragma(S)
  |^

with both the caret in a nonsensical location, and the actual relevant context
completely absent.

Fix this by assigning proper locations using the new LC_GEN type of linemap.
Now the tokens are given locations inside a generated content buffer, and the
macro expansion stack is modified to be aware that these tokens logically
belong to the "expansion" of the _Pragma directive. For the above examples we
now output:

==
In buffer generated from file.cpp:1:
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:1:1: note: in <_Pragma directive>
1 | _Pragma("GCC diagnostic ignored \"oops")
  | ^~~
==

and

==
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:2:1: note: in <_Pragma directive>
2 | _Pragma(S)
  | ^~~
==

Now the carets point to something meaningful and all relevant context
appears in the diagnostic.  For the second example, it would be nice if the
macro expansion also output "in expansion of macro S", however doing that for
a general case of macro expansions makes the logic very complicated, since it
has to be done after the fact when the macro maps have already been
constructed.  It doesn't seem worth it for this case, given that the _Pragma
string has already been output once on the first line.
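
As background for why the new diagnostics show the text
GCC diagnostic ignored "oops, here is a minimal sketch of the destringizing
step that the C standard specifies for the _Pragma operand: drop any encoding
prefix and the outer quotes, then replace \" with " and \\ with \. The
function name destringize () is chosen for this example only. This is not the
libcpp code, and it does not handle raw strings.

```cpp
#include <cstddef>
#include <string>

// Destringize a string literal as described for _Pragma in the C standard.
// Returns an empty string if LITERAL does not contain two double quotes.
std::string
destringize (const std::string &literal)
{
  std::size_t first = literal.find ('"');
  std::size_t last = literal.rfind ('"');
  if (first == std::string::npos || last == first)
    return std::string ();

  std::string out;
  for (std::size_t i = first + 1; i < last; ++i)
    {
      char c = literal[i];
      // Only \" and \\ are rewritten; other escapes pass through as-is.
      if (c == '\\' && i + 1 < last
          && (literal[i + 1] == '"' || literal[i + 1] == '\\'))
        c = literal[++i];
      out += c;
    }
  return out;
}
```

Applied to the operand in the first example, this yields a buffer whose
unterminated quote is the 24th character, which is consistent with the column
reported in the new output.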

gcc/ChangeLog:

* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Add awareness
of _Pragma directive to the macro expansion trace.

libcpp/ChangeLog:

* directives.cc (get_token_no_padding): Add argument to receive the
virtual location of the token.
(get__Pragma_string): Likewise.
(do_pragma): Set pfile->directive_result->src_loc properly; it should
not be a virtual location.
(destringize_and_run): Update to provide proper locations for the
_Pragma string tokens.  Support raw strings.
(_cpp_do__Pragma): Adapt to changes to the helper functions.
* errors.cc (cpp_diagnostic_at): Support
cpp_reader::diagnostic_rebase_loc.
(cpp_diagnostic_with_line): Likewise.
* include/line-map.h (class rich_location): Add new member
forget_cached_expanded_locations().
* internal.h (struct _cpp__Pragma_state): Define new struct.
(_cpp_rebase_diagnostic_location): Declare new function.
(struct cpp_reader): Add diagnostic_rebase_loc member.
(_cpp_push__Pragma_token_context): Declare new function.
(_cpp_do__Pragma): Adjust prototype.
* macro.cc (pragma_str): New static var.
(builtin_macro): Adapt to new implementation of _Pragma processing.
(_cpp_pop_context): Fix the logic for resetting
pfile->top_most_macro_node, which previously was never triggered,
although the error seems to have been harmless.
(_cpp_push__Pragma_token_context): New function.
(_cpp_rebase_diagnostic_location): New function.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (token_streamer::stream): Pass the virtual location of
the _Pragma token to maybe_print_line(), not the spelling location.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adjust for new
macro tracking output for _Pragma directives.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/diagnostic-pragma-1.c: Adjust for new macro
tracking output for _Pragma directives.
* c-c++-common/cpp/pr57580.c: Likewise.
* c-c++-common/gomp/pragma-3.c: Likewise.

[PATCH v3 4/4] diagnostics: Support generated data locations in SARIF output

2023-07-21 Thread Lewis Hyatt via Gcc-patches
The diagnostics routines for SARIF output need to read the source code back
in, so that they can generate "snippet" and "content" records, so they need to
be able to cope with generated data locations.  Add support for that in
diagnostic-format-sarif.cc.
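
A rough sketch of the approach, under the assumption that an artifact is
identified by a (pointer, length) pair in which a zero length means "this is
a file name" and a nonzero length means "this is a buffer of generated
content". The names name_or_buffer, sketch_xloc, to_key () and describe () are
made up for illustration, and the element types of the pair are a guess
rather than a copy of the patch.

```cpp
#include <string>
#include <utility>

// Hypothetical key type: (pointer, length).  Length 0 means the pointer
// is a file name; a nonzero length means it points at generated data.
using name_or_buffer = std::pair<const char *, unsigned>;

// Hypothetical stand-in for the relevant parts of an expanded location.
struct sketch_xloc
{
  const char *file;
  const char *generated_data;
  unsigned generated_data_len;
};

// Build the key from a location, preferring generated data when present.
name_or_buffer
to_key (const sketch_xloc &x)
{
  if (x.generated_data_len)
    return name_or_buffer (x.generated_data, x.generated_data_len);
  return name_or_buffer (x.file, 0u);
}

// Produce a short human-readable description of what the key refers to.
std::string
describe (const name_or_buffer &key)
{
  if (key.second)
    return "buffer of " + std::to_string (key.second) + " bytes";
  return std::string ("file ") + key.first;
}
```

Using one key type for both cases lets the code that collects artifacts stay
the same whether the text came from disk or from memory.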

gcc/ChangeLog:

* diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New function.
(sarif_builder::maybe_make_physical_location_object): Support
generated data locations.
(sarif_builder::make_artifact_location_object): Likewise.
(sarif_builder::maybe_make_region_object_for_context): Likewise.
(sarif_builder::make_artifact_object): Likewise.
(sarif_builder::maybe_make_artifact_content_object): Likewise.
(get_source_lines): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/diagnostic-format-sarif-file-5.c: New test.
---
 gcc/diagnostic-format-sarif.cc| 115 +++---
 .../diagnostic-format-sarif-file-5.c  |  31 +
 2 files changed, 99 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 5e483988027..29f614124b2 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -173,7 +173,10 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+
+  typedef std::pair filename_or_buffer;
+  json::object *make_artifact_location_object (filename_or_buffer fb);
+
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -196,16 +199,17 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
-   int start_line,
+  json::object *make_artifact_object (filename_or_buffer fb);
+  json::object *
+  maybe_make_artifact_content_object (filename_or_buffer fb) const;
+  json::object *maybe_make_artifact_content_object (expanded_location xloc,
int end_line) const;
   json::object *make_fix_object (const rich_location _loc);
   json::object *make_artifact_change_object (const rich_location );
   json::object *make_replacement_object (const fixit_hint ) const;
   json::object *make_artifact_content_object (const char *text) const;
   int get_sarif_column (expanded_location exploc) const;
+  static filename_or_buffer xloc_to_fb (expanded_location xloc);
 
   diagnostic_context *m_context;
 
@@ -219,7 +223,11 @@ private:
  diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set  m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+ with that length, not a filename.  */
+  hash_set ,
+  int_hash  >
+   > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set  m_rule_id_set;
   json::array *m_rules_arr;
@@ -749,6 +757,15 @@ sarif_builder::make_location_object (const diagnostic_event )
   return location_obj;
 }
 
+/* Populate a filename_or_buffer pair from an expanded location.  */
+sarif_builder::filename_or_buffer
+sarif_builder::xloc_to_fb (expanded_location xloc)
+{
+  if (xloc.generated_data_len)
+return filename_or_buffer (xloc.generated_data, xloc.generated_data_len);
+  return filename_or_buffer (xloc.file, 0);
+}
+
 /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC,
or return NULL;
Add any filename to the m_artifacts.  */
@@ -764,7 +781,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  m_filenames.add (xloc_to_fb (expand_location (loc)));
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -788,7 +805,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (xloc_to_fb (expand_location (loc)));
 }
 
 /* The 

[PATCH v3 2/4] diagnostics: Handle generated data locations in edit_context

2023-07-21 Thread Lewis Hyatt via Gcc-patches
Class edit_context handles outputting fixit hints in diff form that could be
manually or automatically applied by the user. This will not make sense for
generated data locations, such as the contents of a _Pragma string, because
the text to be modified does not appear in the user's input files. We do not
currently ever generate fixit hints in such a context, but for future-proofing
purposes, ignore such locations in edit_context now.

gcc/ChangeLog:

* edit-context.cc (edit_context::apply_fixit): Ignore locations in
generated data.
---
 gcc/edit-context.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
index 6f5bc6b9d8f..ae11b6f2e00 100644
--- a/gcc/edit-context.cc
+++ b/gcc/edit-context.cc
@@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint *hint)
 return false;
   if (start.column == 0)
 return false;
+  if (start.generated_data)
+return false;
   if (next_loc.column == 0)
 return false;
+  if (next_loc.generated_data)
+return false;
 
   edited_file  = get_or_insert_file (start.file);
   if (!m_valid)


[PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens

2023-07-21 Thread Lewis Hyatt via Gcc-patches
Hello-

This is an update to the v2 patch series last sent in January:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html

While I have not received any feedback on the v2 patches yet, they did need some
rebasing on top of other recent commits to input.cc, so I thought it would be
helpful to send them again now. The patches have not otherwise changed from
v2, and the above-linked message explains how all the patches fit in with the
original v1 series sent last November.

Dave, I would appreciate it very much if you could please let me know what you
think of this approach? I feel like the diagnostics we currently
output for _Pragmas are worth improving. As a reminder, say for this example:

=
 #define S "GCC diagnostic ignored \"oops"
 _Pragma(S)
=

We currently output:

=
file.cpp:2:24: warning: missing terminating " character
2 | _Pragma(S)
  |^
=

While after these patches, we would output:

==
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:2:1: note: in <_Pragma directive>
2 | _Pragma(S)
  | ^~~
==

Thanks!

-Lewis


[committed] testsuite: Fix C++ UDL tests failing on 32-bit arch [PR103902]

2023-07-19 Thread Lewis Hyatt via Gcc-patches
These tests need to use "size_t" rather than "unsigned long"
for the user-defined literal function arguments.
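
For context, a small self-contained illustration of the rule involved: the
length parameter of a string literal operator has type std::size_t, which
names the same type as "unsigned long" on many 64-bit targets but is
typically "unsigned int" on 32-bit ones. The suffix _len below is invented
for this example and does not appear in the tests.

```cpp
#include <cstddef>

// A string literal operator takes (const char *, std::size_t).  Spelling
// the second parameter as "unsigned long" only works where that happens
// to be the same type as std::size_t.
constexpr std::size_t
operator ""_len (const char *, std::size_t n)
{
  return n;
}

// The length passed in does not count the terminating null character.
static_assert ("hbar"_len == 4, "length excludes the terminator");
```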

gcc/testsuite/ChangeLog:

PR preprocessor/103902
* g++.dg/cpp0x/udlit-extended-id-1.C: Change "unsigned long" to
"size_t" throughout.
* g++.dg/cpp0x/udlit-extended-id-3.C: Likewise.
---

Notes:
Hello-

As noted on the PR, these newly added tests fail on 32-bit architectures
because they said "unsigned long" where they are supposed to say "size_t".
Committed this fix as obvious.

-Lewis

 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C | 9 +
 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C | 6 --
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
index 5ea5ef09db6..c7091e9e8a2 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
@@ -1,6 +1,7 @@
 // { dg-do run { target c++11 } }
 // { dg-additional-options "-Wno-error=normalized" }
 #include 
+#include 
 using namespace std;
 
 constexpr unsigned long long operator "" _π (unsigned long long x)
@@ -21,22 +22,22 @@ char x2[2_Π2];
 static_assert (sizeof x1 == 3, "test1");
 static_assert (sizeof x2 == 8, "test2");
 
-const char * operator "" _1σ (const char *s, unsigned long)
+const char * operator "" _1σ (const char *s, size_t)
 {
   return s + 1;
 }
 
-const char * operator ""_Σ2 (const char *s, unsigned long)
+const char * operator ""_Σ2 (const char *s, size_t)
 {
   return s + 2;
 }
 
-const char * operator "" _\U00e61 (const char *s, unsigned long)
+const char * operator "" _\U00e61 (const char *s, size_t)
 {
   return "ae";
 }
 
-const char* operator ""_\u01532 (const char *s, unsigned long)
+const char* operator ""_\u01532 (const char *s, size_t)
 {
   return "oe";
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
index 11292e476e3..cb8a957947a 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
@@ -1,9 +1,11 @@
 // { dg-do compile { target c++11 } }
+#include 
+using namespace std;
 
 // Check that we do not look for poisoned identifier when it is a suffix.
 int _ħ;
 #pragma GCC poison _ħ
-const char * operator ""_ħ (const char *, unsigned long); // { dg-bogus "poisoned" }
+const char * operator ""_ħ (const char *, size_t); // { dg-bogus "poisoned" }
 bool operator ""_ħ (unsigned long long x); // { dg-bogus "poisoned" }
 bool b = 1_ħ; // { dg-bogus "poisoned" }
 const char *x = "hbar"_ħ; // { dg-bogus "poisoned" }
@@ -11,5 +13,5 @@ const char *x = "hbar"_ħ; // { dg-bogus "poisoned" }
 /* Ideally, we should not warn here either, but this is not implemented yet.  This
syntax has been deprecated for C++23.  */
 #pragma GCC poison _ħ2
-const char * operator "" _ħ2 (const char *, unsigned long); // { dg-bogus "poisoned" "" { xfail *-*-*} }
+const char * operator "" _ħ2 (const char *, size_t); // { dg-bogus "poisoned" "" { xfail *-*-*} }
 const char *x2 = "hbar2"_ħ2; // { dg-bogus "poisoned" }


Ping: [PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-07-11 Thread Lewis Hyatt via Gcc-patches
May I please ping this patch again? I think it would be worthwhile to
close this gap in the support for UTF-8 sources. Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html

-Lewis

On Fri, Jun 2, 2023 at 9:45 AM Lewis Hyatt  wrote:
>
> Hello-
>
> Ping please? Thanks.
> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html
>
> -Lewis
>
> On Tue, May 2, 2023 at 9:27 AM Lewis Hyatt  wrote:
> >
> > May I please ping this one? Thanks...
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html
> >
> > On Thu, Mar 2, 2023 at 6:21 PM Lewis Hyatt  wrote:
> > >
> > > The PR complains that we do not handle UTF-8 in the suffix for a user-defined
> > > literal, such as:
> > >
> > > bool operator ""_π (unsigned long long);
> > >
> > > In fact we don't handle any extended identifier characters there, whether
> > > UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
> > > the "" tokens is included, since then the identifier is lexed in the "normal"
> > > way as its own token. But when it is lexed as part of the string token, this
> > > is handled in lex_string() with a one-off loop that is not aware of extended
> > > characters.
> > >
> > > This patch fixes it by adding a new function scan_cur_identifier() that can be
> > > used to lex an identifier while in the middle of lexing another token.
> > >
> > > BTW, the other place that has been mis-lexing identifiers is
> > > lex_identifier_intern(), which is used to implement #pragma push_macro
> > > and #pragma pop_macro. This does not support extended characters either.
> > > I will add that in a subsequent patch, because it can't directly reuse the
> > > new function, but rather needs to lex from a string instead of a cpp_buffer.
> > >
> > > With scan_cur_identifier(), we do also correctly warn about bidi and
> > > normalization issues in the extended identifiers comprising the suffix.
> > >
> > > libcpp/ChangeLog:
> > >
> > > PR preprocessor/103902
> > > * lex.cc (identifier_diagnostics_on_lex): New function refactoring
> > > some common code.
> > > (lex_identifier_intern): Use the new function.
> > > (lex_identifier): Don't run identifier diagnostics here, rather let
> > > the call site do it when needed.
> > > (_cpp_lex_direct): Adjust the call sites of lex_identifier ()
> > > accordingly.
> > > (struct scan_id_result): New struct.
> > > (scan_cur_identifier): New function.
> > > (create_literal2): New function.
> > > (lit_accum::create_literal2): New function.
> > > (is_macro): Folded into new function...
> > > (maybe_ignore_udl_macro_suffix): ...here.
> > > (is_macro_not_literal_suffix): Folded likewise.
> > > (lex_raw_string): Handle UTF-8 in UDL suffix via scan_cur_identifier ().
> > > (lex_string): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR preprocessor/103902
> > > * g++.dg/cpp0x/udlit-extended-id-1.C: New test.
> > > * g++.dg/cpp0x/udlit-extended-id-2.C: New test.
> > > * g++.dg/cpp0x/udlit-extended-id-3.C: New test.
> > > * g++.dg/cpp0x/udlit-extended-id-4.C: New test.
> > > ---
> > >
> > > Notes:
> > > Hello-
> > >
> > > This is the updated version of the patch, incorporating feedback from Jakub
> > > and Jason, most recently discussed here:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html
> > >
> > > Please let me know how it looks? It is simpler than before with the new
> > > approach. Thanks!
> > >
> > > One thing to note. As Jason clarified for me, a usage like this:
> > >
> > >  #pragma GCC poison _x
> > > const char * operator "" _x (const char *, unsigned long);
> > >
> > > The space between the "" and the _x is currently allowed but will be
> > > deprecated in C++23. GCC currently will complain about the poisoned use of
> > > _x in this case, and this patch, which is just focus

Re: 'unsigned int len' field in 'libcpp/include/symtab.h:struct ht_identifier' (was: [PATCH] pch: Fix streaming of strings with embedded null bytes)

2023-07-04 Thread Lewis Hyatt via Gcc-patches
On Tue, Jul 4, 2023 at 11:50 AM Thomas Schwinge  wrote:
>
> Hi!
>
> I came across this one here on my way working through another (somewhat
> related) GTY issue.  I generally do understand the issue here, but do
> have a question about 'unsigned int len' field in
> 'libcpp/include/symtab.h:struct ht_identifier':
>
> On 2022-10-18T18:14:54-0400, Lewis Hyatt via Gcc-patches wrote:
> > When a GTY'ed struct is streamed to PCH, any plain char* pointers it contains
> > (whether they live in GC-controlled memory or not) will be marked for PCH
> > output by the routine gt_pch_note_object in ggc-common.cc. This routine
> > special-cases plain char* strings, and in particular it uses strlen() to get
> > their length.
>
> Oh, wow, this special casing for strings...  8-|
>
> > Thus it does not handle strings with embedded null bytes, but it
> > is possible for something PCH cares about (such as a string literal token in a
> > macro definition) to contain such embedded nulls. To fix that up, add a new
> > GTY option "string_length" so that gt_pch_note_object can be informed the
> > actual length it ought to use, and use it in the relevant libcpp structs
> > (cpp_string and ht_identifier) accordingly.
>
> For your test case:
>
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/pch-string-nulls.C
> > @@ -0,0 +1,3 @@
> > +// { dg-do compile { target c++11 } }
> > +#include "pch-string-nulls.H"
> > +static_assert (X[4] == '[' && X[5] == '!' && X[6] == ']', "error");
>
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/pch-string-nulls.Hs
> > @@ -0,0 +1,2 @@
> > +/* Note that there is a null byte following "ABC".  */
> > +#define X R"(ABC^@[!])"
>
> ..., I understand how the following is necessary:
>
> > --- a/libcpp/include/cpplib.h
> > +++ b/libcpp/include/cpplib.h
> > @@ -179,7 +179,11 @@ enum c_lang {CLK_GNUC89 = 0, CLK_GNUC99, CLK_GNUC11, CLK_GNUC17, CLK_GNUC2X,
> >  /* Payload of a NUMBER, STRING, CHAR or COMMENT token.  */
> >  struct GTY(()) cpp_string {
> >unsigned int len;
> > -  const unsigned char *text;
> > +
> > +  /* TEXT is always null terminated (terminator not included in len); but this
> > + GTY markup arranges that PCH streaming works properly even if there is a
> > + null byte in the middle of the string.  */
> > +  const unsigned char * GTY((string_length ("1 + %h.len"))) text;
> >  };
>
> (That is, the test case FAILs with that one reverted.)
>
> However, this one did confuse me:
>
> > --- a/libcpp/include/symtab.h
> > +++ b/libcpp/include/symtab.h
> > @@ -29,7 +29,10 @@ along with this program; see the file COPYING3.  If not see
> >  typedef struct ht_identifier ht_identifier;
> >  typedef struct ht_identifier *ht_identifier_ptr;
> >  struct GTY(()) ht_identifier {
> > -  const unsigned char *str;
> > +  /* This GTY markup arranges that the null-terminated identifier would still
> > + stream to PCH correctly, if a null byte were to make its way into an
> > + identifier somehow.  */
> > +  const unsigned char * GTY((string_length ("1 + %h.len"))) str;
> >unsigned int len;
> >unsigned int hash_value;
> >  };
>
> I did wonder whether that's an actual or just a theoretical concern, to
> have 'ht_identifier's with embedded NULs?  If an actual concern, can we
> get a test case constructed?  Otherwise, should we revert this hunk,
> given that we have this auto-'strlen' handling, ignorant of embedded
> NULs, in a lot of other places?
>
> But then I realized that possibly we do maintain 'len' here not for
> correctness but as an optimization (trading an 'unsigned int' for
> repeated 'strlen' calls)?  My quick testing with the attached
> "[RFC] Verify no embedded NULs in 'struct ht_identifier'" might confirm
> this theory: no regressions (..., but I didn't bootstrap, and ran only
> parts of the testsuite).  (Not proposing that RFC for 'git push', of
> course.)
>
> If that's indeed the intention here, I shall change/add source code
> commentary to describe this rationale for this dedicated 'len' field
> (plus, that handling of embedded NULs falls out of that, automatically).
>
> For reference, this 'len' field has existed "forever".  Before
> 'struct ht_identifier' was added in
> commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we
> had in 'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or
> earlier 'length', earlier in '

[PATCH] c-family: Implement pragma_lex () for preprocess-only mode

2023-06-30 Thread Lewis Hyatt via Gcc-patches
In order to support processing #pragma in preprocess-only mode (-E or
-save-temps for gcc/g++), we need a way to obtain the #pragma tokens from
libcpp. In full compilation modes, this is accomplished by calling
pragma_lex (), which is a symbol that must be exported by the frontend, and
which is currently implemented for C and C++. Neither of those frontends
initializes its parser machinery in preprocess-only mode, and consequently
pragma_lex () does not work in this case.

Address that by adding a new function c_init_preprocess () for the frontends
to implement, which arranges for pragma_lex () to work in preprocess-only
mode, and adjusting pragma_lex () accordingly.

In preprocess-only mode, the preprocessor is accustomed to controlling the
interaction with libcpp, and it only knows about tokens that it has called
into libcpp itself to obtain. Since it still needs to see the tokens
obtained by pragma_lex () so that they can be streamed to the output, also
add a new libcpp callback, on_token_lex (), that ensures the preprocessor
sees these tokens too.
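
The callback arrangement described above can be sketched roughly as follows.
Only the name on_token_lex comes from this patch; the types sketch_token,
sketch_reader and sketch_callbacks, the functions get_token () and
stream_token (), and the callback signature are assumptions made for the sake
of a self-contained example and do not reflect the real libcpp declarations.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct sketch_token { std::string spelling; };
struct sketch_reader;

// Hypothetical callback table with a single optional hook.
struct sketch_callbacks
{
  void (*on_token_lex) (sketch_reader *, const sketch_token *) = nullptr;
};

// Hypothetical reader: a fixed list of tokens plus the streamed output.
struct sketch_reader
{
  std::vector<sketch_token> pending;  // tokens still to be handed out
  std::size_t next = 0;               // index of the next token
  sketch_callbacks cb;                // hooks installed by the client
  std::string output;                 // text streamed in -E style mode
};

// Hand out the next token, or null at the end.  Whoever asks for the
// token, the installed hook gets to see it first.
const sketch_token *
get_token (sketch_reader *r)
{
  if (r->next == r->pending.size ())
    return nullptr;
  const sketch_token *tok = &r->pending[r->next++];
  if (r->cb.on_token_lex)
    r->cb.on_token_lex (r, tok);
  return tok;
}

// Hook for preprocess-only mode: append each token to the output,
// separated by single spaces.
void
stream_token (sketch_reader *r, const sketch_token *tok)
{
  if (!r->output.empty ())
    r->output += ' ';
  r->output += tok->spelling;
}
```

With the hook installed, a pragma handler that pulls tokens through
get_token () no longer hides them from the code that writes the preprocessed
output.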

Currently, there is one place where we are already supporting #pragma in
preprocess-only mode, namely the handling of `#pragma GCC diagnostic'.  That
was done by directly interfacing with libcpp, rather than making use of
pragma_lex (). Now that pragma_lex () works, that code is no longer
necessary; remove it.

gcc/c-family/ChangeLog:

* c-common.h (c_init_preprocess): Declare new function.
* c-opts.cc (c_common_init): Call it.
* c-pragma.cc (pragma_diagnostic_lex_normal): Rename to...
(pragma_diagnostic_lex): ...this.
(pragma_diagnostic_lex_pp): Remove.
(handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in
all modes.
(c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex ()
usage.
* c-pragma.h (pragma_lex_discard_to_eol): Declare new function.

gcc/c/ChangeLog:

* c-parser.cc (pragma_lex): Support preprocess-only mode.
(pragma_lex_discard_to_eol): New function.
(c_init_preprocess): New function.

gcc/cp/ChangeLog:

* parser.cc (c_init_preprocess): New function.
(maybe_read_tokens_for_pragma_lex): New function.
(pragma_lex): Support preprocess-only mode.
(pragma_lex_discard_to_eol): New function.

libcpp/ChangeLog:

* include/cpplib.h (struct cpp_callbacks): Add new callback
on_token_lex.
* macro.cc (cpp_get_token_1): Support new callback.
---

Notes:
Hello-

In r13-1544, I added support for processing `#pragma GCC diagnostic' in
preprocess-only mode. Because pragma_lex () doesn't work in that mode, in
that patch I called into libcpp directly to obtain the tokens needed to
process the pragma. As part of the review, Jason noted that it would
probably be better to make pragma_lex () usable in preprocess-only mode, and
we decided just to add a comment about that for the time being, and to go
ahead and implement that in the future, if it became necessary to support
other pragmas during preprocessing.

I think now is a good time to proceed with that plan, because I would like
to fix PR87299, which is about another pragma (#pragma GCC target) not
working in preprocess-only mode. This patch makes the necessary changes for
pragma_lex () to work in preprocess-only mode.

I have also added a new callback, on_token_lex (), to libcpp. This is so the
preprocessor can see and stream out all the tokens that pragma_lex () gets
from libcpp, since it won't otherwise see them.  This seemed the simplest
approach to me. Another possibility would be to add a wrapper function in
c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then
also stream the token in preprocess-only mode, and then change all calls
into libcpp in that file to use the wrapper function.  The libcpp callback
seemed cleaner to me FWIW.
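
For illustration only, the callback approach described above can be sketched roughly as follows. This is a toy model, not the code in the patch: Token, Reader, get_token, consume_pragma_args and stream_seen_tokens are hypothetical stand-ins (assumptions), and only the callback name on_token_lex is taken from the description.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a lexed token; not libcpp's cpp_token.
struct Token { std::string spelling; };

// Hypothetical stand-in for the reader; not libcpp's cpp_reader.
struct Reader {
  std::vector<Token> input;   // tokens still to be lexed
  std::size_t next = 0;
  // Analogue of the on_token_lex callback: called for every token handed out.
  std::function<void(const Token &)> on_token_lex;

  // Analogue of a "get next token" entry point; returns nullptr at end.
  const Token *get_token() {
    if (next >= input.size()) return nullptr;
    const Token &tok = input[next++];
    if (on_token_lex) on_token_lex(tok);
    return &tok;
  }
};

// A pragma handler that pulls its own tokens, the way pragma_lex () would,
// without the preprocess-only printer having asked for them.
inline int consume_pragma_args(Reader &r) {
  int n = 0;
  while (r.get_token()) ++n;
  return n;
}

// Install an observer, run the handler, and return what the observer saw.
// This plays the role of the preprocessor streaming tokens to the output.
inline std::string stream_seen_tokens(Reader &r) {
  std::string out;
  r.on_token_lex = [&out](const Token &t) {
    if (!out.empty()) out += ' ';
    out += t.spelling;
  };
  consume_pragma_args(r);
  r.on_token_lex = nullptr;   // do not leave a dangling reference to `out`
  return out;
}
```

The point of the design is that the observer sees every token no matter which caller requested it, so no call site needs to be wrapped individually.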

There are no new tests added here, since it's just a change of
implementation covered by existing tests. Bootstrap + regtest all languages
looks good on x86-64 Linux.

Please let me know what you think? Thanks!

-Lewis

 gcc/c-family/c-common.h  |  3 +++
 gcc/c-family/c-opts.cc   |  1 +
 gcc/c-family/c-pragma.cc | 56 ++--
 gcc/c-family/c-pragma.h  |  2 ++
 gcc/c/c-parser.cc| 34 
 gcc/cp/parser.cc | 50 +++
 libcpp/include/cpplib.h  |  4 +++
 libcpp/macro.cc  |  3 +++
 8 files changed, 105 insertions(+), 48 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b5ef5ff6b2c..78fc5248ba6 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -990,6 +990,9 @@ extern void c_parse_file (void);
 
 extern void c_parse_final_cleanups (void);
 
+/* This initializes for preprocess-only mode.  */
+extern void 

Re: ping: [PATCH] libcpp: Improve location for macro names [PR66290]

2023-06-19 Thread Lewis Hyatt via Gcc-patches
May I please ping this one? FWIW, it's 10 months old now without any feedback.
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607647.html

Most of the changes are just adapting the testsuite to look for the
improved diagnostic location. Otherwise it's a handful of lines in
libcpp and it just changes this:

t.cpp:1: warning: macro "X" is not used [-Wunused-macros]
1 | #define X 1
  |

to this:

t.cpp:1:9: warning: macro "X" is not used [-Wunused-macros]
1 | #define X 1
  | ^

which closes out PR66290. Thank you!

-Lewis

On Thu, Jan 12, 2023 at 6:31 PM Lewis Hyatt  wrote:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607647.html
> May I please ping this one again? It will enable closing out the PR. Thanks!
>
> -Lewis
>
> On Thu, Dec 1, 2022 at 9:22 AM Lewis Hyatt  wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> >
> > May I please ping this one? Thanks!
> > I have also re-attached the rebased patch here.
> >
> > -Lewis
> >
> > On Wed, Oct 12, 2022 at 06:37:50PM -0400, Lewis Hyatt wrote:
> > > Hello-
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > >
> > > Since Jeff was kind enough to ack one of my other preprocessor patches
> > > today, I have become emboldened to ping this one again too :). Would
> > > anyone have some time to take a look at it please? Thanks!
> > >
> > > -Lewis
> > >
> > > On Thu, Sep 15, 2022 at 6:31 PM Lewis Hyatt  wrote:
> > > >
> > > > Hello-
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > > > May I please ping this patch? Thank you.
> > > >
> > > > -Lewis
> > > >
> > > > On Fri, Aug 5, 2022 at 12:14 PM Lewis Hyatt  wrote:
> > > > >
> > > > >
> > > > > > When libcpp reports diagnostics whose locus is a macro name (such as for
> > > > > > -Wunused-macros), it uses the location in the cpp_macro object that was
> > > > > > stored by _cpp_new_macro. This is currently set to pfile->directive_line,
> > > > > > which contains the line number only and no column information. This patch
> > > > > > changes the stored location to the src_loc for the token defining the macro
> > > > > > name, which includes the location and range information.
> > > > >
> > > > > libcpp/ChangeLog:
> > > > >
> > > > > PR c++/66290
> > > > > * macro.cc (_cpp_create_definition): Add location argument.
> > > > > * internal.h (_cpp_create_definition): Adjust prototype.
> > > > > * directives.cc (do_define): Pass new location argument to
> > > > > _cpp_create_definition.
> > > > > > (do_undef): Stop passing inferior location to cpp_warning_with_line;
> > > > > > the default from cpp_warning is better.
> > > > > (cpp_pop_definition): Pass new location argument to
> > > > > _cpp_create_definition.
> > > > > * pch.cc (cpp_read_state): Likewise.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > PR c++/66290
> > > > > * c-c++-common/cpp/macro-ranges.c: New test.
> > > > > > * c-c++-common/cpp/line-2.c: Adapt to check for column information
> > > > > > on macro-related libcpp warnings.
> > > > > * c-c++-common/cpp/line-3.c: Likewise.
> > > > > * c-c++-common/cpp/macro-arg-count-1.c: Likewise.
> > > > > * c-c++-common/cpp/pr58844-1.c: Likewise.
> > > > > * c-c++-common/cpp/pr58844-2.c: Likewise.
> > > > > * c-c++-common/cpp/warning-zero-location.c: Likewise.
> > > > > * c-c++-common/pragma-diag-14.c: Likewise.
> > > > > * c-c++-common/pragma-diag-15.c: Likewise.
> > > > > * g++.dg/modules/macro-2_d.C: Likewise.
> > > > > * g++.dg/modules/macro-4_d.C: Likewise.
> > > > > * g++.dg/modules/macro-4_e.C: Likewise.
> > > > > * g++.dg/spellcheck-macro-ordering.C: Likewise.
> > > > > *

Re: [pushed] diagnostics: ensure that .sarif files are UTF-8 encoded [PR109098]

2023-06-11 Thread Lewis Hyatt via Gcc-patches
On Fri, Mar 24, 2023 at 9:04 PM David Malcolm via Gcc-patches
 wrote:
>
> PR analyzer/109098 notes that the SARIF spec mandates that .sarif
> files are UTF-8 encoded, but -fdiagnostics-format=sarif-file naively
> assumes that the source files are UTF-8 encoded when quoting source
> artefacts in the .sarif output, which can lead to us writing out
> .sarif files with non-UTF-8 bytes in them (which break my reporting
> scripts).
>
> The root cause is that sarif_builder::maybe_make_artifact_content_object
> was using maybe_read_file to load the file content as bytes, and
> assuming they were UTF-8 encoded.
>
> This patch reworks both overloads of this function (one used for the
> whole file, the other for snippets of quoted lines) so that they go
> through input.cc's file cache, which attempts to decode the input files
> according to the input charset, and then encode as UTF-8.  They also
> check that the result actually is UTF-8, for cases where the input
> charset is missing, or incorrectly specified, and omit the quoted
> source for such awkward cases.
>
> Doing so fixes all of the cases I've encountered.
>
> The patch adds a new:
>   { dg-final { verify-sarif-file } }
> directive to all SARIF test cases in the test suite, which verifies
> that the output is UTF-8 encoded, and is valid JSON.  In particular
> it verifies that when we complain about encoding problems, the .sarif
> report we emit is itself correctly encoded.
>
> Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
> Integration testing shows no regressions, and a fix for the case
> seen in haproxy-2.7.1.
> Pushed to trunk as r13-6861-gd495ea2b232f3e.
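
As a rough illustration of the "check that the result actually is UTF-8" step mentioned in the quoted message, here is a minimal stand-alone validator. It is a sketch under the usual UTF-8 well-formedness rules, not the code used by input.cc or the SARIF builder; the name is_valid_utf8 is an assumption made for this example.

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is well-formed UTF-8: no stray continuation
// bytes, no truncated sequences, no overlong forms, no surrogates, and
// nothing above U+10FFFF.  Hypothetical helper, for illustration only.
inline bool is_valid_utf8(const std::string &bytes) {
  // Smallest code point that legitimately needs a sequence of each length.
  static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
  const std::size_t n = bytes.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
    std::size_t len;
    unsigned long cp;
    if (b0 < 0x80) { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else return false;                  // continuation or invalid lead byte
    if (n - i < len) return false;      // sequence runs past the end
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
      if ((b & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min_cp[len]) return false;               // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
    if (cp > 0x10FFFF) return false;                  // outside Unicode
    i += len;
  }
  return true;
}
```

A caller in the spirit of the description would omit the quoted source whenever this check fails, rather than emit invalid bytes.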

Hi David-

Regarding the patch series I had about _Pragma locations (most
recently https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609472.html
and https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html).
That one will need some work now in order to apply on top of these
changes to input.cc. Happy to do that, but I thought I better check in
first to see if you had any feedback please on the new approach to
input.cc that's in the v2 patch? Do you think it's a worthwhile
feature, or you'd rather I just drop it? Thanks!

-Lewis


Ping: [PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-06-02 Thread Lewis Hyatt via Gcc-patches
Hello-

Ping please? Thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html

-Lewis

On Tue, May 2, 2023 at 9:27 AM Lewis Hyatt  wrote:
>
> May I please ping this one? Thanks...
> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html
>
> On Thu, Mar 2, 2023 at 6:21 PM Lewis Hyatt  wrote:
> >
> > The PR complains that we do not handle UTF-8 in the suffix for a user-defined
> > literal, such as:
> >
> > bool operator ""_π (unsigned long long);
> >
> > In fact we don't handle any extended identifier characters there, whether
> > UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
> > the "" tokens is included, since then the identifier is lexed in the "normal"
> > way as its own token. But when it is lexed as part of the string token, this
> > is handled in lex_string() with a one-off loop that is not aware of extended
> > characters.
> >
> > This patch fixes it by adding a new function scan_cur_identifier() that can be
> > used to lex an identifier while in the middle of lexing another token.
> >
> > BTW, the other place that has been mis-lexing identifiers is
> > lex_identifier_intern(), which is used to implement #pragma push_macro
> > and #pragma pop_macro. This does not support extended characters either.
> > I will add that in a subsequent patch, because it can't directly reuse the
> > new function, but rather needs to lex from a string instead of a cpp_buffer.
> >
> > With scan_cur_identifier(), we do also correctly warn about bidi and
> > normalization issues in the extended identifiers comprising the suffix.
> >
> > libcpp/ChangeLog:
> >
> > PR preprocessor/103902
> > * lex.cc (identifier_diagnostics_on_lex): New function refactoring
> > some common code.
> > (lex_identifier_intern): Use the new function.
> > (lex_identifier): Don't run identifier diagnostics here, rather let
> > the call site do it when needed.
> > (_cpp_lex_direct): Adjust the call sites of lex_identifier ()
> > accordingly.
> > (struct scan_id_result): New struct.
> > (scan_cur_identifier): New function.
> > (create_literal2): New function.
> > (lit_accum::create_literal2): New function.
> > (is_macro): Folded into new function...
> > (maybe_ignore_udl_macro_suffix): ...here.
> > (is_macro_not_literal_suffix): Folded likewise.
> > (lex_raw_string): Handle UTF-8 in UDL suffix via scan_cur_identifier ().
> > (lex_string): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR preprocessor/103902
> > * g++.dg/cpp0x/udlit-extended-id-1.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-2.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-3.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-4.C: New test.
> > ---
> >
> > Notes:
> > Hello-
> >
> > This is the updated version of the patch, incorporating feedback from Jakub
> > and Jason, most recently discussed here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html
> >
> > Please let me know how it looks? It is simpler than before with the new
> > approach. Thanks!
> >
> > One thing to note. As Jason clarified for me, a usage like this:
> >
> >  #pragma GCC poison _x
> > const char * operator "" _x (const char *, unsigned long);
> >
> > The space between the "" and the _x is currently allowed but will be
> > deprecated in C++23. GCC currently will complain about the poisoned use of
> > _x in this case, and this patch, which is just focused on handling UTF-8
> > properly, does not change this. But it seems that it would be correct
> > not to apply poison in this case. I can try to follow up with a patch to do
> > so, if it seems worthwhile? Given the syntax is deprecated, maybe it's not
> > worth it...
> >
> > For the time being, this patch does add a testcase for the above and xfails
> > it. For the case where no space is present, which is the part touched by the
> > present patch, existing behavior is preserved correctly and no diagnostics
> > such as poison are issued for the UDL suffix. (Contrary to v1 of this
> > patch.)
> >
> > Thanks! bootstrap + regtested all 

Ping: [PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-05-02 Thread Lewis Hyatt via Gcc-patches
May I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html

On Thu, Mar 2, 2023 at 6:21 PM Lewis Hyatt  wrote:
>
> The PR complains that we do not handle UTF-8 in the suffix for a user-defined
> literal, such as:
>
> bool operator ""_π (unsigned long long);
>
> In fact we don't handle any extended identifier characters there, whether
> UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
> the "" tokens is included, since then the identifier is lexed in the "normal"
> way as its own token. But when it is lexed as part of the string token, this
> is handled in lex_string() with a one-off loop that is not aware of extended
> characters.
>
> This patch fixes it by adding a new function scan_cur_identifier() that can be
> used to lex an identifier while in the middle of lexing another token.
>
> BTW, the other place that has been mis-lexing identifiers is
> lex_identifier_intern(), which is used to implement #pragma push_macro
> and #pragma pop_macro. This does not support extended characters either.
> I will add that in a subsequent patch, because it can't directly reuse the
> new function, but rather needs to lex from a string instead of a cpp_buffer.
>
> With scan_cur_identifier(), we do also correctly warn about bidi and
> normalization issues in the extended identifiers comprising the suffix.
>
> libcpp/ChangeLog:
>
> PR preprocessor/103902
> * lex.cc (identifier_diagnostics_on_lex): New function refactoring
> some common code.
> (lex_identifier_intern): Use the new function.
> (lex_identifier): Don't run identifier diagnostics here, rather let
> the call site do it when needed.
> (_cpp_lex_direct): Adjust the call sites of lex_identifier ()
> accordingly.
> (struct scan_id_result): New struct.
> (scan_cur_identifier): New function.
> (create_literal2): New function.
> (lit_accum::create_literal2): New function.
> (is_macro): Folded into new function...
> (maybe_ignore_udl_macro_suffix): ...here.
> (is_macro_not_literal_suffix): Folded likewise.
> (lex_raw_string): Handle UTF-8 in UDL suffix via scan_cur_identifier ().
> (lex_string): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/103902
> * g++.dg/cpp0x/udlit-extended-id-1.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-2.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-3.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-4.C: New test.
> ---
>
> Notes:
> Hello-
>
> This is the updated version of the patch, incorporating feedback from Jakub
> and Jason, most recently discussed here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html
>
> Please let me know how it looks? It is simpler than before with the new
> approach. Thanks!
>
> One thing to note. As Jason clarified for me, a usage like this:
>
>  #pragma GCC poison _x
> const char * operator "" _x (const char *, unsigned long);
>
> The space between the "" and the _x is currently allowed but will be
> deprecated in C++23. GCC currently will complain about the poisoned use of
> _x in this case, and this patch, which is just focused on handling UTF-8
> properly, does not change this. But it seems that it would be correct
> not to apply poison in this case. I can try to follow up with a patch to do
> so, if it seems worthwhile? Given the syntax is deprecated, maybe it's not
> worth it...
>
> For the time being, this patch does add a testcase for the above and xfails
> it. For the case where no space is present, which is the part touched by the
> present patch, existing behavior is preserved correctly and no diagnostics
> such as poison are issued for the UDL suffix. (Contrary to v1 of this
> patch.)
>
> Thanks! bootstrap + regtested all languages on x86-64 Linux with
> no regressions.
>
> -Lewis
>
>  .../g++.dg/cpp0x/udlit-extended-id-1.C|  68 
>  .../g++.dg/cpp0x/udlit-extended-id-2.C|   6 +
>  .../g++.dg/cpp0x/udlit-extended-id-3.C|  15 +
>  .../g++.dg/cpp0x/udlit-extended-id-4.C|  14 +
>  libcpp/lex.cc | 382 ++
>  5 files changed, 317 insertions(+), 168 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-2.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
>  create

Ping: [PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-04-04 Thread Lewis Hyatt via Gcc-patches
May I please ping this one?
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html

Thanks!

-Lewis

On Thu, Mar 2, 2023 at 6:21 PM Lewis Hyatt  wrote:
>
> The PR complains that we do not handle UTF-8 in the suffix for a user-defined
> literal, such as:
>
> bool operator ""_π (unsigned long long);
>
> In fact we don't handle any extended identifier characters there, whether
> UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
> the "" tokens is included, since then the identifier is lexed in the "normal"
> way as its own token. But when it is lexed as part of the string token, this
> is handled in lex_string() with a one-off loop that is not aware of extended
> characters.
>
> This patch fixes it by adding a new function scan_cur_identifier() that can be
> used to lex an identifier while in the middle of lexing another token.
>
> BTW, the other place that has been mis-lexing identifiers is
> lex_identifier_intern(), which is used to implement #pragma push_macro
> and #pragma pop_macro. This does not support extended characters either.
> I will add that in a subsequent patch, because it can't directly reuse the
> new function, but rather needs to lex from a string instead of a cpp_buffer.
>
> With scan_cur_identifier(), we do also correctly warn about bidi and
> normalization issues in the extended identifiers comprising the suffix.
>
> libcpp/ChangeLog:
>
> PR preprocessor/103902
> * lex.cc (identifier_diagnostics_on_lex): New function refactoring
> some common code.
> (lex_identifier_intern): Use the new function.
> (lex_identifier): Don't run identifier diagnostics here, rather let
> the call site do it when needed.
> (_cpp_lex_direct): Adjust the call sites of lex_identifier ()
> accordingly.
> (struct scan_id_result): New struct.
> (scan_cur_identifier): New function.
> (create_literal2): New function.
> (lit_accum::create_literal2): New function.
> (is_macro): Folded into new function...
> (maybe_ignore_udl_macro_suffix): ...here.
> (is_macro_not_literal_suffix): Folded likewise.
> (lex_raw_string): Handle UTF-8 in UDL suffix via scan_cur_identifier ().
> (lex_string): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/103902
> * g++.dg/cpp0x/udlit-extended-id-1.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-2.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-3.C: New test.
> * g++.dg/cpp0x/udlit-extended-id-4.C: New test.
> ---
>
> Notes:
> Hello-
>
> This is the updated version of the patch, incorporating feedback from Jakub
> and Jason, most recently discussed here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html
>
> Please let me know how it looks? It is simpler than before with the new
> approach. Thanks!
>
> One thing to note. As Jason clarified for me, a usage like this:
>
>  #pragma GCC poison _x
> const char * operator "" _x (const char *, unsigned long);
>
> The space between the "" and the _x is currently allowed but will be
> deprecated in C++23. GCC currently will complain about the poisoned use of
> _x in this case, and this patch, which is just focused on handling UTF-8
> properly, does not change this. But it seems that it would be correct
> not to apply poison in this case. I can try to follow up with a patch to do
> so, if it seems worthwhile? Given the syntax is deprecated, maybe it's not
> worth it...
>
> For the time being, this patch does add a testcase for the above and xfails
> it. For the case where no space is present, which is the part touched by the
> present patch, existing behavior is preserved correctly and no diagnostics
> such as poison are issued for the UDL suffix. (Contrary to v1 of this
> patch.)
>
> Thanks! bootstrap + regtested all languages on x86-64 Linux with
> no regressions.
>
> -Lewis
>
>  .../g++.dg/cpp0x/udlit-extended-id-1.C|  68 
>  .../g++.dg/cpp0x/udlit-extended-id-2.C|   6 +
>  .../g++.dg/cpp0x/udlit-extended-id-3.C|  15 +
>  .../g++.dg/cpp0x/udlit-extended-id-4.C|  14 +
>  libcpp/lex.cc | 382 ++
>  5 files changed, 317 insertions(+), 168 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-2.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
>  

Re: [PATCH] libcpp: Update to Unicode 15

2023-03-09 Thread Lewis Hyatt via Gcc-patches
On Fri, Nov 04, 2022 at 10:03:13AM +0100, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> The following pseudo-patch (for uname2c.h part
> just a pseudo patch with a lot of changes replaced with ...
> because it is too large but the important changes like
> -static const char uname2c_dict[59418] =
> +static const char uname2c_dict[59891] =
> -static const unsigned char uname2c_tree[208765] = {
> +static const unsigned char uname2c_tree[210697] = {
> are shown, full patch xz compressed will be posted separately
> due to mail limit) regenerates the libcpp tables with Unicode 15.0.0
> which added 4489 new characters.
> 
> As mentioned previously, this isn't just a matter of running the
> two libcpp/make*.cc programs on the new Unicode files, but one needs
> to manually update a table inside of makeuname2c.cc according to
> a table in Unicode text (which is partially reflected in the text
> files, but e.g. in Unicode 14.0.0 not 100% accurately, in 15.0.0
> actually accurately).
> I've also added some randomly chosen subset of those 4489 new
> characters to a testcase.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Hi Jakub-

In addition to these files you updated last year for Unicode 15, we also need
to update generated_cpp_wcwidth.h, which implements cpp_wcwidth() for
diagnostics so we can output correct column numbers. There is a procedure
outlined in the file contrib/unicode/README that accomplishes this. Is it OK
to push the attached patch (gzipped since it is large and uninformative),
which is the result of following the procedure? It went straightforwardly as
expected, and bootstrap+regtest on x86-64 Linux is clean. Thanks!

-Lewis
[PATCH] libcpp: Update cpp_wcwidth() to Unicode 15

Updates cpp_wcwidth() to Unicode 15, following the procedure in
contrib/unicode/README mechanically without incident.
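
To show why the width table matters for column numbers, here is a small sketch of a range-table lookup of the kind a wcwidth-style function performs. It is not the generated table or the real cpp_wcwidth (): the three ranges are hand-picked for illustration, and WidthRange, kRanges, toy_wcwidth and display_column are hypothetical names.

```cpp
#include <cstddef>

// One entry of a sorted, non-overlapping width table.
struct WidthRange { unsigned long first, last; int width; };

// Tiny hand-picked table, sorted by code point.  The real data is
// generated from the Unicode files and is far larger.
static const WidthRange kRanges[] = {
  {0x0300, 0x036F, 0},   // combining diacritical marks: zero columns
  {0x4E00, 0x9FFF, 2},   // CJK unified ideographs: two columns
  {0xAC00, 0xD7A3, 2},   // Hangul syllables: two columns
};

// Binary search the table; anything not listed is assumed to be one
// column wide (an assumption of this sketch).
inline int toy_wcwidth(unsigned long cp) {
  std::size_t lo = 0, hi = sizeof kRanges / sizeof kRanges[0];
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (cp < kRanges[mid].first) hi = mid;
    else if (cp > kRanges[mid].last) lo = mid + 1;
    else return kRanges[mid].width;
  }
  return 1;
}

// 1-based display column of the code point at index `idx` in `cps`.
inline int display_column(const unsigned long *cps, std::size_t idx) {
  int col = 1;
  for (std::size_t i = 0; i < idx; ++i) col += toy_wcwidth(cps[i]);
  return col;
}
```

With a stale table, newly added wide characters would be counted as one column each, and a caret in a diagnostic would land to the left of the intended position.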

contrib/ChangeLog:

* unicode/DerivedCoreProperties.txt: Update to Unicode 15.
* unicode/DerivedNormalizationProps.txt: Likewise.
* unicode/EastAsianWidth.txt: Likewise.
* unicode/PropList.txt: Likewise.
* unicode/README: Likewise.
* unicode/UnicodeData.txt: Likewise.

libcpp/ChangeLog:

* generated_cpp_wcwidth.h: Regenerated for Unicode 15.


unicode_15_wcwidth-1.txt.gz
Description: application/gunzip


Ping: [PATCH] libcpp: Fix ICE on directive inside _Pragma() operator [PR67046]

2023-03-07 Thread Lewis Hyatt via Gcc-patches
Hello-

May I please ping this short patch that fixes an old bug? Thanks...

-Lewis

On Sat, Jan 14, 2023 at 1:46 PM Lewis Hyatt  wrote:
>
> get__Pragma_string() in directives.cc is responsible for lexing the parens
> and the string argument from a _Pragma("...") operator. This function does
> not handle the case when the closing paren is not on the same line as the
> string; in that case, libcpp will by default reuse the token buffer it
> previously used for the string, so that the string token returned by
> get__Pragma_string() may be corrupted, as shown in the testcase. Fix using
> the existing keep_tokens mechanism that temporarily disables the reuse of
> token buffers.
>
> libcpp/ChangeLog:
>
> PR preprocessor/67046
> * directives.cc (_cpp_do__Pragma): Increment pfile->keep_tokens to
> ensure the returned string token is valid.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/67046
> * c-c++-common/cpp/pr67046.c: New test.
> ---
>
> Notes:
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67046
>
> This fixes an old ICE in libcpp that can happen when lexing the tokens from a
> _Pragma operator. Bootstrapped+tested on x86-64 Linux with no
> regressions. Please let me know if it's OK? Thanks...
>
> -Lewis
>
>  gcc/testsuite/c-c++-common/cpp/pr67046.c | 10 ++
>  libcpp/directives.cc |  5 +
>  2 files changed, 15 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr67046.c
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr67046.c b/gcc/testsuite/c-c++-common/cpp/pr67046.c
> new file mode 100644
> index 000..f37f20c624e
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr67046.c
> @@ -0,0 +1,10 @@
> +/* { dg-do preprocess } */
> +
> +_Pragma(
> +"message(\"msg\")"
> +)
> +
> +_Pragma(
> +"message(\"msg\")"
> +#
> +)
> diff --git a/libcpp/directives.cc b/libcpp/directives.cc
> index 9dc4363c65a..ffd262bce7d 100644
> --- a/libcpp/directives.cc
> +++ b/libcpp/directives.cc
> @@ -1996,7 +1996,12 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in,
>  int
>  _cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc)
>  {
> +  /* Make sure we don't invalidate the string token, if the closing parenthesis
> +   ended up on a different line.  */
> +  ++pfile->keep_tokens;
>const cpp_token *string = get__Pragma_string (pfile);
> +  --pfile->keep_tokens;
> +
>pfile->directive_result.type = CPP_PADDING;
>
>if (string)
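
The hazard fixed by the quoted patch can be modelled with a toy lexer that recycles its token storage at each new line unless a keep_tokens counter is nonzero. This is only a sketch of the idea: ToyLexer and string_after_paren are hypothetical, and libcpp's real token buffers work differently in detail.

```cpp
#include <deque>
#include <string>

// Toy lexer: at the start of a new line, token storage is reused unless
// keep_tokens is nonzero.  Hypothetical model, not libcpp's structures.
struct ToyLexer {
  int keep_tokens = 0;
  std::deque<std::string> storage;   // deque: push_back keeps references valid

  // Lex one token.  `starts_line` marks the first token on a new line.
  const std::string &lex(const std::string &spelling, bool starts_line) {
    if (starts_line && keep_tokens == 0 && !storage.empty()) {
      storage.front() = spelling;    // reuse slot 0: the old token is clobbered
      storage.resize(1);
      return storage.front();
    }
    storage.push_back(spelling);
    return storage.back();
  }
};

// Mimics lexing "(", then the string on the next line, then ")" on a third
// line, and returns what the string token holds once the ")" has been lexed.
inline std::string string_after_paren(ToyLexer &lx, bool protect) {
  if (protect) ++lx.keep_tokens;     // same idea as ++pfile->keep_tokens
  lx.lex("(", false);
  const std::string &str = lx.lex("\"message\"", true);
  lx.lex(")", true);                 // closing paren on a later line
  if (protect) --lx.keep_tokens;
  return str;                        // copy of whatever the slot holds now
}
```

Without the guard the caller ends up holding the spelling of the closing parenthesis instead of the string; with the guard the string survives until the caller is done with it.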


[PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-03-02 Thread Lewis Hyatt via Gcc-patches
The PR complains that we do not handle UTF-8 in the suffix for a user-defined
literal, such as:

bool operator ""_π (unsigned long long);

In fact we don't handle any extended identifier characters there, whether
UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
the "" tokens is included, since then the identifier is lexed in the "normal"
way as its own token. But when it is lexed as part of the string token, this
is handled in lex_string() with a one-off loop that is not aware of extended
characters.

This patch fixes it by adding a new function scan_cur_identifier() that can be
used to lex an identifier while in the middle of lexing another token.

BTW, the other place that has been mis-lexing identifiers is
lex_identifier_intern(), which is used to implement #pragma push_macro
and #pragma pop_macro. This does not support extended characters either.
I will add that in a subsequent patch, because it can't directly reuse the
new function, but rather needs to lex from a string instead of a cpp_buffer.

With scan_cur_identifier(), we do also correctly warn about bidi and
normalization issues in the extended identifiers comprising the suffix.
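
As a rough sketch of the idea described above (scanning an identifier from the current position while in the middle of another token), consider the following simplified model. It is not scan_cur_identifier (): the real lexer checks decoded code points against the allowed identifier ranges and also handles UCNs, whereas this sketch simply accepts any non-ASCII byte. The names is_id_byte, scan_identifier_at and udl_suffix are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Simplified identifier test: ASCII letters, '_', '$', digits (not first),
// and, as a crude assumption, every byte of a UTF-8 multibyte sequence.
inline bool is_id_byte(unsigned char c, bool first) {
  if (c >= 0x80 || c == '_' || c == '$') return true;
  if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) return true;
  return !first && c >= '0' && c <= '9';
}

// Returns the identifier that starts at `pos` in `buf` (empty if none).
inline std::string scan_identifier_at(const std::string &buf, std::size_t pos) {
  std::size_t end = pos;
  while (end < buf.size()
         && is_id_byte(static_cast<unsigned char>(buf[end]), end == pos))
    ++end;
  return buf.substr(pos, end - pos);
}

// Returns the suffix glued directly onto the first string literal in `buf`.
// No escape handling; illustration only.
inline std::string udl_suffix(const std::string &buf) {
  const std::size_t open = buf.find('"');
  if (open == std::string::npos) return "";
  const std::size_t close = buf.find('"', open + 1);
  if (close == std::string::npos) return "";
  return scan_identifier_at(buf, close + 1);
}
```

A one-off loop that only accepts ASCII identifier characters would stop at the first byte of the encoded character, which is the behaviour the PR complains about.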

libcpp/ChangeLog:

PR preprocessor/103902
* lex.cc (identifier_diagnostics_on_lex): New function refactoring
some common code.
(lex_identifier_intern): Use the new function.
(lex_identifier): Don't run identifier diagnostics here, rather let
the call site do it when needed.
(_cpp_lex_direct): Adjust the call sites of lex_identifier ()
accordingly.
(struct scan_id_result): New struct.
(scan_cur_identifier): New function.
(create_literal2): New function.
(lit_accum::create_literal2): New function.
(is_macro): Folded into new function...
(maybe_ignore_udl_macro_suffix): ...here.
(is_macro_not_literal_suffix): Folded likewise.
(lex_raw_string): Handle UTF-8 in UDL suffix via scan_cur_identifier ().
(lex_string): Likewise.

gcc/testsuite/ChangeLog:

PR preprocessor/103902
* g++.dg/cpp0x/udlit-extended-id-1.C: New test.
* g++.dg/cpp0x/udlit-extended-id-2.C: New test.
* g++.dg/cpp0x/udlit-extended-id-3.C: New test.
* g++.dg/cpp0x/udlit-extended-id-4.C: New test.
---

Notes:
Hello-

This is the updated version of the patch, incorporating feedback from Jakub
and Jason, most recently discussed here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html

Please let me know how it looks? It is simpler than before with the new
approach. Thanks!

One thing to note. As Jason clarified for me, a usage like this:

 #pragma GCC poison _x
const char * operator "" _x (const char *, unsigned long);

The space between the "" and the _x is currently allowed but will be
deprecated in C++23. GCC currently will complain about the poisoned use of
_x in this case, and this patch, which is just focused on handling UTF-8
properly, does not change this. But it seems that it would be correct
not to apply poison in this case. I can try to follow up with a patch to do
so, if it seems worthwhile? Given the syntax is deprecated, maybe it's not
worth it...

For the time being, this patch does add a testcase for the above and xfails
it. For the case where no space is present, which is the part touched by the
present patch, existing behavior is preserved correctly and no diagnostics
such as poison are issued for the UDL suffix. (Contrary to v1 of this
patch.)

Thanks! bootstrap + regtested all languages on x86-64 Linux with
no regressions.

-Lewis

 .../g++.dg/cpp0x/udlit-extended-id-1.C|  68 
 .../g++.dg/cpp0x/udlit-extended-id-2.C|   6 +
 .../g++.dg/cpp0x/udlit-extended-id-3.C|  15 +
 .../g++.dg/cpp0x/udlit-extended-id-4.C|  14 +
 libcpp/lex.cc | 382 ++
 5 files changed, 317 insertions(+), 168 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-4.C

diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
new file mode 100644
index 000..411d4fdd0ba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
@@ -0,0 +1,68 @@
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-Wno-error=normalized" }
+#include 
+using namespace std;
+
+constexpr unsigned long long operator "" _π (unsigned long long x)
+{
+  return 3 * x;
+}
+
+/* Historically we didn't parse properly as part of the "" token, so check that
+   as well.  */
+constexpr unsigned long long operator ""_Π2 

Re: Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-02-15 Thread Lewis Hyatt via Gcc-patches
On Wed, Feb 15, 2023 at 1:39 PM Jason Merrill  wrote:
>
> On 9/26/22 15:27, Lewis Hyatt wrote:
> > On Wed, Jun 15, 2022 at 03:06:16PM -0400, Lewis Hyatt wrote:
> >> On Tue, Jun 14, 2022 at 05:26:49PM -0400, Lewis Hyatt wrote:
> >>> Hello-
> >>>
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902
> >>>
> >>> The attached patch resolves PR preprocessor/103902 as described in the patch
> >>> message inline below. bootstrap + regtest all languages was successful on
> >>> x86-64 Linux, with no new failures:
> >>>
> >>> FAIL 103 103
> >>> PASS 542338 542371
> >>> UNSUPPORTED 15247 15250
> >>> UNTESTED 136 136
> >>> XFAIL 4166 4166
> >>> XPASS 17 17
> >>>
> >>> Please let me know if it looks OK?
> >>>
> >>> A few questions I have:
> >>>
> >>> - A difference introduced with this patch is that after lexing something
> >>> like `operator ""_abc', then `_abc' is added to the identifier hash map,
> >>> whereas previously it was not. I feel like this must be OK because with the
> >>> optional space as in `operator "" _abc', it would be added with or without
> >>> the patch.
> >>>
> >>> - The behavior of `#pragma GCC poison' is not consistent (including prior to
> >>>   my patch). I tried to make it more so but there is still one thing I want to
> >>>   ask about. Leaving aside extended characters for now, the inconsistency is
> >>>   that currently the poison is only checked, when the suffix appears as a
> >>>   standalone token.
> >>>
> >>>   #pragma GCC poison _X
> >>>   bool operator ""_X (unsigned long long);   //accepted before the patch,
> >>>                                              //rejected after it
> >>>   bool operator "" _X (unsigned long long);  //rejected either before or after
> >>>   const char * operator ""_X (const char *, unsigned long); //accepted before,
> >>>                                                             //rejected after
> >>>   const char * operator "" _X (const char *, unsigned long); //rejected either
> >>>
> >>>   const char * s = ""_X; //accepted before the patch, rejected after it
> >>>   const bool b = 1_X; //accepted before or after
> >>>
> >>> I feel like after the patch, the behavior is the expected behavior for all
> >>> cases but the last one. Here, we allow the poisoned identifier because it's
> >>> not lexed as an identifier, it's lexed as part of a pp-number. Does it seem OK
> >>> like this or does it need to be addressed?
> >>
> >> Sorry, that version actually did not handle the case of -Wc++11-compat in
> >> c++98 mode correctly. This updated version fixes that and adds the missing
> >> test coverage for that, if you could please review this one instead?
> >>
> >> By the way, the pipermail archive seems to permanently mangle UTF-8 in inline
> >> attachments. I attached the patch also gzipped to address that for the
> >> archive, since the new testcases do use non-ASCII characters.
> >>
> >> Thanks for taking a look!
> >
> > Hello-
> >
> > May I please ping this patch again? Joseph suggested that it would be best if
> > a C++ maintainer has a look at it. This is one of just a few places left where
> > we don't handle UTF-8 properly in libcpp, it would be really nice to get them
> > fixed up if there is time to review this patch. Thanks!
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596704.html
> >
> > I re-attached it here as it required some trivial rebasing on top of recently
> > pushed changes. As before, I also attached the gzipped version so that the
> > UTF-8 testcases show up OK in the online archive, in case that's still an
> > issue. Thanks for taking a look!
>
> Thank you for the patch, sorry it slipped off my radar.
>

Thanks for taking a look at it. It's certainly an edge case that is
not bothering anyone too much, so no rush with it.

> > This patch fixes it by adding a new function scan_cur_identifier() t

Re: Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-02-10 Thread Lewis Hyatt via Gcc-patches
On Fri, Feb 10, 2023 at 11:30 AM Jakub Jelinek  wrote:
>
> On Mon, Sep 26, 2022 at 06:27:25PM -0400, Lewis Hyatt via Gcc-patches wrote:
> > May I please ping this patch again? Joseph suggested that it would be best if
> > a C++ maintainer has a look at it. This is one of just a few places left where
> > we don't handle UTF-8 properly in libcpp, it would be really nice to get them
> > fixed up if there is time to review this patch. Thanks!
>
> CCing them.
>
> Just some nits from me, but I agree C++ maintainers are the best reviewers
> for this.

Thanks so much for looking it over, I really appreciate it. I'll be
sure to incorporate all your feedback along with those from the full
review.

Is this for stage 1 at this point BTW?

One note, the patch as-is doesn't quite apply to master branch
nowadays, it just needs a small tweak since warn_about_normalization()
has acquired a new argument in the meantime. If it's helpful I can
resend it with this addressed, as well as the rest of your comments?

Finally one comment here:

> > +  if (const auto sr = scan_cur_identifier (pfile))
> > + {
> > +   /* If a string format macro, say from inttypes.h, is placed touching
> > +  a string literal it could be parsed as a C++11 user-defined
> > +  string literal thus breaking the program.  User-defined literals
> > +  outside of namespace std must start with a single underscore, so
> > +  assume anything of that form really is a UDL suffix.  We don't
> > +  need to worry about UDLs defined inside namespace std because
> > +  their names are reserved, so cannot be used as macro names in
> > +  valid programs.  */
> > +   if ((suffix_begin[0] != '_' || suffix_begin[1] == '_')
> > +   && cpp_macro_p (sr.node))
>
> What is the advantage of dropping is_macro_not_literal_suffix and
> hand-inlining it in two different spots?
> Couldn't even the actual warning be moved into an inline function?

The is_macro() function was doing two jobs, first lexing the
identifier and looking it up in the hash table, and then calling
cpp_macro_p(). This was a bit duplicative because the identifier was
then immediately lexed again after the check. Since lexing it became
more complicated with UTF-8 support, I changed it not to duplicate
that effort and instead scan_cur_identifier() does the job once. With
that done, all that's left for is_macro() to do is just the one line
check so I got rid of it. However, I agree that the check about
suffix_begin is not really trivial and so factoring this out into one
place instead of two makes sense. I'll try to move the whole warning
into its own function in the next iteration.
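
For reference, the suffix_begin check could be factored into one place
roughly like this. This is only a standalone sketch, not the code from the
patch: user_udl_suffix_shape_p and treat_as_macro_p are hypothetical names,
and the bool argument stands in for the result of cpp_macro_p() on the
scanned identifier.

```cpp
#include <string_view>

// Hypothetical helper (not in libcpp): true if SUFFIX has the shape that
// a user-declared UDL suffix must have, i.e. exactly one leading underscore.
bool user_udl_suffix_shape_p (std::string_view suffix)
{
  return !suffix.empty ()
         && suffix[0] == '_'
         && (suffix.size () == 1 || suffix[1] != '_');
}

// Hypothetical helper: should an identifier touching a string literal be
// treated as a macro rather than as a UDL suffix?  IS_MACRO stands in for
// the result of looking the identifier up and calling cpp_macro_p().
bool treat_as_macro_p (std::string_view suffix, bool is_macro)
{
  // Anything of the form _x is assumed to really be a UDL suffix, even if
  // a macro of that name exists; everything else defers to the macro.
  return is_macro && !user_udl_suffix_shape_p (suffix);
}
```

With this shape, both call sites would reduce to a single call, and the
warning itself could live next to the predicate.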

> Otherwise it looks reasonable to me, but I'd still prefer Jason or Nathan
> to review this.
>
> Jakub
>

Thanks again.

-Lewis


[PATCH] libcpp: Fix ICE on directive inside _Pragma() operator [PR67046]

2023-01-14 Thread Lewis Hyatt via Gcc-patches
get__Pragma_string() in directives.cc is responsible for lexing the parens
and the string argument from a _Pragma("...") operator. This function does
not handle the case when the closing paren is not on the same line as the
string; in that case, libcpp will by default reuse the token buffer it
previously used for the string, so that the string token returned by
get__Pragma_string() may be corrupted, as shown in the testcase. Fix using
the existing keep_tokens mechanism that temporarily disables the reuse of
token buffers.

libcpp/ChangeLog:

PR preprocessor/67046
* directives.cc (_cpp_do__Pragma): Increment pfile->keep_tokens to
ensure the returned string token is valid.

gcc/testsuite/ChangeLog:

PR preprocessor/67046
* c-c++-common/cpp/pr67046.c: New test.
---

Notes:
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67046

This fixes an old ICE in libcpp that can happen when lexing the tokens from a
_Pragma operator. Bootstrapped+tested on x86-64 Linux with no
regressions. Please let me know if it's OK? Thanks...

-Lewis

 gcc/testsuite/c-c++-common/cpp/pr67046.c | 10 ++
 libcpp/directives.cc |  5 +
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr67046.c

diff --git a/gcc/testsuite/c-c++-common/cpp/pr67046.c b/gcc/testsuite/c-c++-common/cpp/pr67046.c
new file mode 100644
index 000..f37f20c624e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr67046.c
@@ -0,0 +1,10 @@
+/* { dg-do preprocess } */
+
+_Pragma(
+"message(\"msg\")"
+)
+
+_Pragma(
+"message(\"msg\")"
+#
+)
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 9dc4363c65a..ffd262bce7d 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -1996,7 +1996,12 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in,
 int
 _cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc)
 {
+  /* Make sure we don't invalidate the string token, if the closing parenthesis
+   ended up on a different line.  */
+  ++pfile->keep_tokens;
   const cpp_token *string = get__Pragma_string (pfile);
+  --pfile->keep_tokens;
+
   pfile->directive_result.type = CPP_PADDING;
 
   if (string)
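
The effect of the keep_tokens guard can be illustrated with a toy model. The
sketch below is not libcpp code: ToyLexer, ToyToken and read_pragma_string
are made-up names, and the only assumption carried over from the patch
description is that token storage is recycled at the start of a new line
unless a keep_tokens counter is nonzero.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model only; none of these names exist in libcpp.
struct ToyToken
{
  std::string text;
};

class ToyLexer
{
public:
  explicit ToyLexer (std::vector<std::vector<std::string>> lines)
    : m_lines (std::move (lines)) {}

  // Nonzero means: do not recycle token slots when a new line starts.
  int keep_tokens = 0;

  // Return the next token, or nullptr at end of input.
  const ToyToken *next ()
  {
    while (m_line < m_lines.size () && m_col == m_lines[m_line].size ())
      {
        ++m_line;
        m_col = 0;
        if (keep_tokens == 0)
          m_used = 0;   // rewind: earlier tokens may now be overwritten
      }
    if (m_line == m_lines.size ())
      return nullptr;
    if (m_used == m_slots.size ())
      m_slots.emplace_back ();   // deque keeps existing elements in place
    ToyToken &slot = m_slots[m_used++];
    slot.text = m_lines[m_line][m_col++];
    return &slot;
  }

private:
  std::vector<std::vector<std::string>> m_lines;
  std::deque<ToyToken> m_slots;
  std::size_t m_line = 0, m_col = 0, m_used = 0;
};

// Read `_Pragma ( "string" )` and return the string token.  With GUARD
// false, a closing paren on a later line overwrites the string's slot.
const ToyToken *read_pragma_string (ToyLexer &lex, bool guard)
{
  if (guard)
    ++lex.keep_tokens;
  lex.next ();                        // _Pragma
  lex.next ();                        // (
  const ToyToken *str = lex.next ();  // the string
  lex.next ();                        // )
  if (guard)
    --lex.keep_tokens;
  return str;
}
```

In this model, with the paren and the string on separate lines, the unguarded
call returns a token whose text has been replaced by the closing paren, while
the guarded call returns the string intact.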


ping: [PATCH] libcpp: Improve location for macro names [PR66290]

2023-01-12 Thread Lewis Hyatt via Gcc-patches
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607647.html
May I please ping this one again? It will enable closing out the PR. Thanks!

-Lewis

On Thu, Dec 1, 2022 at 9:22 AM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
>
> May I please ping this one? Thanks!
> I have also re-attached the rebased patch here.
>
> -Lewis
>
> On Wed, Oct 12, 2022 at 06:37:50PM -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> >
> > Since Jeff was kind enough to ack one of my other preprocessor patches
> > today, I have become emboldened to ping this one again too :). Would
> > anyone have some time to take a look at it please? Thanks!
> >
> > -Lewis
> >
> > On Thu, Sep 15, 2022 at 6:31 PM Lewis Hyatt  wrote:
> > >
> > > Hello-
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > > May I please ping this patch? Thank you.
> > >
> > > -Lewis
> > >
> > > On Fri, Aug 5, 2022 at 12:14 PM Lewis Hyatt  wrote:
> > > >
> > > >
> > > > > When libcpp reports diagnostics whose locus is a macro name (such as for
> > > > > -Wunused-macros), it uses the location in the cpp_macro object that was
> > > > > stored by _cpp_new_macro. This is currently set to pfile->directive_line,
> > > > > which contains the line number only and no column information. This patch
> > > > > changes the stored location to the src_loc for the token defining the macro
> > > > > name, which includes the location and range information.
> > > >
> > > > libcpp/ChangeLog:
> > > >
> > > > PR c++/66290
> > > > * macro.cc (_cpp_create_definition): Add location argument.
> > > > * internal.h (_cpp_create_definition): Adjust prototype.
> > > > * directives.cc (do_define): Pass new location argument to
> > > > _cpp_create_definition.
> > > > > (do_undef): Stop passing inferior location to cpp_warning_with_line;
> > > > > the default from cpp_warning is better.
> > > > (cpp_pop_definition): Pass new location argument to
> > > > _cpp_create_definition.
> > > > * pch.cc (cpp_read_state): Likewise.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR c++/66290
> > > > * c-c++-common/cpp/macro-ranges.c: New test.
> > > > > * c-c++-common/cpp/line-2.c: Adapt to check for column information
> > > > > on macro-related libcpp warnings.
> > > > * c-c++-common/cpp/line-3.c: Likewise.
> > > > * c-c++-common/cpp/macro-arg-count-1.c: Likewise.
> > > > * c-c++-common/cpp/pr58844-1.c: Likewise.
> > > > * c-c++-common/cpp/pr58844-2.c: Likewise.
> > > > * c-c++-common/cpp/warning-zero-location.c: Likewise.
> > > > * c-c++-common/pragma-diag-14.c: Likewise.
> > > > * c-c++-common/pragma-diag-15.c: Likewise.
> > > > * g++.dg/modules/macro-2_d.C: Likewise.
> > > > * g++.dg/modules/macro-4_d.C: Likewise.
> > > > * g++.dg/modules/macro-4_e.C: Likewise.
> > > > * g++.dg/spellcheck-macro-ordering.C: Likewise.
> > > > * gcc.dg/builtin-redefine.c: Likewise.
> > > > * gcc.dg/cpp/Wunused.c: Likewise.
> > > > * gcc.dg/cpp/redef2.c: Likewise.
> > > > * gcc.dg/cpp/redef3.c: Likewise.
> > > > * gcc.dg/cpp/redef4.c: Likewise.
> > > > * gcc.dg/cpp/ucnid-11-utf8.c: Likewise.
> > > > * gcc.dg/cpp/ucnid-11.c: Likewise.
> > > > * gcc.dg/cpp/undef2.c: Likewise.
> > > > * gcc.dg/cpp/warn-redefined-2.c: Likewise.
> > > > * gcc.dg/cpp/warn-redefined.c: Likewise.
> > > > * gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
> > > > * gcc.dg/cpp/warn-unused-macros.c: Likewise.
> > > > ---
> > > >
> > > > Notes:
> > > > Hello-
> > > >
> > > > > The PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66290) was originally
> > > > > about the entirely wrong location for -Wunused-macros in C++ mode, 
>

[PATCH v2 3/4] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings

2023-01-05 Thread Lewis Hyatt via Gcc-patches
Currently, the tokens obtained from a destringified _Pragma string do not get
assigned proper locations while they are being lexed.  After the tokens have
been obtained, they are reassigned the same location as the _Pragma token,
which is sufficient to make things like _Pragma("GCC diagnostic ignored...")
operate correctly, but this still results in inferior diagnostics, since the
diagnostics do not point to the problematic tokens.  Further, if a diagnostic
is issued by libcpp during the lexing of the tokens, as opposed to being
issued by the frontend during the processing of the pragma, then the
patched-up location is not yet in place, and the user rather sees an invalid
location that is near to the location of the _Pragma string in some cases, or
potentially very far away, depending on the macro expansion history.  For
example:

=
_Pragma("GCC diagnostic ignored \"oops")
=

produces the diagnostic:

file.cpp:1:24: warning: missing terminating " character
1 | _Pragma("GCC diagnostic ignored \"oops")
  |                        ^

with the caret in a nonsensical location, while this one:

=
 #define S "GCC diagnostic ignored \"oops"
_Pragma(S)
=

produces:

file.cpp:2:24: warning: missing terminating " character
2 | _Pragma(S)
  |                        ^

with both the caret in a nonsensical location, and the actual relevant context
completely absent.

Fix this by assigning proper locations using the new LC_GEN type of linemap.
Now the tokens are given locations inside a generated content buffer, and the
macro expansion stack is modified to be aware that these tokens logically
belong to the "expansion" of the _Pragma directive. For the above examples we
now output:

==
In buffer generated from file.cpp:1:
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |                        ^
file.cpp:1:1: note: in <_Pragma directive>
1 | _Pragma("GCC diagnostic ignored \"oops")
  | ^~~
==

and

==
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |                        ^
file.cpp:2:1: note: in <_Pragma directive>
2 | _Pragma(S)
  | ^~~
==

So that carets are pointing to something meaningful and all relevant context
appears in the diagnostic.  For the second example, it would be nice if the
macro expansion also output "in expansion of macro S", however doing that for
a general case of macro expansions makes the logic very complicated, since it
has to be done after the fact when the macro maps have already been
constructed.  It doesn't seem worth it for this case, given that the _Pragma
string has already been output once on the first line.

gcc/ChangeLog:

* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Add awareness
of _Pragma directive to the macro expansion trace.

libcpp/ChangeLog:

* directives.cc (get_token_no_padding): Add argument to receive the
virtual location of the token.
(get__Pragma_string): Likewise.
(do_pragma): Set pfile->directive_result->src_loc properly, it should
not be a virtual location.
(destringize_and_run): Update to provide proper locations for the
_Pragma string tokens.  Support raw strings.
(_cpp_do__Pragma): Adapt to changes to the helper functions.
* errors.cc (cpp_diagnostic_at): Support
cpp_reader::diagnostic_rebase_loc.
(cpp_diagnostic_with_line): Likewise.
* include/line-map.h (class rich_location): Add new member
forget_cached_expanded_locations().
* internal.h (struct _cpp__Pragma_state): Define new struct.
(_cpp_rebase_diagnostic_location): Declare new function.
(struct cpp_reader): Add diagnostic_rebase_loc member.
(_cpp_push__Pragma_token_context): Declare new function.
(_cpp_do__Pragma): Adjust prototype.
* macro.cc (pragma_str): New static var.
(builtin_macro): Adapt to new implementation of _Pragma processing.
(_cpp_pop_context): Fix the logic for resetting
pfile->top_most_macro_node, which previously was never triggered,
although the error seems to have been harmless.
(_cpp_push__Pragma_token_context): New function.
(_cpp_rebase_diagnostic_location): New function.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (token_streamer::stream): Pass the virtual location of
the _Pragma token to maybe_print_line(), not the spelling location.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adjust for new
macro tracking output for _Pragma directives.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/diagnostic-pragma-1.c: Adjust for new macro
tracking output for _Pragma directives.
* c-c++-common/cpp/pr57580.c: Likewise.
* c-c++-common/gomp/pragma-3.c: Likewise.

[PATCH v2 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-01-05 Thread Lewis Hyatt via Gcc-patches
Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, such as, for instance, referring to the content
of builtin macros (not yet implemented, but an easy lift after this one.) The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The actual change needed to the line-maps API in libcpp is not too large and
requires no space overhead in the line map data structures (on 64-bit systems
that is; one newly added data member to class line_map_ordinary sits inside
former padding bytes.) An LC_GEN map is just an ordinary map like any other,
but the TO_FILE member that normally points to the file name points instead to
the actual data.  This works automatically with PCH as well, for the same
reason that the file name makes its way into a PCH.  In order to avoid
confusion, the member has been renamed from TO_FILE to DATA, and associated
accessors adjusted.
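
To make the dual role of the renamed member concrete, here is a minimal
standalone sketch. It is not the line-map.h definition: toy_ordinary_map,
toy_file_name and toy_generated_text are hypothetical names, TOY_GEN merely
stands in for LC_GEN, and returning a null pointer for generated maps is a
simplification chosen for this sketch (the patch asserts instead).

```cpp
#include <cstddef>
#include <string>

// Toy stand-ins; TOY_GEN plays the role of LC_GEN.
enum toy_reason { TOY_ENTER, TOY_GEN };

struct toy_ordinary_map
{
  toy_reason reason;
  const char *data;      // file name, or generated bytes if TOY_GEN
  std::size_t data_len;  // meaningful only for generated data

  bool generated_p () const { return reason == TOY_GEN; }
};

// File name accessor: only valid for maps that really encode a file.
// (Sketch simplification: return nullptr rather than asserting.)
const char *toy_file_name (const toy_ordinary_map &m)
{
  return m.generated_p () ? nullptr : m.data;
}

// Generated-data accessor: the bytes are not NUL-terminated in general,
// so the length must travel with the pointer.
std::string toy_generated_text (const toy_ordinary_map &m)
{
  return m.generated_p () ? std::string (m.data, m.data_len)
                          : std::string ();
}
```

The point of the sketch is only that every reader of the member has to go
through an accessor that knows which of the two meanings applies.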

Outside libcpp, there are many small changes but most of them are to
selftests, which are necessarily more sensitive to implementation
details. From the perspective of the user (the "user", here, being a frontend
using line maps or else the diagnostics infrastructure), the chief visible
change is that the function location_get_source_line() should be passed an
expanded_location object instead of a separate filename and line number.  This
is not a big change because in most cases, this information came anyway from a
call to expand_location and the needed expanded_location object is readily
available. The new overload of location_get_source_line() uses the extra
information in the expanded_location object to obtain the data from the
in-memory buffer when it originated from an LC_GEN map.
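
As a rough sketch of what retrieving a source line from an in-memory buffer
involves (this is not the input.cc implementation; toy_expanded_location and
toy_get_generated_line are hypothetical names, with fields loosely mirroring
the generated_data and generated_data_len members that the series adds to
expanded_location):

```cpp
#include <cstddef>
#include <string_view>

// Toy stand-in for expanded_location; only the fields needed here.
struct toy_expanded_location
{
  const char *file;                // used when generated_data is null
  int line;                        // 1-based line number
  const char *generated_data;      // non-null for an in-memory buffer
  std::size_t generated_data_len;  // length of that buffer in bytes
};

// Return line XLOC.line of the in-memory buffer, without its newline.
// Returns an empty view if XLOC is not generated data or the line does
// not exist.  Reading from a real file is out of scope for this sketch.
std::string_view toy_get_generated_line (const toy_expanded_location &xloc)
{
  if (!xloc.generated_data || xloc.line < 1)
    return {};
  std::string_view buf (xloc.generated_data, xloc.generated_data_len);
  for (int cur = 1; ; ++cur)
    {
      const std::size_t nl = buf.find ('\n');
      if (cur == xloc.line)
        return buf.substr (0, nl);   // npos means "to end of buffer"
      if (nl == std::string_view::npos)
        return {};                   // ran out of lines
      buf.remove_prefix (nl + 1);
    }
}
```

A caller that previously passed a file name and line number separately would
instead pass the whole location object, so that the buffer pointer and length
are available when needed.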

Until the subsequent patch that starts using LC_GEN maps, none are yet
generated within GCC, hence nothing is added to the testsuite here; but all
relevant selftests have been extended to cover generated data maps in addition
to normal files.

libcpp/ChangeLog:

* include/line-map.h (enum lc_reason): Add LC_GEN.
(struct line_map_ordinary): Add new members to support LC_GEN concept.
(ORDINARY_MAP_FILE_NAME): Assert that map really does encode a file
and not generated data.
(ORDINARY_MAP_GENERATED_DATA_P): New function.
(ORDINARY_MAP_GENERATED_DATA): New function.
(ORDINARY_MAP_GENERATED_DATA_LEN): New function.
(ORDINARY_MAP_FILE_NAME_OR_DATA): New function.
(ORDINARY_MAPS_SAME_FILE_P): Declare new function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): Declare new function.
(LINEMAP_FILE): This was always a synonym for ORDINARY_MAP_FILE_NAME;
make this explicit.
(linemap_get_file_highest_location): Adjust prototype.
(linemap_add): Adjust prototype.
(class expanded_location): Add new members to store generated content.
* line-map.cc (ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
(ORDINARY_MAPS_SAME_FILE_P): New function.
(linemap_add): Add new argument DATA_LEN. Support generated data in
LC_GEN maps.
(linemap_check_files_exited): Adapt to API changes supporting LC_GEN.
(linemap_line_start): Likewise.
(linemap_position_for_loc_and_offset): Likewise.
(linemap_get_expansion_filename): Likewise.
(linemap_expand_location): Likewise.
(linemap_dump): Likewise.
(linemap_dump_location): Likewise.
(linemap_get_file_highest_location): Likewise.
* directives.cc (_cpp_do_file_change): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (make_range): Initialize new fields in
expanded_location.
(compatible_locations_p): Use new ORDINARY_MAPS_SAME_FILE_P ()
function.
(layout::calculate_x_offset_display): Use the new expanded_location
overload of location_get_source_line(), so as to support LC_GEN maps.
(layout::print_line): Likewise.
(source_line::source_line): Likewise.
(line_corrections::add_hint): Likewise.
(class line_corrections): Store the location as an exploc rather than
individual filename, so as to support LC_GEN maps.
(layout::print_trailing_fixits): Use the new exploc constructor for
class line_corrections.
(test_layout_x_offset_display_utf8): Test LC_GEN maps as well as normal.
(test_layout_x_offset_display_tab): Likewise.
(test_diagnostic_show_locus_one_liner): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Likewise.
(test_add_location_if_nearby): Likewise.
(test_diagnostic_show_locus_fixit_lines): Likewise.
(test_fixit_consolidation): Likewise.
(test_overlapped_fixit_printing): Likewise.

[PATCH v2 4/4] diagnostics: Support generated data locations in SARIF output

2023-01-05 Thread Lewis Hyatt via Gcc-patches
The diagnostics routines for SARIF output need to read the source code back
in, so that they can generate "snippet" and "content" records, so they need to
be able to cope with generated data locations.  Add support for that in
diagnostic-format-sarif.cc.

gcc/ChangeLog:

* diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New function.
(sarif_builder::maybe_make_physical_location_object): Support
generated data locations.
(sarif_builder::make_artifact_location_object): Likewise.
(sarif_builder::maybe_make_region_object_for_context): Likewise.
(sarif_builder::make_artifact_object): Likewise.
(sarif_builder::maybe_make_artifact_content_object): Likewise.
(get_source_lines): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/diagnostic-format-sarif-file-5.c: New test.
---
 gcc/diagnostic-format-sarif.cc| 102 +++---
 .../diagnostic-format-sarif-file-5.c  |  31 ++
 2 files changed, 93 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index f8fdd586ff0..99aba1414ea 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -125,7 +125,10 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+
+  typedef std::pair filename_or_buffer;
+  json::object *make_artifact_location_object (filename_or_buffer fb);
+
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -146,16 +149,17 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
-   int start_line,
+  json::object *make_artifact_object (filename_or_buffer fb);
+  json::object *
+  maybe_make_artifact_content_object (filename_or_buffer fb) const;
+  json::object *maybe_make_artifact_content_object (expanded_location xloc,
int end_line) const;
   json::object *make_fix_object (const rich_location _loc);
   json::object *make_artifact_change_object (const rich_location );
   json::object *make_replacement_object (const fixit_hint ) const;
   json::object *make_artifact_content_object (const char *text) const;
   int get_sarif_column (expanded_location exploc) const;
+  static filename_or_buffer xloc_to_fb (expanded_location xloc);
 
   diagnostic_context *m_context;
 
@@ -166,7 +170,11 @@ private:
  diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set  m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+ with that length, not a filename.  */
+  hash_set ,
+  int_hash  >
+   > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set  m_rule_id_set;
   json::array *m_rules_arr;
@@ -588,6 +596,15 @@ sarif_builder::make_location_object (const diagnostic_event )
   return location_obj;
 }
 
+/* Populate a filename_or_buffer pair from an expanded location.  */
+sarif_builder::filename_or_buffer
+sarif_builder::xloc_to_fb (expanded_location xloc)
+{
+  if (xloc.generated_data_len)
+return filename_or_buffer (xloc.generated_data, xloc.generated_data_len);
+  return filename_or_buffer (xloc.file, 0);
+}
+
 /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC,
or return NULL;
Add any filename to the m_artifacts.  */
@@ -603,7 +620,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  m_filenames.add (xloc_to_fb (expand_location (loc)));
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -627,7 +644,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (xloc_to_fb (expand_location (loc)));
 }
 
 /* 

[PATCH v2 2/4] diagnostics: Handle generated data locations in edit_context

2023-01-05 Thread Lewis Hyatt via Gcc-patches
Class edit_context handles outputting fixit hints in diff form that could be
manually or automatically applied by the user. This will not make sense for
generated data locations, such as the contents of a _Pragma string, because
the text to be modified does not appear in the user's input files. We do not
currently ever generate fixit hints in such a context, but for future-proofing
purposes, ignore such locations in edit context now.

gcc/ChangeLog:

* edit-context.cc (edit_context::apply_fixit): Ignore locations in
generated data.
---
 gcc/edit-context.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
index 6f5bc6b9d8f..ae11b6f2e00 100644
--- a/gcc/edit-context.cc
+++ b/gcc/edit-context.cc
@@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint *hint)
 return false;
   if (start.column == 0)
 return false;
+  if (start.generated_data)
+return false;
   if (next_loc.column == 0)
 return false;
+  if (next_loc.generated_data)
+return false;
 
   edited_file  = get_or_insert_file (start.file);
   if (!m_valid)


[PATCH v2 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens

2023-01-05 Thread Lewis Hyatt via Gcc-patches
Hello-

This series contains the four remaining patches in the series originally
sent here:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605029.html

which implements improved locations for tokens lexed from a string inside a
_Pragma directive.

v2 1/4: diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

This was formerly v1 4/6. It has been rewritten in line with that review,
most recently discussed here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606616.html

v2 2/4: diagnostics: Handle generated data locations in edit_context

This was formerly v1 5a/6. It has been approved already conditional on
v2 1/4 as a prerequisite.

v2 3/4: diagnostics: libcpp: Assign real locations to the tokens inside
_Pragma strings

This was formerly v1 6/6 and is unchanged from that one. It has not been
reviewed yet.

v2 4/4: diagnostics: Support generated data locations in SARIF output

This was formerly v1 5c/6. It has not been fully reviewed yet.

Thanks for taking a look!

-Lewis


Re: [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-01-05 Thread Lewis Hyatt via Gcc-patches
On Thu, Nov 17, 2022 at 4:21 PM Lewis Hyatt  wrote:
>
> On Sat, Nov 05, 2022 at 12:23:28PM -0400, David Malcolm wrote:
> > On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> > [...snip...]
> > >
> > > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > > index 5890c18bdc3..2935d7fb236 100644
> > > --- a/gcc/c-family/c-common.cc
> > > +++ b/gcc/c-family/c-common.cc
> > > @@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
> > >const line_map_ordinary *ord_map
> > > = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
> > >
> > > +  if (ord_map->reason == LC_GEN)
> > > +   continue;
> > > +
> > >if (const line_map_ordinary *from
> > >   = linemap_included_from_linemap (line_table, ord_map))
> > > /* We cannot use pointer equality, because with preprocessed
> > >input all filename strings are unique.  */
> > > -   if (0 == strcmp (from->to_file, file))
> > > +   if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
> > >   {
> > > last_include_ord_map = from;
> > > last_ord_map_after_include = NULL;
> >
> > [...snip...]
> >
> > I'm not a fan of having the "to_file" field change meaning based on
> > whether reason is LC_GEN.
> >
> > How involved would it be to split line_map_ordinary into two
> > subclasses, so that we'd have this hierarchy (with indentation showing
> > inheritance):
> >
> > line_map
> >   line_map_ordinary
> > line_map_ordinary_file
> > line_map_ordinary_generated
> >   line_map_macro
> >
> > Alternatively, how about renaming "to_file" to be "data" (or "m_data"),
> > to emphasize that it might not be a filename, and that we have to check
> > everywhere we access that field.
> >
> > Please can all those checks for LC_GEN go into an inline function so we
> > can write e.g.
> >   map->generated_p ()
> > or somesuch.
> >
> > If I'm reading things right, patch 6 adds the sole usage of this in
> > destringize_and_run.  Would we ever want to discriminate between
> > different kinds of generated buffers?
> >
> > [...snip...]
> >
> > > @@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
> > >  N_("of module"),
> > >  N_("In module imported at"),   /* 6 */
> > >  N_("imported at"),
> > > +N_("In buffer generated from"),   /* 8 */
> > > };
> >
> > We use the wording "destringized" in:
> >
> > so maybe this should be "In buffer destringized from" ???  (I'm not
> > sure)
> >
> > [...snip...]
> >
> > > diff --git a/gcc/input.cc b/gcc/input.cc
> > > index 483cb6e940d..3cf5480551d 100644
> > > --- a/gcc/input.cc
> > > +++ b/gcc/input.cc
> >
> > [...snip...]
> >
> > > @@ -58,7 +64,7 @@ public:
> > >~file_cache_slot ();
> >
> > My initial thought reading the input.cc part of this patch was that I
> > want it to be very clear when a file_cache_slot is for a real file vs
> > when we're replaying generated data.  I'd hoped that this could have
> > been expressed via inheritance, but we preallocate all the cache slots
> > once in an array in file_cache's ctor and the slots get reused over
> > time.  So instead of that, can we please have some kind of:
> >
> >bool file_slot_p () const;
> >bool generated_slot_p () const;
> >
> > or somesuch, so that we can have clear assertions and conditionals
> > about the current state of a slot (I think the discriminating condition
> > is that generated_data_len > 0, right?)
> >
> > If I'm reading things right, it looks like file_cache_slot::m_file_path
> > does double duty after this patch, and is either a filename, or a
> > pointer to the generated data.  If so, please can the patch rename it,
> > and have all usage guarded appropriately.  Can it be a union? (or does
> > the ctor prevent that?)
> >
> > [...snip...]
> >
> > > @@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned 
> > > *highest_use_count)
> > > num_file_slots files are cached.  */
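
The review comments quoted above ask for explicit predicates such as file_slot_p () and generated_slot_p () so that every use of a cache slot can be guarded by its current state. The following is only a minimal sketch of that idea, not the GCC implementation: cache_slot_sketch, use_file, use_generated, and the two guarded accessors are hypothetical names, and the discriminating condition (a nonzero generated-data length) is the one the reviewer tentatively suggests.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for file_cache_slot; not the GCC class.
class cache_slot_sketch
{
public:
  // Point the slot at an on-disk file.
  void use_file (const char *path)
  {
    m_file_path = path;
    m_generated_data = nullptr;
    m_generated_data_len = 0;
  }

  // Point the slot at in-memory generated data; LEN is assumed nonzero.
  void use_generated (const char *data, std::size_t len)
  {
    m_file_path = nullptr;
    m_generated_data = data;
    m_generated_data_len = len;
  }

  // State predicates in the spirit of the review comment.
  bool file_slot_p () const { return m_file_path != nullptr; }
  bool generated_slot_p () const { return m_generated_data_len > 0; }

  // Guarded accessors: each returns nullptr when the slot is in the other state.
  const char *file_path () const
  {
    return file_slot_p () ? m_file_path : nullptr;
  }
  const char *generated_data () const
  {
    return generated_slot_p () ? m_generated_data : nullptr;
  }

private:
  const char *m_file_path = nullptr;
  const char *m_generated_data = nullptr;
  std::size_t m_generated_data_len = 0;
};
```

Keeping two separate members here sidesteps the "double duty" concern for a single field; whether a union would work in the real class is left open in the review.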

[PATCH] preprocessor: Don't register pragmas in directives-only mode [PR108244]

2022-12-30 Thread Lewis Hyatt via Gcc-patches
libcpp's directives-only mode does not expect deferred pragmas to be
registered, but to date the c-family registration process has not checked for
this case. That issue became more visible since r13-1544, which added the
commonly used GCC diagnostic pragmas to the set of those registered in
preprocessing modes. Fix it by checking for directives-only mode in
c-family/c-pragma.cc.
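
As a rough illustration of the check described above, the early-return condition that the patch adds to c_register_pragma_1 can be modelled as a pure function. This is a sketch under stated assumptions: registers_when_preprocessing is a hypothetical name, its three parameters stand in for cpp_get_options (parse_in)->directives_only, allow_expansion, and the presence of an early handler, and it covers only the flag_preprocess_only branch.

```cpp
// Hypothetical model of the decision made inside the flag_preprocess_only
// branch of c_register_pragma_1 after the patch.  Returns true when the
// pragma would go on to be registered for preprocessing.
bool registers_when_preprocessing (bool directives_only,
                                   bool allow_expansion,
                                   bool has_early_handler)
{
  // New in the patch: directives-only mode never registers deferred pragmas.
  if (directives_only)
    return false;

  // Unchanged condition: register only if the pragma allows expansion or
  // has an early handler.
  return allow_expansion || has_early_handler;
}
```

With this model, a pragma such as "GCC diagnostic push" (which has an early handler) is registered under plain -E but skipped under -fdirectives-only, which is the situation the new tests exercise.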

gcc/c-family/ChangeLog:

PR preprocessor/108244
* c-pragma.cc (c_register_pragma_1): Don't attempt to register any
deferred pragmas if -fdirectives-only.
(init_pragma): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/pr108244-1.c: New test.
* c-c++-common/cpp/pr108244-2.c: New test.
* c-c++-common/cpp/pr108244-3.c: New test.
---

Notes:
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108244

The PR notes a regression in GCC 13 which is fixed by the attached
patch. Bootstrap+regtest all languages on x86-64 Linux looks good. Please let
me know if it is OK? Thanks.

-Lewis

 gcc/c-family/c-pragma.cc| 54 -
 gcc/testsuite/c-c++-common/cpp/pr108244-1.c |  5 ++
 gcc/testsuite/c-c++-common/cpp/pr108244-2.c |  5 ++
 gcc/testsuite/c-c++-common/cpp/pr108244-3.c |  6 +++
 4 files changed, 46 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr108244-1.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr108244-2.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pr108244-3.c

diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc
index 142a46441ac..91fabf0a513 100644
--- a/gcc/c-family/c-pragma.cc
+++ b/gcc/c-family/c-pragma.cc
@@ -1647,7 +1647,8 @@ c_register_pragma_1 (const char *space, const char *name,
 
   if (flag_preprocess_only)
 {
-  if (!(allow_expansion || ihandler.early_handler.handler_1arg))
+  if (cpp_get_options (parse_in)->directives_only
+ || !(allow_expansion || ihandler.early_handler.handler_1arg))
return;
 
   pragma_pp_data pp_data;
@@ -1811,34 +1812,39 @@ c_pp_invoke_early_pragma_handler (unsigned int id)
 void
 init_pragma (void)
 {
-  if (flag_openacc)
+
+  if (!cpp_get_options (parse_in)->directives_only)
 {
-  const int n_oacc_pragmas = ARRAY_SIZE (oacc_pragmas);
-  int i;
+  if (flag_openacc)
+   {
+ const int n_oacc_pragmas = ARRAY_SIZE (oacc_pragmas);
+ int i;
 
-  for (i = 0; i < n_oacc_pragmas; ++i)
-   cpp_register_deferred_pragma (parse_in, "acc", oacc_pragmas[i].name,
- oacc_pragmas[i].id, true, true);
-}
+ for (i = 0; i < n_oacc_pragmas; ++i)
+   cpp_register_deferred_pragma (parse_in, "acc", oacc_pragmas[i].name,
+ oacc_pragmas[i].id, true, true);
+   }
 
-  if (flag_openmp)
-{
-  const int n_omp_pragmas = ARRAY_SIZE (omp_pragmas);
-  int i;
+  if (flag_openmp)
+   {
+ const int n_omp_pragmas = ARRAY_SIZE (omp_pragmas);
+ int i;
 
-  for (i = 0; i < n_omp_pragmas; ++i)
-   cpp_register_deferred_pragma (parse_in, "omp", omp_pragmas[i].name,
- omp_pragmas[i].id, true, true);
-}
-  if (flag_openmp || flag_openmp_simd)
-{
-  const int n_omp_pragmas_simd = sizeof (omp_pragmas_simd)
-/ sizeof (*omp_pragmas);
-  int i;
+ for (i = 0; i < n_omp_pragmas; ++i)
+   cpp_register_deferred_pragma (parse_in, "omp", omp_pragmas[i].name,
+ omp_pragmas[i].id, true, true);
+   }
+  if (flag_openmp || flag_openmp_simd)
+   {
+ const int n_omp_pragmas_simd
+   = sizeof (omp_pragmas_simd) / sizeof (*omp_pragmas);
+ int i;
 
-  for (i = 0; i < n_omp_pragmas_simd; ++i)
-   cpp_register_deferred_pragma (parse_in, "omp", omp_pragmas_simd[i].name,
- omp_pragmas_simd[i].id, true, true);
+ for (i = 0; i < n_omp_pragmas_simd; ++i)
+   cpp_register_deferred_pragma (parse_in, "omp",
+ omp_pragmas_simd[i].name,
+ omp_pragmas_simd[i].id, true, true);
+   }
 }
 
   if (!flag_preprocess_only)
diff --git a/gcc/testsuite/c-c++-common/cpp/pr108244-1.c b/gcc/testsuite/c-c++-common/cpp/pr108244-1.c
new file mode 100644
index 000..1678004a4d9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr108244-1.c
@@ -0,0 +1,5 @@
+/* { dg-do preprocess } */
+/* { dg-additional-options "-fdirectives-only" } */
+#pragma GCC diagnostic push
+#ifdef t
+#endif
diff --git a/gcc/testsuite/c-c++-common/cpp/pr108244-2.c b/gcc/testsuite/c-c++-common/cpp/pr108244-2.c
new file mode 100644
index 000..017682ad186
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pr108244-2.c
@@ -0,0 +1,5 @@
+/* { dg-do preprocess } */
+/* { 

Ping^3: [PATCH] libcpp: Improve location for macro names [PR66290]

2022-12-01 Thread Lewis Hyatt via Gcc-patches
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html

May I please ping this one? Thanks!
I have also re-attached the rebased patch here.

-Lewis
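
For readers skimming the ping, the change in the quoted patch description below is the difference between a location that knows only the directive's line and one that covers the macro name token. The following is a minimal sketch of that difference, not libcpp code: loc_sketch, directive_line_loc, and macro_name_loc are hypothetical names, and the name lookup is a naive substring search used purely for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified location record.  libcpp's location_t is an
// opaque integer resolved through line maps, not a struct like this.
struct loc_sketch
{
  unsigned line;
  unsigned first_column;  // 1-based; 0 means "no column information"
  unsigned last_column;   // inclusive; 0 means "no column information"
};

// Models the old behavior: only the line of the directive is known.
loc_sketch directive_line_loc (unsigned line)
{
  return loc_sketch{line, 0u, 0u};
}

// Models the new behavior: the location covers the macro name token.
// Falls back to the line-only location if the name cannot be found.
loc_sketch macro_name_loc (unsigned line, const std::string &directive_text,
                           const std::string &name)
{
  const std::size_t pos = directive_text.find (name);
  if (name.empty () || pos == std::string::npos)
    return directive_line_loc (line);

  const unsigned first = static_cast<unsigned> (pos) + 1u;
  const unsigned last = first + static_cast<unsigned> (name.size ()) - 1u;
  return loc_sketch{line, first, last};
}
```

For the directive "#define FOO 1", the first function yields a line-only location, while the second yields columns 9 through 11, which is the range a diagnostic such as -Wunused-macros could then underline.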

On Wed, Oct 12, 2022 at 06:37:50PM -0400, Lewis Hyatt wrote:
> Hello-
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> 
> Since Jeff was kind enough to ack one of my other preprocessor patches
> today, I have become emboldened to ping this one again too :). Would
> anyone have some time to take a look at it please? Thanks!
> 
> -Lewis
> 
> On Thu, Sep 15, 2022 at 6:31 PM Lewis Hyatt wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > May I please ping this patch? Thank you.
> >
> > -Lewis
> >
> > On Fri, Aug 5, 2022 at 12:14 PM Lewis Hyatt wrote:
> > >
> > >
> > > When libcpp reports diagnostics whose locus is a macro name (such as for
> > > -Wunused-macros), it uses the location in the cpp_macro object that was
> > > stored by _cpp_new_macro. This is currently set to pfile->directive_line,
> > > which contains the line number only and no column information. This patch
> > > > changes the stored location to the src_loc for the token defining the macro
> > > > name, which includes the location and range information.
> > >
> > > libcpp/ChangeLog:
> > >
> > > PR c++/66290
> > > * macro.cc (_cpp_create_definition): Add location argument.
> > > * internal.h (_cpp_create_definition): Adjust prototype.
> > > * directives.cc (do_define): Pass new location argument to
> > > _cpp_create_definition.
> > > > (do_undef): Stop passing inferior location to cpp_warning_with_line;
> > > > the default from cpp_warning is better.
> > > (cpp_pop_definition): Pass new location argument to
> > > _cpp_create_definition.
> > > * pch.cc (cpp_read_state): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR c++/66290
> > > * c-c++-common/cpp/macro-ranges.c: New test.
> > > * c-c++-common/cpp/line-2.c: Adapt to check for column information
> > > on macro-related libcpp warnings.
> > > * c-c++-common/cpp/line-3.c: Likewise.
> > > * c-c++-common/cpp/macro-arg-count-1.c: Likewise.
> > > * c-c++-common/cpp/pr58844-1.c: Likewise.
> > > * c-c++-common/cpp/pr58844-2.c: Likewise.
> > > * c-c++-common/cpp/warning-zero-location.c: Likewise.
> > > * c-c++-common/pragma-diag-14.c: Likewise.
> > > * c-c++-common/pragma-diag-15.c: Likewise.
> > > * g++.dg/modules/macro-2_d.C: Likewise.
> > > * g++.dg/modules/macro-4_d.C: Likewise.
> > > * g++.dg/modules/macro-4_e.C: Likewise.
> > > * g++.dg/spellcheck-macro-ordering.C: Likewise.
> > > * gcc.dg/builtin-redefine.c: Likewise.
> > > * gcc.dg/cpp/Wunused.c: Likewise.
> > > * gcc.dg/cpp/redef2.c: Likewise.
> > > * gcc.dg/cpp/redef3.c: Likewise.
> > > * gcc.dg/cpp/redef4.c: Likewise.
> > > * gcc.dg/cpp/ucnid-11-utf8.c: Likewise.
> > > * gcc.dg/cpp/ucnid-11.c: Likewise.
> > > * gcc.dg/cpp/undef2.c: Likewise.
> > > * gcc.dg/cpp/warn-redefined-2.c: Likewise.
> > > * gcc.dg/cpp/warn-redefined.c: Likewise.
> > > * gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
> > > * gcc.dg/cpp/warn-unused-macros.c: Likewise.
> > > ---
> > >
> > > Notes:
> > > Hello-
> > >
> > > > The PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66290) was originally
> > > > about the entirely wrong location for -Wunused-macros in C++ mode, which
> > > > behavior was fixed by r13-1903, but before closing it out I wanted to also
> > > > address a second point brought up in the PR comments, namely that we do not
> > > > include column information when emitting diagnostics for macro names, such as
> > > > is done for -Wunused-macros. The attached patch updates the location stored in
> > > > the cpp_macro object so that it includes the column and range information for
> > > > the token comprising the macro name; previously, the location was just the
> > >  
