Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-23 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 03:39:40PM -0400, David Malcolm wrote:
> On Tue, 2023-08-15 at 13:58 -0400, Lewis Hyatt wrote:
> > On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote:
> > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > > Class file_cache_slot in input.cc is used to query specific lines
> > > > of source
> > > > code from a file when needed by diagnostics infrastructure. This
> > > > will be
> > > > extended in a subsequent patch to support obtaining the source
> > > > code from
> > > > in-memory generated buffers rather than from a file. The present
> > > > patch
> > > > refactors class file_cache_slot, putting most of the logic into a
> > > > new base
> > > > class cache_data_source, in preparation for reusing that code in
> > > > the next
> > > > patch. There is no change in functionality yet.
> > > > 
> 
> [...snip...]
> 
> > > 
> > > I confess I had to reread both this and patch 4/8 to make sense of
> > > this; this is probably one of those cases where it's harder to read
> > > in
> > > patch form than as source, but I think I now understand the new
> > > implementation.
> > 
> > Yes, sorry about that. I hope at least splitting into two patches
> > here made it
> > a little easier.
> > 
> > > 
> > > Did you try testing this with valgrind (e.g. "make selftest-
> > > valgrind")?
> > > 
> > 
> > Oh interesting, was not aware of this. I think it shows that new
> > leaks were
> > not introduced with the patch series.
> > 
> 
> [...snip...]
> 
> > 
> > 
> > > I don't think we have any selftest coverage for "\r" in the line-
> > > break
> > > handling; that would be good to add.
> > > 
> > > This patch is OK for trunk once the rest of the kit is approved.
> > 
> > Thank you. To be clear, were you suggesting to add selftest coverage
> > for \r
> > endings now, or in a follow up?
> 
> The former, please, so that we can sure that the patch doesn't
> introduce any buffer overreads etc.
> 
> Thanks
> Dave
>

The following (incremental to patch 5/8 or after) adds selftest coverage for
alternate line endings. I hope things aren't too unclear this way; I can
resend updated versions of some or all of the patches from scratch, if useful.

AFAIK this is the current status of things:

Patch 1/8: Reviewed, updated version incorporating feedback has not been acked
yet, at: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627250.html

Patch 2/8: OKed, pending tweak to reject fixit hints in generated data, which
was sent incrementally here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627405.html

Patch 3/8: OKed, pending new selftest attached to this email.

Patch 4/8: OKed, pending tweak to assert on non-NULL buffers which was sent
incrementally here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628283.html

Patch 5/8: OKed

Patch 6/8: OKed

Patch 7/8: Not reviewed yet

Patch 8/8: Waiting additional feedback from you, perhaps SARIF need not worry
about this for now and should just ignore generated data locations.

Thanks again for taking the time to go through this, I hope it will prove
worth it.

-Lewis

-- >8 --

gcc/ChangeLog:

* input.cc (test_reading_source_line): Test additional cases,
including generated data and alternate line endings.
(input_cc_tests): Adapt to test_reading_source_line() changes.

diff --git a/gcc/input.cc b/gcc/input.cc
index 4c99df7a205..72274732c6c 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -2392,30 +2392,51 @@ test_make_location_nonpure_range_endpoints (const 
line_table_case _)
 /* Verify reading of input files (e.g. for caret-based diagnostics).  */
 
 static void
-test_reading_source_line ()
+test_reading_source_line (bool generated, const char *e1, const char *e2)
 {
   /* Create a tempfile and write some text to it.  */
+  const char *line1 = "01234567890123456789";
+  const char *line2 = "This is the test text";
+  const char *line3 = "This is the 3rd line";
+  char content[72];
+  const int content_len = snprintf (content, sizeof (content),
+   "%s%s%s%s%s",
+   line1, e1, line2, e2, line3);
+  ASSERT_LT (content_len, (int)sizeof (content));
   temp_source_file tmp (SELFTEST_LOCATION, ".txt",
-   "01234567890123456789\n"
-   "This is the test text\n"
-   "This is the 3rd line");
+   content, content_len, generated);
 
-  /* Read back a specific line from the tempfile.  */
-  char_span source_line = location_get_source_line (tmp.get_filename (), 3);
+  /* Read back some specific lines from the tempfile, not all in order.  */
+  const source_id src = generated
+? source_id (tmp.content_buf, tmp.content_len)
+: source_id (tmp.get_filename ());
+
+  char_span source_line = location_get_source_line (src, 1);
+  ASSERT_TRUE (source_line);
+  ASSERT_TRUE (source_line.get_buffer () != NULL);
+  /* N.B. If the line terminator is \r\n, the returned char_span will 

Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-15 Thread David Malcolm via Gcc-patches
On Tue, 2023-08-15 at 13:58 -0400, Lewis Hyatt wrote:
> On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote:
> > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > Class file_cache_slot in input.cc is used to query specific lines
> > > of source
> > > code from a file when needed by diagnostics infrastructure. This
> > > will be
> > > extended in a subsequent patch to support obtaining the source
> > > code from
> > > in-memory generated buffers rather than from a file. The present
> > > patch
> > > refactors class file_cache_slot, putting most of the logic into a
> > > new base
> > > class cache_data_source, in preparation for reusing that code in
> > > the next
> > > patch. There is no change in functionality yet.
> > > 

[...snip...]

> > 
> > I confess I had to reread both this and patch 4/8 to make sense of
> > this; this is probably one of those cases where it's harder to read
> > in
> > patch form than as source, but I think I now understand the new
> > implementation.
> 
> Yes, sorry about that. I hope at least splitting into two patches
> here made it
> a little easier.
> 
> > 
> > Did you try testing this with valgrind (e.g. "make selftest-
> > valgrind")?
> > 
> 
> Oh interesting, was not aware of this. I think it shows that new
> leaks were
> not introduced with the patch series.
> 

[...snip...]

> 
> 
> > I don't think we have any selftest coverage for "\r" in the line-
> > break
> > handling; that would be good to add.
> > 
> > This patch is OK for trunk once the rest of the kit is approved.
> 
> Thank you. To be clear, were you suggesting to add selftest coverage
> for \r
> endings now, or in a follow up?

The former, please, so that we can sure that the patch doesn't
introduce any buffer overreads etc.

Thanks
Dave



Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-15 Thread Lewis Hyatt via Gcc-patches
On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote:
> On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > Class file_cache_slot in input.cc is used to query specific lines of source
> > code from a file when needed by diagnostics infrastructure. This will be
> > extended in a subsequent patch to support obtaining the source code from
> > in-memory generated buffers rather than from a file. The present patch
> > refactors class file_cache_slot, putting most of the logic into a new base
> > class cache_data_source, in preparation for reusing that code in the next
> > patch. There is no change in functionality yet.
> > 
> > gcc/ChangeLog:
> > 
> > * input.cc (class file_cache_slot): Refactor functionality into a
> > new base class...
> > (class cache_data_source): ...here.
> > (file_cache::forcibly_evict_file): Adapt for refactoring.
> > (file_cache_slot::evict): Renamed to...
> > (file_cache_slot::reset): ...this, and partially refactored into
> > base class...
> > (cache_data_source::reset): ...here.
> > (file_cache_slot::get_full_file_content): Moved into base class...
> > (cache_data_source::get_full_file_content): ...here.
> > (file_cache_slot::create): Adapt for refactoring.
> > (file_cache_slot::file_cache_slot): Refactor partially into...
> > (cache_data_source::cache_data_source): ...here.
> > (file_cache_slot::~file_cache_slot): Refactor partially into...
> > (cache_data_source::~cache_data_source): ...here.
> > (file_cache_slot::needs_read_p): Remove.
> > (file_cache_slot::needs_grow_p): Remove.
> > (file_cache_slot::maybe_grow): Adapt for refactoring.
> > (file_cache_slot::read_data): Refactored, along with...
> > (file_cache_slot::maybe_read_data): this, into...
> > (file_cache_slot::get_more_data): ...here.
> > (find_end_of_line): Change interface to take a pair of pointers,
> > rather than a pointer + length.
> > (file_cache_slot::get_next_line): Refactored into...
> > (cache_data_source::get_next_line): ...here.
> > (file_cache_slot::goto_next_line): Refactored into...
> > (cache_data_source::goto_next_line): ...here.
> > (file_cache_slot::read_line_num): Refactored into...
> > (cache_data_source::read_line_num): ...here.
> > (location_get_source_line): Fix const-correctness as necessitated by
> > new interface.
> > ---
> >  gcc/input.cc | 513 +++
> >  1 file changed, 235 insertions(+), 278 deletions(-)
> > 
> 
> I confess I had to reread both this and patch 4/8 to make sense of
> this; this is probably one of those cases where it's harder to read in
> patch form than as source, but I think I now understand the new
> implementation.

Yes, sorry about that. I hope at least splitting into two patches here made it
a little easier.

> 
> Did you try testing this with valgrind (e.g. "make selftest-valgrind")?
>

Oh interesting, was not aware of this. I think it shows that new leaks were
not introduced with the patch series.

BEFORE patch series:
==1572278==
-fself-test: 7634593 pass(es) in 22.799240 seconds
==1572278==
==1572278== HEAP SUMMARY:
==1572278== in use at exit: 1,083,255 bytes in 2,394 blocks
==1572278==   total heap usage: 2,704,869 allocs, 2,702,475 frees, 
1,257,334,536 bytes allocated
==1572278==
==1572278== 8,032 bytes in 1 blocks are possibly lost in loss record 639 of 657
==1572278==at 0x4848899: malloc (in 
/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1572278==by 0x21FE1CB: xmalloc (xmalloc.c:149)
==1572278==by 0x21B02E0: new_buff (lex.cc:4767)
==1572278==by 0x21B02E0: _cpp_get_buff (lex.cc:4800)
==1572278==by 0x21ACC80: cpp_create_reader(c_lang, ht*, line_maps*) 
(init.cc:289)
==1572278==by 0xA64282: c_common_init_options(unsigned int, 
cl_decoded_option*) (c-opts.cc:237)
==1572278==by 0x95E479: toplev::main(int, char**) (toplev.cc:2241)
==1572278==by 0x960B2D: main (main.cc:39)
==1572278==
==1572278== LEAK SUMMARY:
==1572278==definitely lost: 0 bytes in 0 blocks
==1572278==indirectly lost: 0 bytes in 0 blocks
==1572278==  possibly lost: 8,032 bytes in 1 blocks
==1572278==still reachable: 1,075,223 bytes in 2,393 blocks
==1572278== suppressed: 0 bytes in 0 blocks
==1572278== Reachable blocks (those to which a pointer was found) are not shown.
==1572278== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1572278==
==1572278== For lists of detected and suppressed errors, rerun with: -s
==1572278== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

AFTER patch series:
==1594840==
-fself-test: 7638403 pass(es) in 23.671784 seconds
==1594840==
==1594840== HEAP SUMMARY:
==1594840== in use at exit: 1,081,759 bytes in 2,367 blocks
==1594840==   total heap usage: 2,728,561 

Re: [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-15 Thread David Malcolm via Gcc-patches
On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> Class file_cache_slot in input.cc is used to query specific lines of source
> code from a file when needed by diagnostics infrastructure. This will be
> extended in a subsequent patch to support obtaining the source code from
> in-memory generated buffers rather than from a file. The present patch
> refactors class file_cache_slot, putting most of the logic into a new base
> class cache_data_source, in preparation for reusing that code in the next
> patch. There is no change in functionality yet.
> 
> gcc/ChangeLog:
> 
> * input.cc (class file_cache_slot): Refactor functionality into a
> new base class...
> (class cache_data_source): ...here.
> (file_cache::forcibly_evict_file): Adapt for refactoring.
> (file_cache_slot::evict): Renamed to...
> (file_cache_slot::reset): ...this, and partially refactored into
> base class...
> (cache_data_source::reset): ...here.
> (file_cache_slot::get_full_file_content): Moved into base class...
> (cache_data_source::get_full_file_content): ...here.
> (file_cache_slot::create): Adapt for refactoring.
> (file_cache_slot::file_cache_slot): Refactor partially into...
> (cache_data_source::cache_data_source): ...here.
> (file_cache_slot::~file_cache_slot): Refactor partially into...
> (cache_data_source::~cache_data_source): ...here.
> (file_cache_slot::needs_read_p): Remove.
> (file_cache_slot::needs_grow_p): Remove.
> (file_cache_slot::maybe_grow): Adapt for refactoring.
> (file_cache_slot::read_data): Refactored, along with...
> (file_cache_slot::maybe_read_data): this, into...
> (file_cache_slot::get_more_data): ...here.
> (find_end_of_line): Change interface to take a pair of pointers,
> rather than a pointer + length.
> (file_cache_slot::get_next_line): Refactored into...
> (cache_data_source::get_next_line): ...here.
> (file_cache_slot::goto_next_line): Refactored into...
> (cache_data_source::goto_next_line): ...here.
> (file_cache_slot::read_line_num): Refactored into...
> (cache_data_source::read_line_num): ...here.
> (location_get_source_line): Fix const-correctness as necessitated by
> new interface.
> ---
>  gcc/input.cc | 513 +++
>  1 file changed, 235 insertions(+), 278 deletions(-)
> 

I confess I had to reread both this and patch 4/8 to make sense of
this; this is probably one of those cases where it's harder to read in
patch form than as source, but I think I now understand the new
implementation.

Did you try testing this with valgrind (e.g. "make selftest-valgrind")?

I don't think we have any selftest coverage for "\r" in the line-break
handling; that would be good to add.

This patch is OK for trunk once the rest of the kit is approved.

Thanks
Dave