Hi!

On 2023-07-04T15:56:23-0400, Lewis Hyatt via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On Tue, Jul 4, 2023 at 11:50 AM Thomas Schwinge <tho...@codesourcery.com>
> wrote:
>> I came across this one here on my way working through another (somewhat
>> related) GTY issue. I generally do understand the issue here, but do
>> have a question about 'unsigned int len' field in
>> 'libcpp/include/symtab.h:struct ht_identifier': [...]
>
> I don't think there is currently any possibility for a null byte to
> end up in an ht_identifier's string. I assumed that ht_identifier
> stores the length as an optimization (especially since it doesn't take
> up any extra space on 64-bit platforms, given the 32-bit hash code is
> stored as well there.) I created the string_length GTY markup mainly
> to support another patch that I have still pending review, which I
> thought would increase the likelihood of PCH needing to handle null
> bytes in general. When I did that, I added the markup to ht_identifier
> simply because the length was already there, so there was no reason
> not to add it. It does save a few cycles when streaming out the PCH,
> but I doubt it is meaningful.

Thanks for confirming. OK thus to push the attached
"GTY: Enhance 'string_length' option documentation"?


Regards
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; Limited liability company (Gesellschaft mit beschränkter Haftung); Managing directors: Thomas Heurung, Frank Thürauf; Registered office: München; Registration court: München, HRB 106955

From a31b6657c26ac70c6e03b8ad81cdcb873f905716 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Wed, 5 Jul 2023 08:38:49 +0200
Subject: [PATCH] GTY: Enhance 'string_length' option documentation

We're (currently) not aware of any actual use of 'ht_identifier's with NUL
characters embedded; its 'len' field appears to exist for optimization
purposes, since "forever".  Before 'struct ht_identifier' was added in
commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we had in
'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or earlier 'length',
earlier in 'gcc/cpphash.h:struct hashnode': 'unsigned short length', earlier
'size_t length' with comment: "length of token, for quick comparison", earlier
'int length', ever since the 'gcc/cpp*' files were added in
commit 7f2935c734c36f84ab62b20a04de465e19061333 (Subversion r9191).

This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a
"pch: Fix streaming of strings with embedded null bytes".

	gcc/
	* doc/gty.texi (GTY Options) <string_length>: Enhance.
	libcpp/
	* include/symtab.h (struct ht_identifier): Document different
	rationale.
---
 gcc/doc/gty.texi        | 11 +++++++++++
 libcpp/include/symtab.h |  4 +---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 7bd064b5781..15f9fa07405 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -217,6 +217,17 @@ struct GTY(()) non_terminated_string @{
 @};
 @end smallexample
 
+Similarly, this is useful for (regular NUL-terminated) strings with
+NUL characters embedded (that the default @code{strlen} use would run
+afoul of):
+
+@smallexample
+struct GTY(()) multi_string @{
+  const char * GTY((string_length ("%h.len + 1"))) str;
+  size_t len;
+@};
+@end smallexample
+
 The @code{string_length} option currently is not supported for (fields
 in) global variables.
 @c <https://inbox.sourceware.org/87bkgqvlst....@euler.schwinge.homeip.net>
diff --git a/libcpp/include/symtab.h b/libcpp/include/symtab.h
index c7ccc6db9f0..0c713f2ad30 100644
--- a/libcpp/include/symtab.h
+++ b/libcpp/include/symtab.h
@@ -29,9 +29,7 @@ along with this program; see the file COPYING3.  If not see
 typedef struct ht_identifier ht_identifier;
 typedef struct ht_identifier *ht_identifier_ptr;
 struct GTY(()) ht_identifier {
-  /* This GTY markup arranges that the null-terminated identifier would still
-     stream to PCH correctly, if a null byte were to make its way into an
-     identifier somehow.  */
+  /* We know the 'len'gth of the 'str'ing; use it in the GTY markup.  */
   const unsigned char * GTY((string_length ("1 + %h.len"))) str;
   unsigned int len;
   unsigned int hash_value;
-- 
2.34.1
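
For illustration only, a minimal standalone C sketch of why a stored length matters for strings with embedded NUL characters, as in the 'multi_string' documentation example.  This is not GCC code: the GTY markup is left out so that it compiles on its own, and 'copy_using_strlen' and 'copy_using_len' are hypothetical helpers that merely mimic the two ways a string's extent could be determined when copying it (the default 'strlen' versus a 'len + 1' expression).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the documented 'multi_string'; the GTY markup
   is omitted so that this builds outside of GCC.  */
struct multi_string
{
  const char *str; /* 'len' bytes, followed by a terminating NUL.  */
  size_t len;      /* May span embedded NUL characters.  */
};

/* Assumed analogue of the default behavior: the extent is found via
   'strlen', so copying stops at the first NUL.  */
char *
copy_using_strlen (const struct multi_string *s, size_t *copied)
{
  *copied = strlen (s->str) + 1;
  char *p = malloc (*copied);
  if (p)
    memcpy (p, s->str, *copied);
  return p;
}

/* Assumed analogue of 'string_length ("%h.len + 1")': the stored length
   (plus the terminator) is used, so embedded NULs are preserved.  */
char *
copy_using_len (const struct multi_string *s, size_t *copied)
{
  *copied = s->len + 1;
  char *p = malloc (*copied);
  if (p)
    memcpy (p, s->str, *copied);
  return p;
}
```

With a string such as "ab\0cd" (five characters plus the terminator), the 'strlen'-based copy covers only the bytes up to and including the first NUL, whereas the length-based copy covers the whole buffer.  For 'ht_identifier', where no embedded NULs are expected, both variants would copy the same bytes; the stored length then only saves the 'strlen' call.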