On Wed, Jun 25, 2025 at 1:37 PM Bill Wendling <isanb...@gmail.com> wrote:
>
> I posted this on the LLVM Discourse forum[1] and got some traction, so
> I want to get the GCC community's input. (My initial proposal is
> replicated here.)
>
> I had already mentioned this in previous emails in this thread, so
> it's nothing super new, and there have been some suggested
> improvements already. Parts of this reference a meeting that took
> place between the LLVM developers and some non-LLVM developers. The
> meeting mostly explained the issues regarding the "compromise" from
> this thread and how it interacts (poorly) with C++, and vice versa.
>
> There was a lengthy discussion after this proposal.
>
> Please take a look and let me know what you think.
>

There are a couple of notes to add to this proposal:

1. The suggestion to omit the forward decl of the function is more of a
   nicety than a requirement. I could be overstating how much it would
   annoy a programmer to have to write:

     struct b;
     static __pure size_t calc(struct b *);
     struct b {
         int *buf __counted_by(calc);
         int count;
     };
     static __pure size_t calc(struct b *p) {
         // do something.
     }

   rather than have the compiler do the forward decls for you.

2. There was another suggestion on the mailing list to add the attribute
   after the struct definition:

     enum { OFFSET = 42 };
     struct foo {
         int count;
         int *buf;
     } __counted_by(count - OFFSET, buf);

   It has some merit. The downside is that it loses locality.

-bw

> -bw
>
> [1]
> https://discourse.llvm.org/t/rfc-bounds-safety-in-c-syntax-compatibility-with-gcc/85885/32?u=void
>
> --------------------------------------------------
>
> I’ve been putting off pushing this proposal, because it is a departure
> from what Apple has done and adds a lot of extra syntax for this
> feature, but I think it’s appropriate right now.
>
> The main issue at play is that C and C++ are two very different
> languages. The scoping rules are completely different, making name
> resolution not work in one language without jumping through
> non-obvious hoops. This was made clear in @rapidsna’s presentation
> last week. Making matters worse is that GCC (and other compilers)
> perform one-pass parsing for C, making forward declarations necessary.
> The forward declarations, while solving many issues, have their own
> issues. Other solutions at play require changes to the base languages,
> which require approval by the standards committee.
>
> Even if the full struct was declared before the expression in the
> attribute was defined, there would still be issues, due to one example
> from @rapidsna’s presentation [as pointed out by Joseph Jelinek]:
>
>   typedef int T;
>   struct foo {
>     int T;
>     int U;
>     int * __counted_by_expr(int T; (T)+U) buf; // Addition or cast?
>   };
>
> Given this, I want to propose using functions / static methods for
> expressions.
>
> The function takes one and only one argument: a "this" pointer to the
> least enclosing non-anonymous struct.
>
> The call to the function is generated by the compiler, so no argument
> list is written; the attribute only needs to indicate the function’s
> name. This avoids the need to add a new __builtin_* or __self element
> to C.
>
> * The function needs to be declared before use in C. (It can be fully
> defined if no fields within the struct are used.)
> * The function should be static and marked as pure (and maybe always_inline).
> * The function in C++ should be private or protected.
>
> C example:
>
>   static size_t calc_counted_by(void *);
>   struct foo {
>     /* ... */
>     struct bar {
>       int * __counted_by_expr(calc_counted_by) buf;
>       int count;
>       int scale;
>     };
>   };
>
>   enum { OFFSET = 42 };
>
>   // The function could be marked with the 'pure' attribute.
>   static size_t __pure calc_counted_by(void *p) {
>     struct bar *ptr = (struct bar *)p;
>     return ptr->count * ptr->scale - OFFSET;
>   }
>
> C++ example:
>
>   struct foo {
>     enum { OFFSET = 42 };
>     struct bar {
>       int * __counted_by_expr(calc_counted_by) buf;
>     private:
>       static size_t __pure calc_counted_by(struct bar *ptr) {
>         return ptr->count * ptr->scale - OFFSET;
>       }
>     public:
>       int count;
>       int scale;
>     };
>   };
>
> Pros
>
> 1. This uses the current language without any modifications to scoping
> or requiring feature additions that need to be approved by the
> standards committee. All compilers should be able to implement it
> without major modifications.
> 2. Name lookup is no longer a problem, so there isn’t a need for
> forward declarations or trying to determine which scope to use in
> various circumstances.
> 3. In the general case where the full struct is passed into the
> calculating function, both C and C++ parse the code in the same way.
> In the C example above, it would need to be modified to this:
>
>   static size_t __pure calc_counted_by(void *p) {
>   #ifdef __cplusplus
>     foo::bar *ptr = static_cast<foo::bar *>(p);
>   #else
>     struct bar *ptr = (struct bar *)p;
>   #endif
>     return ptr->count * ptr->scale - OFFSET;
>   }
>
> This format can be extended to other languages if need be.
>
> Cons
>
> 1. It’s wordy, which may make it unappealing to users.
> 2. The #ifdef __cplusplus ... #endif usage above is wordy and a bit awkward.
> 3. Importantly, it’s harder for Apple’s bounds safety work to analyze
> the fields used within the expression.
> 4. Apple and their users already use the current syntax.
>
> For (1), that’s an unfortunate outcome of this feature. There may be
> ways to reduce the amount of code that needs to be written, but the
> above is a good starting place.
>
> [Note: Kees came up with a way to avoid the forward declaration of the
> function: have the compiler generate the forward declaration with a
> set declaration syntax, e.g. static __pure size_t
> size_calculation(struct foo *);]
>
> For (2), the rule about using the least enclosing non-anonymous struct
> could be loosened and the whole struct passed in. The user has full
> control over which fields to use.
>
> For (3), it’s harder to get the expression because it’s within the
> function, but that function is available in the AST, so getting its
> contents shouldn’t be impossible. (I don’t mean to shrug off this
> concern as I haven’t seen the code. If I’m completely off base here
> please tell me.)
>
> For (4), this is a large sticking point. There are two options that I
> can think of:
>
> 1. Allow Apple users to keep the current syntax, because Apple’s
> platform doesn’t support GCC, and/or
> 2. Use clang-tidy to convert the old syntax to the new syntax.
>
> I don’t think either option is better than the other, though (1) does
> involve supporting two different code paths for the same feature.
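
On (2): the #ifdef is only needed because the nested struct is named
differently in each language (struct bar in C, foo::bar in C++). If the
rule is loosened and the outermost struct is passed in, the same function
body should be acceptable to both. A minimal sketch of that idea, not
taken from any implementation. Assumptions: __counted_by_expr is defined
away as an empty macro because no released compiler provides it, the
member name "inner" is added so the nested struct is reachable, and
clamping a negative result to zero is this sketch's choice rather than
part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the attribute is only proposed, so it expands
 * to nothing here. */
#ifndef __counted_by_expr
#define __counted_by_expr(fn)
#endif

enum { OFFSET = 42 };

struct foo;                                        /* forward declaration, needed in C */
static size_t calc_counted_by(const struct foo *); /* declared before use */

struct foo {
    struct bar {
        int *__counted_by_expr(calc_counted_by) buf;
        int count;
        int scale;
    } inner; /* "inner" is an assumed member name, not in the original example */
};

/* "struct foo" names the same type in C and C++, so no #ifdef is needed. */
static size_t calc_counted_by(const struct foo *p) {
    assert(p != NULL);
    long n = (long)p->inner.count * p->inner.scale - OFFSET;
    return n > 0 ? (size_t)n : 0; /* sketch's choice: negative means no elements */
}
```

The cost is that the function now has to know the path from the outer
struct down to the fields it uses (p->inner.count here).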
>
> In conclusion
>
> My overriding concern from the beginning is that both GCC and Clang
> end up with the same (or similar) syntax for these features so that it
> can be applied equally to Linux (and one assumes other projects). None
> of the suggested syntaxes or solutions presented so far satisfy all
> requirements.
>
> Usage of a function to calculate the size uses the base language
> features, doesn’t require changing any language, doesn’t require
> support from a standards committee, and can be supported by both
> compilers (I even have a branch that implements a simplified version
> for Clang).
>
> Share and enjoy!
> -bw
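
To make the intent a bit more concrete, here is a rough sketch of the
kind of check a compiler could derive from the count function, using the
struct b example from the notes at the top. It is an illustration under
stated assumptions, not what either compiler does: __counted_by and
__pure are empty stand-in macros, buf_at is a hypothetical helper
standing in for compiler-generated code, and returning NULL for an
out-of-range index is this sketch's choice (a real implementation would
more likely trap).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the example compiles with today's compilers. */
#ifndef __counted_by
#define __counted_by(fn)
#endif
#ifndef __pure
#define __pure /* could be __attribute__((pure)) on GCC or Clang */
#endif

struct b;
static __pure size_t calc(struct b *);

struct b {
    int *buf __counted_by(calc);
    int count;
};

/* The count function: the only place that knows how the bound is computed. */
static __pure size_t calc(struct b *p) {
    return p->count > 0 ? (size_t)p->count : 0;
}

/* Hypothetical helper: roughly the check that could guard p->buf[i].
 * It calls the count function and refuses indices outside the bound. */
static int *buf_at(struct b *p, size_t i) {
    assert(p != NULL && p->buf != NULL);
    return i < calc(p) ? &p->buf[i] : NULL;
}
```

With count set to 3, buf_at(&s, 2) yields a pointer to the last element
and buf_at(&s, 3) yields NULL.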