I posted this on the LLVM Discourse forum[1] and got some traction, so
I want to get the GCC community's input. (My initial proposal is
replicated here.)

I had already mentioned this in previous emails in this thread, so
it's nothing super new, and there have been some suggested
improvements already. Parts of this reference a meeting that took
place between the LLVM developers and some non-LLVM developers. The
meeting mostly explained the issues regarding the "compromise" from
this thread and how it interacts (poorly) with C++, and vice versa.

There was a lengthy discussion after this proposal.

Please take a look and let me know what you think.

-bw

[1] https://discourse.llvm.org/t/rfc-bounds-safety-in-c-syntax-compatibility-with-gcc/85885/32?u=void

--------------------------------------------------

I’ve been putting off pushing this proposal, because it is a departure
from what Apple has done and adds a lot of extra syntax for this
feature, but I think it’s appropriate right now.

The main issue at play is that C and C++ are two very different
languages. The scoping rules are completely different, so name
resolution does not work in one language without jumping through
non-obvious hoops. This was made clear in @rapidsna’s presentation
last week. Making matters worse, GCC (and other compilers) perform
one-pass parsing for C, making forward declarations necessary.
The forward declarations, while solving many issues, have their own
issues. Other solutions at play require changes to the base languages,
which require approval by the standards committee.

Even if the full struct was declared before the expression in the
attribute was defined, there would still be issues, due to one example
from @rapidsna’s presentation [as pointed out by Joseph Jelinek]:

typedef int T;
struct foo {
  int T;
  int U;
  int * __counted_by_expr(int T; (T)+U) buf; // Addition or cast?
};
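The ambiguity can be reproduced in ordinary C today, without the
proposed attribute. The sketch below is only an illustration: the
function names as_cast and as_addition are made up for this example,
and it relies on the standard rule that a typedef name can be shadowed
by an ordinary identifier in an inner scope.

```c
typedef int T;

/* Here T still names the typedef, so (T)+U is a cast of +U to int. */
static int as_cast(double U) {
    return (T)+U;
}

/* Here a local variable shadows the typedef, so (T)+U is an addition. */
static int as_addition(int U) {
    int T = 10;
    return (T)+U;
}
```

The same token sequence yields 2 for as_cast(2.9) and 12 for
as_addition(2), so the parser has to know what T is before it can
build the expression.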

Given this, I want to propose using functions / static methods for expressions.

The function takes one and only one argument: a "this" pointer to the
least enclosing non-anonymous struct.

The call to the function is generated by the compiler, so no argument
list is written: the attribute only needs to indicate the function’s
name. This avoids the need to add a new __builtin_* or __self element
to C.

* The function needs to be declared before use in C. (It can be fully
defined if no fields within the struct are used.)
* The function should be static and marked as pure (and maybe always_inline).
* The function in C++ should be private or protected.

C example:

static size_t calc_counted_by(void *);
struct foo {
  /* ... */
  struct bar {
    int * __counted_by_expr(calc_counted_by) buf;
    int count;
    int scale;
  };
};

enum { OFFSET = 42 };

// The function could be marked with the 'pure' attribute.
static size_t __pure calc_counted_by(void *p) {
  struct bar *ptr = (struct bar *)p;
  return ptr->count * ptr->scale - OFFSET;
}
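To make the "call generated by the compiler" part more concrete, here
is a rough sketch of what such a call might amount to if written by
hand. Everything beyond the example above is an assumption: the
__counted_by_expr and __pure macros are stand-ins so the code compiles
today, struct bar is hoisted to file scope for brevity, and
checked_load is a hypothetical helper, not something either compiler
emits.

```c
#include <stddef.h>

/* Stand-ins: the attribute is only proposed, so it expands to nothing. */
#define __counted_by_expr(fn)
#define __pure __attribute__((pure))

static size_t __pure calc_counted_by(void *);

struct bar {
    int *__counted_by_expr(calc_counted_by) buf;
    int count;
    int scale;
};

enum { OFFSET = 42 };

static size_t __pure calc_counted_by(void *p) {
    struct bar *ptr = (struct bar *)p;
    return ptr->count * ptr->scale - OFFSET;
}

/* Hypothetical hand-written equivalent of a compiler-inserted check:
 * the element count comes from calling the named function on the
 * enclosing struct before buf is indexed. */
static int checked_load(struct bar *b, size_t idx, int *out) {
    if (idx >= calc_counted_by(b))
        return -1;          /* out of bounds */
    *out = b->buf[idx];
    return 0;
}
```

With count = 10 and scale = 5 the function returns 8, so index 7 is
accepted and index 8 is rejected.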

C++ example:

struct foo {
  enum { OFFSET = 42 };
  struct bar {
    int * __counted_by_expr(calc_counted_by) buf;
  private:
    static size_t __pure calc_counted_by(struct bar *ptr) {
      return ptr->count * ptr->scale - OFFSET;
    }
  public:
    int count;
    int scale;
  };
};

Pros

1. This uses the current language without any modifications to scoping
or requiring feature additions that need to be approved by the
standards committee. All compilers should be able to implement them
without major modifications.
2. Name lookup is no longer a problem, so there isn’t a need for
forward declarations or trying to determine which scope to use in
various circumstances.
3. In the general case where the full struct is passed into the
calculating function, both C and C++ parse the code in the same way.
The C example above would need to be modified to this:

static size_t __pure calc_counted_by(void *p) {
#ifdef __cplusplus
  foo::bar *ptr = static_cast<foo::bar *>(p);
#else
  struct bar *ptr = (struct bar *)p;
#endif
  return ptr->count * ptr->scale - OFFSET;
}

This format can be extended to other languages if need be.

Cons

1. It’s wordy, which may make it unappealing to users.
2. The #ifdef __cplusplus ... #endif usage above is wordy and a bit awkward.
3. Importantly, it’s harder for Apple’s bounds safety work to analyze
the fields used within the expression.
4. Apple and their users already use the current syntax.

For (1), that’s an unfortunate outcome of this feature. There may be
ways to reduce the amount of code that needs to be written, but the
above is a good starting place.

[Note: Kees came up with a way to avoid the forward declaration of the
function: have the compiler generate the forward declaration with a
set declaration syntax, e.g. static __pure size_t
size_calculation(struct foo *);]

For (2), the rule about using the least enclosing non-anonymous struct
could be loosened and the whole struct passed in. The user has full
control over which fields to use.
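A minimal sketch of that loosened rule, under some assumptions: the
nested struct is given a member name (b) so it can be reached from the
outer pointer, the parameter is typed as struct foo * rather than
void *, and the __counted_by_expr and __pure macros are stand-ins so
the code compiles today. Because the function never names the nested
type, the same text should be accepted by both C and C++ without the
#ifdef.

```c
#include <stddef.h>

/* Stand-ins: the attribute is only proposed, so it expands to nothing. */
#define __counted_by_expr(fn)
#define __pure __attribute__((pure))

enum { OFFSET = 42 };

struct foo;
static size_t __pure calc_counted_by(struct foo *);

struct foo {
    struct bar {
        int *__counted_by_expr(calc_counted_by) buf;
        int count;
        int scale;
    } b;    /* member name added for this sketch */
};

/* Receives the whole struct and picks the fields it needs. */
static size_t __pure calc_counted_by(struct foo *p) {
    return p->b.count * p->b.scale - OFFSET;
}
```

For a struct foo whose b.count is 10 and b.scale is 5, the function
returns 8.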

For (3), it’s harder to get the expression because it’s within the
function, but that function is available in the AST, so getting its
contents shouldn’t be impossible. (I don’t mean to shrug off this
concern as I haven’t seen the code. If I’m completely off base here
please tell me.)

For (4), this is a large sticking point. There are two options that I
can think of:

1. Allow Apple users to keep the current syntax, because Apple’s
platform doesn’t support GCC, and/or
2. Use clang-tidy to convert the old syntax to the new syntax.

I don’t think either option is better than the other, though (1) does
involve supporting two different code paths for the same feature.

In conclusion

My overriding concern from the beginning has been that both GCC and
Clang end up with the same (or similar) syntax for these features so
that they can be applied equally to Linux (and one assumes other
projects). None of the suggested syntaxes or solutions presented so
far satisfy all requirements.

Usage of a function to calculate the size uses the base language
features, doesn’t require changing any language, doesn’t require
support from a standards committee, and can be supported by both
compilers (I even have a branch that implements a simplified version
for Clang).

Share and enjoy!
-bw
