Re: [PATCH v2] bpf: add preserve_field_info builtin

Jose E. Marchesi via Gcc-patches Tue, 25 Oct 2022 12:07:42 -0700


> Hi Jose,
>
> Thanks for your comments. I think I've addressed them all in the updated
> patch below.
>
>>>+  get_inner_reference (src, &bitsize, &bitpos, &var_off, &mode, &unsignedp,
>>>+                   &reversep, &volatilep);
>>
>>Since the information returned by the builtin is always constant
>>(positions, sizes) I think you will want to adjust the code for the
>>eventuality of variable positioned fields and also variable sized
>>fields.
>>
>>get_inner_reference sets var_off to a tree if the position of the field
>>is variable.  In these cases `bitpos' is relative to that position.
>>
>>Likewise, get_inner_reference sets `mode' is set to BLKmode and
>>`bitsize' will be set to -1.
>>
>>I'm not sure what the built-in is supposed to do/return in these cases.
>>I guess it makes sense to error out, but what does LLVM do?
>
> I would have thought erroring out the only option, but it seems that
> LLVM will return a value from the builtin and record a CO-RE relocation
> as normal.
>
> What value will be returned depends of course on KIND, but from what
> I can tell it seems that such fields are treated as having an offset of
> 0 bits and/or a size of 0 bits. For example FIELD_BYTE_SIZE for a
> flexible-length array will return 0. FIELD_RSHIFT_U64 will be
> calculated as 64 - 0 = 64.
>
> This sort of makes sense if you expect that any BPF loader will honor
> the CO-RE relocations and patch the return value before the program is
> run, i.e. the actual values at compile time are irrelevant.
>
> But, I'm not sure that BPF loaders in practice actually _can_ patch the
> return value correctly. The source of information for resolving the
> relocations is the BTF. But the BTF won't have more information about
> variable position/size members. A flexible-length array for example in
> BTF is represented as an array type with 0 elements. So the size
> calculation when patching the relocation (looking at the impl in
> libbpf) will be elem_size * nelems = 0, and the 'patched' values will
> be the same as the unpatched.
>
> I'm not sure whether this behavior is a known limitation or an
> oversight. In my opinion it makes more sense to error at compile time,
> becuase even after the loader patches the return value it still will
> not be correct for these cases.
>
> So for now I've set these cases to error out, but it would be just as
> simple to mimic the LLVM behavior. WDYT?


I would say it makes more sense to error out than to return invalid
data.

However, the divergence wrt LLVM is a concern.  What about keeping this
behavior in the GCC backend and simultaneously raise the issue in
bpf@vger?  If it was a design oversight and the change doesn't impact
kernel sources, they may be willing to change it.

>>If I read this properly, for something like:
>>
>>__builtin_preserve_field_info (a = foo.bar + bar.baz, KIND)
>>
>>On one side CO-RE relocations are computed for both foo.bar and bar.baz
>>(I see bpf_core_compute does that) as expected.
>>
>>But then the builtin returns information that can only apply to one
>>access.  Which one?
>
> Expressions like this should not be accepted by the builtin. I didn't
> consider this case in v1 so it led to an ICE. Clang rejects this
> outright and errors with "argument 1 is not a field access". It is
> actually very strict about the expressions that are accepted, unlike
> __builtin_preserve_access_index.
>
> I have updated this implementation to behave more like clang in that
> it will reject any expression that isn't directly a field access. That
> even includes rejecting things like:
>
>   __builtin_preserve_field_info (&foo.bar, KIND)
>
> Since unlike preserve_access_index this builtin does not actually
> perform the operation in EXPR, it makes sense to enforce that EXPR must
> be exactly a single field access.

Ok, thanks.

> [...]
> +@deftypefn {Built-in Function} unsigned int __builtin_preserve_field_info 
> (@var{expr}, unsigned int @var{kind})
> +BPF Compile Once-Run Everywhere (CO-RE) support. This builtin is used to
> +extract information to aid in struct/union relocations.  @var{expr} is
> +an access to a field of a struct or union. Depending on @var{kind}, different
> +information is returned to the program. A CO-RE relocation for the access in
> +@var{expr} with kind @var{kind} is recorded if @code{-mco-re} is in effect.
> +
> +The following values are supported for @var{kind}:
> +@table @var
> +@item FIELD_BYTE_OFFSET = 0
> +The returned value is the offset, in bytes, of the field from the
> +beginning of the containing structure.

What about bit fields?  Is this the byte offset of the containing word?

> +@item FIELD_BYTE_SIZE = 1
> +The returned value is the size, in bytes, of the field.

For bit fields,  is this the size of the containing word?

> +@item FIELD_EXISTENCE = 2
> +The returned value is 1 if the field exists, 0 otherwise. Always 1 at
> +compile time.
> +
> +@item FIELD_SIGNEDNESS = 3
> +The returned value is 1 if the field is signed, 0 otherwise.
> +
> +@item FIELD_LSHIFT_U64 = 4
> +@itemx FIELD_RSHIFT_U64 = 5
> +Suppose the field were loaded into a value of FIELD_BYTE_SIZE bytes
> +and then zero or sign-extended to a 64-bit value. The returned value
> +is the number of bits of left or right shifting respectively that
> +would be needed to recover the original value of the field.

What are the semantics for bit fields?

> +@end table
> +
> +Note that the return value is a constant which is known at
> +compile-time. If the field has a variable offset then
> +FIELD_BYTE_OFFSET, FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64 are not
> +supported. Similarly, if the field has a variable size then
> +FIELD_BYTE_SIZE, FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64 are not
> +supported.
> +
> +@end deftypefn
> +
>  @node FR-V Built-in Functions
>  @subsection FR-V Built-in Functions

Re: [PATCH v2] bpf: add preserve_field_info builtin

Reply via email to