On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi <jose.march...@oracle.com> wrote:
>
>
> [Added Eduard Zingerman in CC, who is implementing this same feature in
> clang/llvm and also the consumer component in the kernel (pahole).]
>
> Hi Richard.
>
> > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Hello,
> >>
> >> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> >> The same attribute is already supported in clang, and is used by various
> >> components of the BPF ecosystem.
> >>
> >> The purpose of the attribute is to associate (to "tag") declarations
> >> with arbitrary string annotations, which are emitted into debugging
> >> information (DWARF and/or BTF) to facilitate post-compilation
> >> analysis (the motivating use case being the Linux kernel BPF verifier).
> >> Multiple tags are allowed on the same declaration.
> >>
> >> These strings are not interpreted by the compiler, and the attribute
> >> itself has no effect on generated code, other than to produce additional
> >> DWARF DIEs and/or BTF records conveying the annotations.
> >>
> >> This entails:
> >>
> >> - A new C-language-level attribute which associates ("tags")
> >>   particular declarations with arbitrary strings.
> >>
> >> - The conveyance of that information in DWARF in the form of a new DIE,
> >>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
> >>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
> >>   the same purpose. These DIEs are already supported by BPF tooling,
> >>   such as pahole.
> >>
> >> - The conveyance of that information in BTF debug info in the form of
> >>   BTF_KIND_DECL_TAG records. These records are already supported by
> >>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
> >>   eBPF verifier.
> >>
> >>
> >> Background
> >> ==========
> >>
> >> The purpose of these tags is to convey additional semantic information
> >> to post-compilation consumers, in particular the Linux kernel eBPF
> >> verifier. The verifier can make use of that information while analyzing
> >> a BPF program to aid in determining whether to allow or reject the
> >> program to be run. More background on these tags can be found in the
> >> early support for them in the kernel here [1] and [2].
> >>
> >> The "btf_decl_tag" attribute is half the story; the other half is a
> >> sibling attribute "btf_type_tag" which serves the same purpose but
> >> applies to types. Support for btf_type_tag will come in a separate
> >> patch series, since it is impacted by GCC bug 110439 which needs to be
> >> addressed first.
> >>
> >> I submitted an initial version of this work (including btf_type_tag)
> >> last spring [3], however at the time there were some open questions
> >> about the behavior of the btf_type_tag attribute and issues with its
> >> implementation. Since then we have clarified these details and agreed
> >> to solutions with the BPF community and LLVM BPF folks.
> >>
> >> The main motivation for emitting the tags in DWARF is that the Linux
> >> kernel generates its BTF information via pahole, using DWARF as a source:
> >>
> >>     +--------+  BTF                  BTF   +----------+
> >>     | pahole |-------> vmlinux.btf ------->| verifier |
> >>     +--------+                             +----------+
> >>         ^                                        ^
> >>         |                                        |
> >>   DWARF |                                    BTF |
> >>         |                                        |
> >>      vmlinux                              +-------------+
> >>      module1.ko                           | BPF program |
> >>      module2.ko                           +-------------+
> >>        ...
> >>
> >> This is because:
> >>
> >> a) pahole adds additional kernel-specific information into the
> >>    produced BTF based on additional analysis of kernel objects.
> >>
> >> b) Unlike GCC, LLVM will only generate BTF for BPF programs.
> >>
> >> c) GCC can generate BTF for whatever target with -gbtf, but there is no
> >>    support for linking/deduplicating BTF in the linker.
> >>
> >> In the scenario above, the verifier needs access to the pointer tags of
> >> both the kernel types/declarations (conveyed in the DWARF and translated
> >> to BTF by pahole) and those of the BPF program (available directly in BTF).
> >>
> >>
> >> DWARF Representation
> >> ====================
> >>
> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> >> the following format:
> >>
> >>   DW_TAG_GNU_annotation (0x6000)
> >>     DW_AT_name: "btf_decl_tag"
> >>     DW_AT_const_value: <string argument>
> >>
> >> These DIEs are placed in the DWARF tree as children of the DIE for the
> >> appropriate declaration, and one such DIE is created for each occurrence
> >> of the btf_decl_tag attribute on a declaration.
> >>
> >> For example:
> >>
> >>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
> >>
> >> This declaration produces the following DWARF:
> >>
> >>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
> >>     <1f>   DW_AT_name        : c
> >>     <24>   DW_AT_type        : <0x49>
> >>     ...
> >>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
> >>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
> >>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
> >>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <44>   DW_AT_const_value : __c
> >>  <2><48>: Abbrev Number: 0
> >>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
> >>     ...
> >>
> >> The DIEs for btf_decl_tag are placed as children of the DIE for
> >> variable "c".
> >
> > It looks like a bit of overkill, and inefficient as well. Why aren't
> > the tags referenced via the existing DW_AT_description?
>
> The DWARF spec ("Entity Descriptions") seems to imply that the
> DW_AT_description attribute is intended to be used to hold alternative
> ways to denote the same "debugging information" (object, type, ...),
> i.e. alternative aliases to refer to the same entity as the
> DW_AT_name. For example, for a type name='foo' we could have
> description='aka. long int'. We don't think this is the case of the btf
> tags, which are more like properties partially characterizing the tagged
> "debugging information", but couldn't be used as an alias to the name.
>
> Also, repurposing the DW_AT_description attribute to hold btf tag
> information would require introducing a mini-language and subsequent
> parsing by the clients: how to denote several tags, how to encode the
> embedded string contents, etc. You kick the complexity out the door and
> it comes back in through the window :)
>
> Finally, for all we know, the existing attribute may already be used by
> some language and handled by some debugger the way it is recommended in
> the spec. That would be incompatible with having btf tags encoded
> there.

How are the C/C++ standard attributes proposed to be encoded in DWARF?
I think adding special encoding just for BTF tags looks wrong.

> > Iff you want new TAGs why require them as children for each DIE rather
> > than referencing (and sharing!) them via a DIE reference from a new
> > attribute?
>
> Hmm, that's a very good question. The Linux kernel sources use both
> declaration tags and type tags and not sharing the DIEs may result in
> serious bloating, since the tags are brought into declarations and type
> specifiers via macros...
>
> > That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
> >
> > But well ...
> >
> > Richard.
> >
> >>
> >> BTF Representation
> >> ==================
> >>
> >> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records
> >> refer to the annotated object by BTF type ID, as well as a component
> >> index which is used for btf_decl_tags placed on struct/union members or
> >> function arguments.
> >>
> >> For example, the BTF for the above declaration is:
> >>
> >>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> >>   [2] CONST '(anon)' type_id=1
> >>   [3] PTR '(anon)' type_id=2
> >>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
> >>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
> >>   [6] VAR 'c' type_id=3, linkage=global
> >>   ...
> >>
> >> The BTF format is documented here [4].
> >>
> >>
> >> References
> >> ==========
> >>
> >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
> >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-...@fb.com/
> >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
> >>
> >>
> >> David Faust (9):
> >>   c-family: add btf_decl_tag attribute
> >>   include: add BTF decl tag defines
> >>   dwarf: create annotation DIEs for decl tags
> >>   dwarf: expose get_die_parent
> >>   ctf: add support to pass through BTF tags
> >>   dwarf2ctf: convert annotation DIEs to CTF types
> >>   btf: create and output BTF_KIND_DECL_TAG types
> >>   testsuite: add tests for BTF decl tags
> >>   doc: document btf_decl_tag attribute
> >>
> >>  gcc/btfout.cc                                | 81 ++++++++++++++++++-
> >>  gcc/c-family/c-attribs.cc                    | 23 ++++++
> >>  gcc/ctf-int.h                                | 28 +++++++
> >>  gcc/ctfc.cc                                  | 10 ++-
> >>  gcc/ctfc.h                                   | 17 +++-
> >>  gcc/doc/extend.texi                          | 47 +++++++++++
> >>  gcc/dwarf2ctf.cc                             | 73 ++++++++++++++++-
> >>  gcc/dwarf2out.cc                             | 37 ++++++++-
> >>  gcc/dwarf2out.h                              |  1 +
> >>  .../gcc.dg/debug/btf/btf-decltag-func.c      | 21 +++++
> >>  .../gcc.dg/debug/btf/btf-decltag-sou.c       | 33 ++++++++
> >>  .../gcc.dg/debug/btf/btf-decltag-var.c       | 19 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c  |  9 +++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c  | 18 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c  | 17 ++++
> >>  include/btf.h                                | 14 +++-
> >>  include/dwarf2.def                           |  4 +
> >>  17 files changed, 437 insertions(+), 15 deletions(-)
> >>  create mode 100644 gcc/ctf-int.h
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
> >>
> >> --
> >> 2.40.1
> >>
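
For anyone following the thread without the patches applied, here is a
minimal sketch of the example from the cover letter. It is illustrative
only and not taken from the series: DECL_TAG is a hypothetical helper macro
that expands to nothing when the compiler does not advertise btf_decl_tag,
and struct rec, example_btf, count_decl_tags and has_decl_tag are made-up
names for a simplified in-memory model of the quoted BTF listing, not the
real BTF encoding (see [4] for that).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: use the attribute only where the compiler
   advertises it, otherwise expand to nothing so the file still builds. */
#if defined(__has_attribute)
# if __has_attribute(btf_decl_tag)
#  define DECL_TAG(s) __attribute__((btf_decl_tag(s)))
# endif
#endif
#ifndef DECL_TAG
# define DECL_TAG(s)
#endif

/* The declaration from the cover letter: two tags on one variable. */
const int *c DECL_TAG("__c") DECL_TAG("devicemem");

/* Simplified model of the quoted BTF dump (NOT the on-disk layout). */
enum rec_kind { REC_INT, REC_CONST, REC_PTR, REC_DECL_TAG, REC_VAR };

struct rec {
  enum rec_kind kind;
  const char *name;
  int type_id;        /* ID of the referenced record, 0 if none */
  int component_idx;  /* -1: the tag applies to the declaration itself */
};

/* Records [1]..[6] of the example; array index i holds ID i + 1. */
const struct rec example_btf[] = {
  { REC_INT,      "int",       0, -1 },
  { REC_CONST,    "(anon)",    1, -1 },
  { REC_PTR,      "(anon)",    2, -1 },
  { REC_DECL_TAG, "__c",       6, -1 },
  { REC_DECL_TAG, "devicemem", 6, -1 },
  { REC_VAR,      "c",         3, -1 },
};

/* Count the DECL_TAG records that point at record target_id. */
size_t count_decl_tags(const struct rec *recs, size_t n, int target_id)
{
  size_t count = 0;
  assert(recs != NULL);
  for (size_t i = 0; i < n; i++)
    if (recs[i].kind == REC_DECL_TAG && recs[i].type_id == target_id)
      count++;
  return count;
}

/* Return 1 if record target_id carries a declaration-level tag whose
   string equals tag, and 0 otherwise. */
int has_decl_tag(const struct rec *recs, size_t n, int target_id,
                 const char *tag)
{
  assert(recs != NULL && tag != NULL);
  for (size_t i = 0; i < n; i++)
    if (recs[i].kind == REC_DECL_TAG
        && recs[i].type_id == target_id
        && recs[i].component_idx == -1
        && strcmp(recs[i].name, tag) == 0)
      return 1;
  return 0;
}
```

In this model a consumer asks for the tags attached to VAR 'c' (ID 6):
count_decl_tags(example_btf, 6, 6) finds the two DECL_TAG records, and
has_decl_tag(example_btf, 6, 6, "devicemem") reports a match. Each tag is
its own record pointing back at the declaration, which is why no
mini-language is needed to separate multiple tags.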