Re: [PATCH v2][makedumpfile 08/14] Implement kernel btf resolving

Tao Liu Tue, 23 Dec 2025 14:23:01 -0800

Hi Kazu,

On Tue, Dec 23, 2025 at 8:38 PM HAGIO KAZUHITO(萩尾　一仁)
<[email protected]> wrote:
>
> On 2025/12/03 9:52, Tao Liu wrote:
> > Hi Stephen,
> >
> > Thanks for your comments!
> >
> > On Wed, Dec 3, 2025 at 11:00 AM Stephen Brennan
> > <[email protected]> wrote:
> >>
> >> Tao Liu <[email protected]> writes:
> > >>> This patch parses the kernel's BTF data. The data is located via the
> > >>> __start_BTF and __stop_BTF symbols, which were resolved by the kallsyms
> > >>> code in the previous patch.
> > >>>
> > >>> The BTF data is organized as follows: each kernel module, and vmlinux
> > >>> itself, has a btf_file struct that describes its BTF data and holds an
> > >>> array of struct name_entry. Given a BTF type id, its name_entry can be
> > >>> resolved quickly by array index. The name_entry structs are also
> > >>> organized in a hash table, so given a type name, its name_entry can be
> > >>> resolved quickly as well. In other words, lookups by both BTF type id
> > >>> and BTF type name are fast. Once we have the name_entry of a BTF type,
> > >>> we can easily resolve its members, size, etc.
> > >>>
> > >>> Every name_entry array starts from index 0, so an array index alone
> > >>> cannot identify a BTF type globally. A unique id is therefore used; its
> > >>> value is accumulated from the position of the btf_file within
> > >>> btf_file_array and the total number of BTF types in each btf_file.
> >>>
> >>> Signed-off-by: Tao Liu <[email protected]>
> >>
> >> Hi Tao,
> >>
> >> I haven't read this patch in detail, but I wanted to ask if you had any
> >> particular reason to avoid using libbpf for parsing the BTF?
> >
> > Actually I manually parse BTF data due to the following concerns:
> >
> > 1) Learn the BTF data structures by parsing them manually. I haven't
> > worked on BTF before, and I prefer doing things from scratch in order
> > to learn as fast and as deeply as possible. I hadn't done any
> > BTF-related project prior to this one, so I started with manual
> > parsing and kept it in the patchset.
> >
> > 2) Keep everything under control. Before actually working on this, I
> > knew that the BTF/kallsyms parsing would run in the 2nd kernel, so
> > it should strike a good balance between speed and memory
> > consumption. So I created 2 ways of resolving BTF types: a) given the
> > id of a BTF type, resolve it by array index (some metadata of the BTF
> > types is organized in an array, so it is easy to resolve the
> > corresponding BTF type from the metadata); b) given the name of a BTF
> > type, resolve it by hash table. Like I said, only the necessary data is
> > stored in memory; the rest is left in files on disk to be read when
> > needed. I'm not saying I did a better job than libbpf, but my approach
> > is more white-box to me and easy to improve.
> >
> > 3) I'm happy to improve the patchset. If libbpf is better in the 2nd
> > kernel + eppic case, I can replace all of this with libbpf APIs.
> >
> >>
> >> When I implemented BTF parsing for drgn I was not aware of libbpf, so
> >> I did the parsing entirely manually, similar to this. But for my recent
> >> attempt[1] at adding BTF support to makedumpfile (unaware of this patch), I
> >> used libbpf and found that it did reduce the amount of required code by
> >> a lot. These are specific functions I found useful:
> >>
> >> - btf__new() -> returns a "struct btf *" for use with the library
> >> - btf__type_by_id() -> returns the "struct btf_type *". Then, the
> >>    linux/btf.h header provides several inline functions to work with it:
> >>    - btf_members(), btf_vlen()
> >>    - btf_member_bit_offset()
> >> - btf__pointer_size()
> >> - btf__name_by_offset() -> returns the name for a string offset
> >> - btf__find_by_name() & btf__find_by_name_kind() -> looks up a type by
> >>    name. Note that this does a linear search, so your hash table approach
> >>    is better for random lookups.
> >> - btf__resolve_size()
> >>
> >> So with libbpf, the only thing you'd really need to implement is
> >> resolving member offsets (supporting anonymous structs, recursively,
> >> etc), and a hash table for string lookups, if you feel it's necessary.
> >> For module BTF, btf__new_split() is available as well.
> >
> > Thanks for providing the info, I can give it a try and do a
> > performance/memory consumption measurement. In the meantime, I'm keen
>
> Thank you for the information and thoughts.
>
> Considering maintenance, I think it would be better to use the
> library if possible.  It's hard for anyone other than the author to
> maintain from-scratch code and follow the changes in BTF.  If you don't
> want to use the library, please give us performance/memory consumption
> data or something else to convince us that the library cannot be used.
>


Thanks for your comments. I will give the library a try and see which
approach has better performance/memory balance in v3.

Thanks,
Tao Liu


> Thanks,
> Kazu
>
> > for any suggestions on the whole BTF/kallsyms + eppic approach for
> > makedumpfile customization. I guess there are cases other than GPU mm
> > filtering that call for extending the current page-flag filtering of
> > makedumpfile.
> >
> > Thanks,
> > Tao Liu
> >
> >>
> >> Thanks,
> >> Stephen
> >>
> >> [1]: 
> >> https://github.com/brenns10/makedumpfile/commit/98129399a3a10ae72408bc4aaec2485f7d220626#diff-e1e6bd59ae956df7d2d42ad3580bd3c04da533f1b90827a04e2cc27bbf24b2a7
> >>
