Hi Kazu,

On Tue, Dec 23, 2025 at 8:38 PM HAGIO KAZUHITO(萩尾 一仁) <[email protected]> wrote:
>
> On 2025/12/03 9:52, Tao Liu wrote:
> > Hi Stephen,
> >
> > Thanks for your comments!
> >
> > On Wed, Dec 3, 2025 at 11:00 AM Stephen Brennan
> > <[email protected]> wrote:
> >>
> >> Tao Liu <[email protected]> writes:
> >>> This patch will parse kernel's btf data. The data can be located via
> >>> __start_BTF and __stop_BTF symbols which have been resolved by kallsyms of
> >>> the previous patch.
> >>>
> >>> The btf data is organized as follows: each one of kernel modules,
> >>> including
> >>> vmlinux itself, will have a btf_file struct describing its btf data, and
> >>> holding an array of struct name_entry. By given the btf type id, we can
> >>> resolve
> >>> its name_entry fast by array index. In addition, name_entry can also be
> >>> organized in hash table. So given a type name, we can also resolve its
> >>> name_entry in a fast speed. In other words, both btf type id and btf
> >>> type name can we get it resolved fast. Once we get the name_entry
> >>> structure of
> >>> the btf type, we can resolve its member/size etc easily.
> >>>
> >>> Since all name_entry array starting from index 0, which cannot identify a
> >>> btf
> >>> type globally. So a uniq id is used, its value is accumulated by the
> >>> location
> >>> of the btf_file within btf_file_array and total quantity of btf types
> >>> within
> >>> each btf_file.
> >>>
> >>> Signed-off-by: Tao Liu <[email protected]>
> >>
> >> Hi Tao,
> >>
> >> I haven't read this patch in detail, but I wanted to ask if you had any
> >> particular reason to avoid using libbpf for parsing the BTF?
> >
> > Actually I manually parse BTF data due to the following concerns:
> >
> > 1) Learn BTF data structure by manually parsing. I haven't worked on
> > BTF previously, and I prefer the way of doing things from scratch in
> > order to learn as fast and as deeply. I didn't do any project prior to
> > this one relating to BTF.
> > So I started manual parsing and kept it in
> > the patchset.
> >
> > 2) Let everything be under control. Before actually working on this, I
> > know that the BTF/kallsyms parsing will be running in 2nd kernel, so
> > it should have a good balance of 1) as fast, 2) less memory
> > consumption. So I created 2 ways for BTF resolving: 1) Given id of btf
> > type, resolve it by array index(some meta data of BTF types are
> > organized in array, so easy to resolve the corresponding BTF type by
> > the meta data); 2) Given name of btf type, resolve it by hash table.
> > Like I said, only necessary data is stored within memory, others are
> > left as disk files to read when needed. I'm not saying I did a better
> > job than libbpf, my approach is more white-box to me, and easy to
> > improve.
> >
> > 3) I'm happy to improve the patchset, if libbpf is better in 2nd
> > kernel + eppic case, I can replace all those into libbpf APIs.
> >
> >>
> >> For my implementation of BTF parsing for drgn I was not aware of it, so
> >> I did the parsing entirely manually, similar to this. But for my recent
> >> attempt[1] at adding BTF support to makedumpfile (unaware of this patch), I
> >> used libbpf and found that it did reduce the amount of required code by
> >> a lot. These are specific functions I found useful:
> >>
> >> - btf__new() -> returns a "struct btf *" for use with the library
> >> - btf__type_by_id() -> returns the "struct btf_type *". Then, the
> >>   linux/btf.h header provides several inline functions to work with it:
> >>   - btf_members(), btf_vlen()
> >>   - btf_member_bit_offset()
> >> - btf__pointer_size()
> >> - btf__name_by_offset() -> returns the name for a string offset
> >> - btf__find_by_name() & btf__find_by_name_kind() -> looks up a type by
> >>   name. Note that this does a linear search, so your hash table approach
> >>   is better for random lookups.
> >> - btf__resolve_size()
> >>
> >> So with libbpf, the only thing you'd really need to implement is
> >> resolving member offsets (supporting anonymous structs, recursively,
> >> etc), and a hash table for string lookups, if you feel it's necessary.
> >> For module BTF, btf__new_split() is available as well.
> >
> > Thanks for providing the info, I can give it a try and do a
> > performance/memory consumption measurement. In the meantime, I'm keen
>
> Thank you for the information and thoughts.
>
> Considering its maintenance, I think it would be better to use the
> library if possible. It's hard for others than the author to maintain
> scratch code and follow the changes of BTF. If you don't want to use
> the library, please give us performance/memory consumption data or
> something to convince that the library cannot be used.
>
Thanks for your comments. I will give the library a try and see which
approach has better performance/memory balance in v3.

Thanks,
Tao Liu

> Thanks,
> Kazu
>
> > for any suggestions on the whole BTF/kallsyms + eppic approach for
> > makedumpfile customization. I guess there are cases other than GPU mm
> > filtering which are demanding for extending the current page-flag
> > filtering of makedumpfile.
> >
> > Thanks,
> > Tao Liu
> >
> >>
> >> Thanks,
> >> Stephen
> >>
> >> [1]:
> >> https://github.com/brenns10/makedumpfile/commit/98129399a3a10ae72408bc4aaec2485f7d220626#diff-e1e6bd59ae956df7d2d42ad3580bd3c04da533f1b90827a04e2cc27bbf24b2a7
> >>
