On 2025/12/03 9:52, Tao Liu wrote:
> Hi Stephen,
>
> Thanks for your comments!
>
> On Wed, Dec 3, 2025 at 11:00 AM Stephen Brennan
> <[email protected]> wrote:
>>
>> Tao Liu <[email protected]> writes:
>>> This patch will parse kernel's btf data. The data can be located via
>>> __start_BTF and __stop_BTF symbols which have been resolved by kallsyms of
>>> the previous patch.
>>>
>>> The btf data is organized as follows: each one of kernel modules, including
>>> vmlinux itself, will have a btf_file struct describing its btf data, and
>>> holding an array of struct name_entry. By given the btf type id, we can
>>> resolve its name_entry fast by array index. In addition, name_entry can
>>> also be organized in hash table. So given a type name, we can also resolve
>>> its name_entry in a fast speed. In other words, both btf type id and btf
>>> type name can we get it resolved fast. Once we get the name_entry structure
>>> of the btf type, we can resolve its member/size etc easily.
>>>
>>> Since all name_entry array starting from index 0, which cannot identify a
>>> btf type globally. So a uniq id is used, its value is accumulated by the
>>> location of the btf_file within btf_file_array and total quantity of btf
>>> types within each btf_file.
>>>
>>> Signed-off-by: Tao Liu <[email protected]>
>>
>> Hi Tao,
>>
>> I haven't read this patch in detail, but I wanted to ask if you had any
>> particular reason to avoid using libbpf for parsing the BTF?
>
> Actually I manually parse BTF data due to the following concerns:
>
> 1) Learn BTF data structure by manually parsing. I haven't worked on
> BTF previously, and I prefer the way of doing things from scratch in
> order to learn as fast and as deeply. I didn't do any project prior to
> this one relating to BTF. So I started manual parsing and kept it in
> the patchset.
>
> 2) Let everything be under control. Before actually working on this, I
> know that the BTF/kallsyms parsing will be running in 2nd kernel, so
> it should have a good balance of 1) as fast, 2) less memory
> consumption. So I created 2 ways for BTF resolving: 1) Given id of btf
> type, resolve it by array index(some meta data of BTF types are
> organized in array, so easy to resolve the corresponding BTF type by
> the meta data); 2) Given name of btf type, resolve it by hash table.
> Like I said, only necessary data is stored within memory, others are
> left as disk files to read when needed. I'm not saying I did a better
> job than libbpf, my approach is more white-box to me, and easy to
> improve.
>
> 3) I'm happy to improve the patchset, if libbpf is better in 2nd
> kernel + eppic case, I can replace all those into libbpf APIs.
>
>>
>> For my implementation of BTF parsing for drgn I was not aware of it, so
>> I did the parsing entirely manually, similar to this. But for my recent
>> attempt[1] at adding BTF support to makedumpfile (unaware of this patch), I
>> used libbpf and found that it did reduce the amount of required code by
>> a lot. These are specific functions I found useful:
>>
>> - btf__new() -> returns a "struct btf *" for use with the library
>> - btf__type_by_id() -> returns the "struct btf_type *". Then, the
>>   linux/btf.h header provides several inline functions to work with it:
>>   - btf_members(), btf_vlen()
>>   - btf_member_bit_offset()
>> - btf__pointer_size()
>> - btf__name_by_offset() -> returns the name for a string offset
>> - btf__find_by_name() & btf__find_by_name_kind() -> looks up a type by
>>   name. Note that this does a linear search, so your hash table approach
>>   is better for random lookups.
>> - btf__resolve_size()
>>
>> So with libbpf, the only thing you'd really need to implement is
>> resolving member offsets (supporting anonymous structs, recursively,
>> etc), and a hash table for string lookups, if you feel it's necessary.
>> For module BTF, btf__new_split() is available as well.
>
> Thanks for providing the info, I can give it a try and do a
> performance/memory consumption measurement.

Thank you for the information and thoughts.

Considering maintenance, I think it would be better to use the library
if possible. It is hard for anyone other than the author to maintain
from-scratch code and keep up with changes to BTF.

If you don't want to use the library, please share performance/memory
consumption data, or something else that shows convincingly that the
library cannot be used.

Thanks,
Kazu

> In the meantime, I'm keen for any suggestions on the whole
> BTF/kallsyms + eppic approach for makedumpfile customization. I guess
> there are cases other than GPU mm filtering which are demanding for
> extending the current page-flag filtering of makedumpfile.
>
> Thanks,
> Tao Liu
>
>>
>> Thanks,
>> Stephen
>>
>> [1]:
>> https://github.com/brenns10/makedumpfile/commit/98129399a3a10ae72408bc4aaec2485f7d220626#diff-e1e6bd59ae956df7d2d42ad3580bd3c04da533f1b90827a04e2cc27bbf24b2a7
>>
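
For readers following the thread, the lookup scheme described in the quoted patch text (array index for type ids, hash table for type names, and a global "uniq id" accumulated across BTF files) can be sketched roughly as below. This is not code from the patch. The struct layouts, the function names (btf_file_init, lookup_by_uniq_id, lookup_by_name), the djb2-style hash and the fixed bucket count are all assumptions made for illustration, and the uniq id is assumed to be the sum of the type counts of all preceding files plus the local index.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_BUCKETS 64 /* assumed size, for illustration only */

/* Hypothetical stand-in for the patch's struct name_entry. */
struct name_entry {
	const char *name;
	uint32_t local_id;       /* index within its own btf_file */
	uint32_t uniq_id;        /* assumed: base_id of the file + local_id */
	struct name_entry *next; /* hash chain link */
};

/* Hypothetical stand-in for the patch's struct btf_file. */
struct btf_file {
	const char *module;
	uint32_t base_id;           /* total types in all preceding files */
	uint32_t nr_types;
	struct name_entry *entries; /* indexed by local type id */
};

static struct name_entry *buckets[HASH_BUCKETS];

/* djb2-style string hash, reduced to a bucket index. */
static unsigned int hash_name(const char *s)
{
	unsigned long h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return (unsigned int)(h % HASH_BUCKETS);
}

/* Fill one btf_file and add each of its entries to the name hash table. */
static int btf_file_init(struct btf_file *f, const char *module,
			 uint32_t base_id, const char *const *names,
			 uint32_t n)
{
	uint32_t i;

	assert(f != NULL && names != NULL);
	f->entries = calloc(n, sizeof(*f->entries));
	if (!f->entries)
		return -1;
	f->module = module;
	f->base_id = base_id;
	f->nr_types = n;
	for (i = 0; i < n; i++) {
		struct name_entry *e = &f->entries[i];
		unsigned int b = hash_name(names[i]);

		e->name = names[i];
		e->local_id = i;
		e->uniq_id = base_id + i;
		e->next = buckets[b]; /* insert at head of the chain */
		buckets[b] = e;
	}
	return 0;
}

/* Id path: find the owning file, then index directly into its array. */
static struct name_entry *lookup_by_uniq_id(struct btf_file *files,
					    size_t nr_files, uint32_t uniq_id)
{
	size_t i;

	for (i = 0; i < nr_files; i++) {
		struct btf_file *f = &files[i];

		if (uniq_id >= f->base_id && uniq_id - f->base_id < f->nr_types)
			return &f->entries[uniq_id - f->base_id];
	}
	return NULL;
}

/* Name path: walk the hash chain of the bucket the name maps to. */
static struct name_entry *lookup_by_name(const char *name)
{
	struct name_entry *e;

	for (e = buckets[hash_name(name)]; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}
```

A caller would initialise the first file with base_id 0 and each later file with the previous file's base_id + nr_types, so that ids stay unique across vmlinux and modules. If the same type name appears in more than one file, this sketch returns the most recently inserted entry; how the patch handles that case is not stated in the thread.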
