Hi Liu

I'm sorry for the late reply.

Your proposal for filtering GPU memory is useful, but it may not be a
frequently used feature. I think it would be better to provide it through
an external library, script, or option whenever possible.

As you already know, I am concerned about porting the code related to btf,
kallsyms and maple tree. These newer features are appealing. However, I
feel that sharing code with crash will increase the future maintenance
effort if we try to keep the two copies consistent. On the other hand, if
they are not identical, the code will be maintained less often and the
feature may not be updated.

I also think that the eppic scripts would be better treated as additional
scripts. Although some time has passed since the eppic function was added,
few new scripts have been added and updates to supported versions have
been limited.

I think it would be better to add documentation introducing the script as
a related script rather than including it directly in the makedumpfile
repository.

However, this proposal also includes bug fixes and speedups for the
existing eppic functions. I think those changes need to be incorporated.

In other words, I think it is better to separate whatever can be
separated.

Sincerely

On 2025/08/11 9:04, Tao Liu wrote:
> Hi YAMAZAKI,
>
> Thanks for your comments.
>
> On Thu, Aug 7, 2025 at 11:42 PM YAMAZAKI MASAMITSU(山崎 真光)
> <yamazaki-m...@nec.com> wrote:
>> Thank you for the suggestion.
>> I think it's a good idea,
>> but eppic needs careful consideration.
> Do you mean there are drawbacks of using eppic?
>
> From my side, one advantage of eppic is that users can use kernel data
> structures/variables directly without redefining them, like:
>
> struct task_struct *p;
> p = (struct task_struct *)&init_task;
>
> Users don't need to include headers to define what task_struct looks
> like, and can use global variable init_task without declaring it. The
> callbacks of eppic can resolve these missing info via
> dwarf/btf/kallsyms during runtime. I think this feature can make the
> eppic scripts tidy and convenient. We could use other similar C
> interpreters, but I haven't encountered one with a similar feature.
> Also, eppic has been in makedumpfile for years, so people may already
> be familiar with it...
>
>> I'm sorry, but please let me check for a moment.
>>
> Sure, no problem, please take your time.
>
> Thanks,
> Tao Liu
>
>> Thanks,
>> Masa
>>
>> On 2025/08/05 12:16, Tao Liu wrote:
>>> Kindly ping...
>>>
>>> Any comments for this patchset?
>>>
>>> Thanks,
>>> Tao Liu
>>>
>>>
>>> On Tue, Jun 10, 2025 at 9:57 PM Tao Liu <l...@redhat.com> wrote:
>>>> A) This patchset will introduce the following features to makedumpfile:
>>>>
>>>>     1) Enable eppic script for memory pages filtering.
>>>>     2) Enable btf and kallsyms for symbol type and address resolving.
>>>>     3) Port maple tree data structures and functions, primarily used for
>>>>        vma iteration.
>>>>
>>>> B) The purposes of the features are:
>>>>
>>>>     1) Currently makedumpfile filters mm pages based on page flags,
>>>>        because flags can help to determine a page's usage. But this
>>>>        page-flag-checking method lacks flexibility in certain cases,
>>>>        e.g. if we want to filter the mm pages occupied by the GPU
>>>>        during vmcore dumping because:
>>>>
>>>>        a) The GPU may take up a large amount of memory that contains
>>>>           sensitive data;
>>>>        b) GPU mm pages have no relation to the kernel crash and are
>>>>           useless for vmcore analysis.
>>>>
>>>>        But there are no GPU-specific mm page flags, and apparently we
>>>>        don't need to create one just for kdump use. A programmable
>>>>        filtering tool is more suitable for such cases. In addition,
>>>>        different GPU vendors may allocate mm pages in different ways,
>>>>        so programmable filtering is better than hard-coding this
>>>>        GPU-specific logic into makedumpfile.
>>>>
>>>>     2) Currently makedumpfile already contains a programmable filtering
>>>>        tool, aka eppic script, which allows users to write customized
>>>>        code for data erasing. However, it has the following drawbacks:
>>>>
>>>>        a) It cannot do mm page filtering.
>>>>        b) It needs access to the debuginfo of both the kernel and the
>>>>           modules, which is not applicable in the 2nd kernel.
>>>>        c) Poor performance, making vmcore dumping time unacceptable (see
>>>>           the following performance testing).
>>>>
>>>>        makedumpfile needs to resolve the dwarf data from debuginfo to get
>>>>        symbol types and addresses. In recent kernels there are dwarf
>>>>        alternatives such as btf/kallsyms which can be used for this
>>>>        purpose. And btf/kallsyms info is already packed within the
>>>>        vmcore, so we can use it directly.
>>>>
>>>>     3) Maple tree data structures are used in recent kernels, e.g. for
>>>>        vma iteration. So maple tree porting is needed.
>>>>
>>>>     With these, this patchset introduces an upgraded eppic, which is
>>>>     based on btf/kallsyms symbol resolving, and is programmable for mm
>>>>     page filtering. The following info shows its usage and performance;
>>>>     please note the tests are performed in the 1st kernel:
>>>>
>>>>     $ time ./makedumpfile -d 31 -l \
>>>>       /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore \
>>>>       /tmp/dwarf.out -x \
>>>>       /lib/debug/lib/modules/6.11.8-300.fc41.x86_64/vmlinux \
>>>>       --eppic eppic_scripts/filter_amdgpu_mm_pages.c
>>>>           real    14m6.894s
>>>>           user    4m16.900s
>>>>           sys     9m44.695s
>>>>
>>>>     $ time ./makedumpfile -d 31 -l \
>>>>       /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore \
>>>>       /tmp/btf.out --eppic eppic_scripts/filter_amdgpu_mm_pages.c
>>>>           real    0m10.672s
>>>>           user    0m9.270s
>>>>           sys     0m1.130s
>>>>
>>>>     -rw------- 1 root root 367475074 Jun 10 18:06 btf.out
>>>>     -rw------- 1 root root 367475074 Jun 10 21:05 dwarf.out
>>>>     -rw-rw-rw- 1 root root 387181418 Jun 10 18:03 /var/crash/127.0.0.1-2025-06-10-18:03:12/vmcore
>>>>
>>>> C) Discussion:
>>>>
>>>>     1) GPU types: Currently only tested with amdgpu's mm page filtering;
>>>>        others are not tested.
>>>>     2) Code structure: Some similar code is shared by makedumpfile and
>>>>        crash, such as the maple tree data structures. I also plan to
>>>>        port the btf/kallsyms code to crash, so there will be code
>>>>        duplication between crash and makedumpfile. Since I haven't
>>>>        worked on the crash porting yet, code changes to btf/kallsyms
>>>>        are expected. How should we share the code: create a common
>>>>        library, or keep the duplication as it is?
>>>>     3) OS: The code works on rhel-10+/rhel9.5+ on
>>>>        x86_64/arm64/s390/ppc64. Others are not tested.
>>>>
>>>> D) Testing:
>>>>
>>>>     1) If you don't want to create your own vmcore, you can find a
>>>>        vmcore which I created with amdgpu mm pages unfiltered [1]; the
>>>>        amdgpu mm pages are allocated by program [2]. You can use the
>>>>        vmcore in the 1st kernel to filter the amdgpu mm pages with the
>>>>        previous performance testing cmdline. To verify the pages are
>>>>        filtered in crash:
>>>>
>>>>        Unfiltered:
>>>>        crash> search -c "!QAZXSW@#EDC"
>>>>        ffff96b7fa800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>>>>        ffff96b87c800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>>>>        crash> rd ffff96b7fa800000
>>>>        ffff96b7fa800000:  405753585a415121                    !QAZXSW@
>>>>        crash> rd ffff96b87c800000
>>>>        ffff96b87c800000:  405753585a415121                    !QAZXSW@
>>>>
>>>>        Filtered:
>>>>        crash> search -c "!QAZXSW@#EDC"
>>>>        crash> rd ffff96b7fa800000
>>>>        rd: page excluded: kernel virtual address: ffff96b7fa800000  type: "64-bit KVADDR"
>>>>        crash> rd ffff96b87c800000
>>>>        rd: page excluded: kernel virtual address: ffff96b87c800000  type: "64-bit KVADDR"
>>>>
>>>>     2) You can use eppic_scripts/print_all_vma.c against an ordinary
>>>>        vmcore to test only the btf/kallsyms functions by outputting all
>>>>        VMAs if no amdgpu vmcore/machine is available.
>>>>
>>>> [1]: https://people.redhat.com/~ltao/core/
>>>> [2]: https://gist.github.com/liutgnu/a8cbce1c666452f1530e1410d1f352df
>>>>
>>>> Tao Liu (10):
>>>>     dwarf_info: Support kernel address randomization
>>>>     dwarf_info: Fix a infinite recursion bug for search_domain
>>>>     Add page filtering function
>>>>     Add btf/kallsyms support for symbol type/address resolving
>>>>     Export necessary btf/kallsyms functions to eppic extension
>>>>     Port the maple tree data structures and functions
>>>>     Supporting main() as the entry of eppic script
>>>>     Enable page filtering for dwarf eppic
>>>>     Enable page filtering for btf/kallsyms eppic
>>>>     Introducing 2 eppic scripts to test the dwarf/btf eppic extension
>>>>
>>>>    Makefile                               |   6 +-
>>>>    btf.c                                  | 919 +++++++++++++++++++++++++
>>>>    btf.h                                  | 176 +++++
>>>>    dwarf_info.c                           |  15 +-
>>>>    eppic_maple.c                          | 431 ++++++++++++
>>>>    eppic_maple.h                          |   8 +
>>>>    eppic_scripts/filter_amdgpu_mm_pages.c |  36 +
>>>>    eppic_scripts/print_all_vma.c          |  29 +
>>>>    erase_info.c                           | 123 +++-
>>>>    erase_info.h                           |  22 +
>>>>    extension_btf.c                        | 218 ++++++
>>>>    extension_eppic.c                      |  41 +-
>>>>    extension_eppic.h                      |   6 +-
>>>>    kallsyms.c                             | 371 ++++++++++
>>>>    kallsyms.h                             |  42 ++
>>>>    makedumpfile.c                         |  21 +-
>>>>    makedumpfile.h                         |  11 +
>>>>    17 files changed, 2448 insertions(+), 27 deletions(-)
>>>>    create mode 100644 btf.c
>>>>    create mode 100644 btf.h
>>>>    create mode 100644 eppic_maple.c
>>>>    create mode 100644 eppic_maple.h
>>>>    create mode 100644 eppic_scripts/filter_amdgpu_mm_pages.c
>>>>    create mode 100644 eppic_scripts/print_all_vma.c
>>>>    create mode 100644 extension_btf.c
>>>>    create mode 100644 kallsyms.c
>>>>    create mode 100644 kallsyms.h
>>>>
>>>> --
>>>> 2.47.0
>>>>
--
Crash-utility mailing list -- devel@lists.crash-utility.osci.io
To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io
Contribution Guidelines: https://github.com/crash-utility/crash/wiki
