Hi guys,

Yeah, that is a well-known issue, but it is actually completely harmless.

What happens is that a trace function accesses a stale pointer to print some additional value into the trace log.

That memory might have been reused and the information is now outdated, but the worst thing that can happen is that the value in the logs is nonsense.
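
The pattern described above can be sketched in user-space C. This is only an illustration under assumptions: struct resource, struct buffer, do_move, trace_move and move_and_trace are hypothetical stand-ins, not the kernel's TTM or amdgpu code, and copying the value before the move is just one way to avoid the stale read, not necessarily what the queued patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory resource (not struct ttm_resource). */
struct resource {
    unsigned int mem_type;
};

/* Hypothetical stand-in for a buffer object that owns its current resource. */
struct buffer {
    struct resource *res;
};

/* Stand-in for a tracepoint: it only logs the two values it is given. */
static void trace_move(unsigned int new_type, unsigned int old_type)
{
    printf("move: new_type=%u old_type=%u\n", new_type, old_type);
}

/* Gives the buffer a new resource and frees the old one. */
static int do_move(struct buffer *buf, unsigned int new_type)
{
    struct resource *new_res = malloc(sizeof(*new_res));

    if (!new_res)
        return -1;
    new_res->mem_type = new_type;
    free(buf->res);      /* any saved pointer to the old resource is stale now */
    buf->res = new_res;
    return 0;
}

/* Returns the old memory type that was logged, or -1 if the move failed. */
int move_and_trace(struct buffer *buf, unsigned int new_type)
{
    unsigned int old_type;

    assert(buf && buf->res);
    /* Copy the value while the old resource is still alive ... */
    old_type = buf->res->mem_type;

    if (do_move(buf, new_type) != 0)
        return -1;

    /* ... so the trace call never dereferences the freed pointer. */
    trace_move(buf->res->mem_type, old_type);
    return (int)old_type;
}
```

Had move_and_trace kept a pointer to the old resource and read mem_type from it after do_move, the read would hit freed memory, which is the kind of access KASAN flags in the report quoted below.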

I have a patch in the queue to fix this; it should be upstream and backported in the next few weeks.

Regards,
Christian.

On 29.04.24 at 04:15, Joonkyo Jung wrote:
Hi,

Thank you for patching two of the bugs we have reported!
I was just wondering if there's any news on the one other bug we have reported:
BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550.

I see that a GitLab issue (https://gitlab.freedesktop.org/drm/amd/-/issues/3171) was created for this bug, and that Christian made a patch (https://lists.freedesktop.org/archives/amd-gfx/2024-March/105680.html) for it. However, it seems the issue is not resolved yet and the patch has not been pushed to the mainline branches. Do you have any plans for pushing this patch? If so, would it be possible for us to get a Reported-by tag on it?

Best,
Joonkyo

On Fri, Mar 8, 2024 at 4:32 PM Joonkyo Jung <joonk...@yonsei.ac.kr> wrote:

    Hi Vitaly,

    No worries, thank you for working on the patches!

    I have also confirmed that with the inflight patch, issue No.1
    (use-after-free) seems to be resolved.
    However, I have reproduced issue No.3 (slab-use-after-free) even
    with the patch for issue No.1 applied - if it's the first program
    tested after reboot.
    (i.e., if any other bugs are tested before the
    slab-use-after-free, it does not reproduce).

    Could you check if the bug reproduces in this condition for you too?
    I will check and see why this is happening and update you if I
    have something new.

    Thank you!

    Best,
    Joonkyo



    On Fri, Mar 8, 2024 at 12:45 PM vitaly prosyak <vpros...@amd.com>
    wrote:

        Hi Joonkyo,
        Sorry for the delay.
        Yes, sure, I reproduced issue 2 (null-ptr-deref in amdgpu) and
        I will provide the fix soon.
        However, issue No. 3 is no longer reproducible once the recent
        in-flight patch that fixes issue No. 1 is applied.

        Do you see the same behavior?

        Thanks in advance, Vitaly

        On 2024-03-07 20:18, Joonkyo Jung wrote:
        Hello,
        Thank you for patching the first bug we sent!

        This is just a quick check-in to ask whether there has been
        any update on our other two bugs.
        They were sent in emails titled
        "Reporting a slab-use-after-free in amdgpu" (this one) and
        "Reporting a null-ptr-deref in amdgpu".

        Thank you!

        Best,
        Joonkyo


        On Fri, Feb 16, 2024 at 6:22 PM Joonkyo Jung
        <joonk...@yonsei.ac.kr> wrote:

            Hello,

            We would like to report a slab-use-after-free bug in the
            AMDGPU DRM driver in Linux kernel v6.8-rc4, which we
            found with our customized Syzkaller.
            The bug can be triggered by sending two ioctls to the
            AMDGPU DRM driver in succession.

            amdgpu_bo_move saves the current resource at entry with
            struct ttm_resource *old_mem = bo->resource.
            As the allocation and free stacks below show, within that
            same call amdgpu_move_blit ends up freeing bo->resource in
            ttm_bo_move_accel_cleanup via ttm_bo_wait_free_node(bo,
            man->use_tt).
            amdgpu_bo_move then continues and finally calls
            trace_amdgpu_bo_move(abo, new_mem->mem_type,
            old_mem->mem_type), which reads old_mem->mem_type from the
            freed resource and triggers the use-after-free.

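            Steps to reproduce are as below (fd is an open file
            descriptor for the AMDGPU DRM device).
            union drm_amdgpu_gem_create *arg1;

            arg1 = malloc(sizeof(union drm_amdgpu_gem_create));
            /* First ioctl: create a small buffer object. */
            arg1->in.bo_size = 0x8;
            arg1->in.alignment = 0x0;
            arg1->in.domains = 0x4;      /* AMDGPU_GEM_DOMAIN_VRAM */
            arg1->in.domain_flags = 0x9; /* CPU_ACCESS_REQUIRED | VRAM_CLEARED */
            ioctl(fd, 0xc0206440, arg1); /* DRM_IOCTL_AMDGPU_GEM_CREATE */

            /* Second ioctl: same settings, but a buffer of almost 2 GiB. */
            arg1->in.bo_size = 0x7fffffff;
            arg1->in.alignment = 0x0;
            arg1->in.domains = 0x4;
            arg1->in.domain_flags = 0x9;
            ioctl(fd, 0xc0206440, arg1);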
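
For readers decoding the request number 0xc0206440 used in the reproducer, here is a minimal sketch. It assumes the generic Linux _IOWR encoding (as on x86; some architectures use different direction bits), the DRM ioctl type 'd', command number 0x40 and a 32-byte argument. struct gem_create_in and gem_create_request are hypothetical local names that stand in for the real union from the amdgpu UAPI header.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>   /* _IOWR and the _IOC_* decoding macros on Linux */

/* Hypothetical local mirror of the input layout: four 64-bit fields. */
struct gem_create_in {
    uint64_t bo_size;
    uint64_t alignment;
    uint64_t domains;
    uint64_t domain_flags;
};

/* The size field of the request number encodes sizeof(argument) == 0x20. */
static_assert(sizeof(struct gem_create_in) == 32, "argument must be 32 bytes");

/* Rebuilds the request: read/write direction, 32-byte size, type 'd', nr 0x40. */
unsigned long gem_create_request(void)
{
    return (unsigned long)_IOWR('d', 0x40, struct gem_create_in);
}
```

Under those assumptions gem_create_request() yields the same value that is passed as the second argument to ioctl in the steps above.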

            The KASAN report is as follows:
            ==================================================================
            BUG: KASAN: slab-use-after-free in
            amdgpu_bo_move+0x1479/0x1550
            Read of size 4 at addr ffff88800f5bee80 by task
            syz-executor/219
            Call Trace:
             <TASK>
             amdgpu_bo_move+0x1479/0x1550
             ttm_bo_handle_move_mem+0x4d0/0x700
             ttm_mem_evict_first+0x945/0x1230
             ttm_bo_mem_space+0x6c7/0x940
             ttm_bo_validate+0x286/0x650
             ttm_bo_init_reserved+0x34c/0x490
             amdgpu_bo_create+0x94b/0x1610
             amdgpu_bo_create_user+0xa3/0x130
             amdgpu_gem_create_ioctl+0x4bc/0xc10
             drm_ioctl_kernel+0x300/0x410
             drm_ioctl+0x648/0xb30
             amdgpu_drm_ioctl+0xc8/0x160
             </TASK>

            Allocated by task 219:
             kmalloc_trace+0x211/0x390
             amdgpu_vram_mgr_new+0x1d6/0xbe0
             ttm_resource_alloc+0xfd/0x1e0
             ttm_bo_mem_space+0x255/0x940
             ttm_bo_validate+0x286/0x650
             ttm_bo_init_reserved+0x34c/0x490
             amdgpu_bo_create+0x94b/0x1610
             amdgpu_bo_create_user+0xa3/0x130
             amdgpu_gem_create_ioctl+0x4bc/0xc10
             drm_ioctl_kernel+0x300/0x410
             drm_ioctl+0x648/0xb30
             amdgpu_drm_ioctl+0xc8/0x160

            Freed by task 219:
             kfree+0x111/0x2d0
             ttm_resource_free+0x17e/0x1e0
             ttm_bo_move_accel_cleanup+0x77e/0x9b0
             amdgpu_move_blit+0x3db/0x670
             amdgpu_bo_move+0xfa2/0x1550
             ttm_bo_handle_move_mem+0x4d0/0x700
             ttm_mem_evict_first+0x945/0x1230
             ttm_bo_mem_space+0x6c7/0x940
             ttm_bo_validate+0x286/0x650
             ttm_bo_init_reserved+0x34c/0x490
             amdgpu_bo_create+0x94b/0x1610
             amdgpu_bo_create_user+0xa3/0x130
             amdgpu_gem_create_ioctl+0x4bc/0xc10
             drm_ioctl_kernel+0x300/0x410
             drm_ioctl+0x648/0xb30
             amdgpu_drm_ioctl+0xc8/0x160

            The buggy address belongs to the object at ffff88800f5bee70
             which belongs to the cache kmalloc-96 of size 96
            The buggy address is located 16 bytes inside of
             freed 96-byte region [ffff88800f5bee70, ffff88800f5beed0)

            Should you need any more information, please do not
            hesitate to contact us.

            Best regards,
            Joonkyo Jung
