Found and fixed. If nobody has any more objections are batter names for the new flags I'm going to push this to amd-staging-drm-next tomorrow.

Thanks,
Christian.

Am 17.05.22 um 08:33 schrieb Christian König:
Ok that sounds like a rather simple bug. I will try to take a look.

Thanks,
Christian.

Am 17.05.22 um 02:12 schrieb Marek Olšák:
Dmesg doesn't contain anything. There is no backtrace because it's not a crash. The VA map ioctl just fails with the new flag. It looks like the flag is considered invalid.

Marek

On Mon., May 16, 2022, 12:13 Christian König, <ckoenig.leichtzumer...@gmail.com> wrote:

    I don't have access to any gfx10 hardware.

    Can you give me a dmesg and/or backtrace, etc..?

    I can't push this unless it's working properly.

    Christian.

    Am 16.05.22 um 14:56 schrieb Marek Olšák:
    Reproduction steps:
    - use mesa/main on gfx10.3 (not sure what other GPUs do)
    - run: radeonsi_mall_noalloc=true glxgears

    Marek

    On Mon, May 16, 2022 at 7:53 AM Christian König
    <ckoenig.leichtzumer...@gmail.com> wrote:

        Crap, do you have a link to the failure?

        Am 16.05.22 um 13:10 schrieb Marek Olšák:
        I forgot to say: The NOALLOC flag causes an allocation
        failure, so there is a kernel bug somewhere.

        Marek

        On Mon, May 16, 2022 at 7:06 AM Marek Olšák
        <mar...@gmail.com> wrote:

            FYI, I think it's time to merge this because the Mesa
            commits are going to be merged in ~30 minutes if Gitlab
            CI is green, and that includes updated amdgpu_drm.h.

            Marek

            On Wed, May 11, 2022 at 2:55 PM Marek Olšák
            <mar...@gmail.com> wrote:

                Ok sounds good.

                Marek

                On Wed., May 11, 2022, 03:43 Christian König,
                <ckoenig.leichtzumer...@gmail.com> wrote:

                    It really *is* a NOALLOC feature. In other
                    words there is no latency improvement on reads
                    because the cache is always checked, even with
                    the noalloc flag set.

                    The only thing it affects is that misses not
                    enter the cache and so don't cause any
                    additional pressure on evicting cache lines.

                    You might want to double check with the
                    hardware guys, but I'm something like 95% sure
                    that it works this way.

                    Christian.

                    Am 11.05.22 um 09:22 schrieb Marek Olšák:
                    Bypass means that the contents of the cache
                    are ignored, which decreases latency at the
                    cost of no coherency between bypassed and
                    normal memory requests. NOA (noalloc) means
                    that the cache is checked and can give you
                    cache hits, but misses are not cached and the
                    overall latency is higher. I don't know what
                    the hw does, but I hope it was misnamed and it
                    really means bypass because there is no point
                    in doing cache lookups on every memory request
                    if the driver wants to disable caching to
                    *decrease* latency in the situations when the
                    cache isn't helping.

                    Marek

                    On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo
                    <lijo.la...@amd.com> wrote:



                        On 5/11/2022 11:36 AM, Christian König wrote:
                        > Mhm, it doesn't really bypass MALL. It
                        just doesn't allocate any MALL
                        > entries on write.
                        >
                        > How about AMDGPU_VM_PAGE_NO_MALL ?

                        One more - AMDGPU_VM_PAGE_LLC_* [ LLC =
                        last level cache, * = some sort
                        of attribute which decides LLC behaviour]

                        Thanks,
                        Lijo

                        >
                        > Christian.
                        >
                        > Am 10.05.22 um 23:21 schrieb Marek Olšák:
                        >> A better name would be:
                        >> AMDGPU_VM_PAGE_BYPASS_MALL
                        >>
                        >> Marek
                        >>
                        >> On Fri, May 6, 2022 at 7:23 AM
                        Christian König
                        >> <ckoenig.leichtzumer...@gmail.com> wrote:
                        >>
                        >>     Add the AMDGPU_VM_NOALLOC flag to
                        let userspace control MALL
                        >>     allocation.
                        >>
                        >>     Only compile tested!
                        >>
                        >>  Signed-off-by: Christian König
                        <christian.koe...@amd.com>
                        >>     ---
                        >>
                          drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
                        | 2 ++
                        >>
                          drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |
                        3 +++
                        >>
                          drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c |
                        3 +++
                        >>   include/uapi/drm/amdgpu_drm.h        
                         | 2 ++
                        >>      4 files changed, 10 insertions(+)
                        >>
                        >>     diff --git
                        a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
                        >>  b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
                        >>     index bf97d8f07f57..d8129626581f 100644
                        >>     ---
                        a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
                        >>     +++
                        b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
                        >>     @@ -650,6 +650,8 @@ uint64_t
                        amdgpu_gem_va_map_flags(struct
                        >>  amdgpu_device *adev, uint32_t flags)
                        >>     pte_flag |= AMDGPU_PTE_WRITEABLE;
                        >>             if (flags & AMDGPU_VM_PAGE_PRT)
                        >>     pte_flag |= AMDGPU_PTE_PRT;
                        >>     +       if (flags &
                        AMDGPU_VM_PAGE_NOALLOC)
                        >>     +      pte_flag |= AMDGPU_PTE_NOALLOC;
                        >>
                        >>             if
                        (adev->gmc.gmc_funcs->map_mtype)
                        >>     pte_flag |= amdgpu_gmc_map_mtype(adev,
                        >>     diff --git
                        a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
                        >>  b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
                        >>     index b8c79789e1e4..9077dfccaf3c 100644
                        >>     ---
                        a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
                        >>     +++
                        b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
                        >>     @@ -613,6 +613,9 @@ static void
                        gmc_v10_0_get_vm_pte(struct
                        >>  amdgpu_device *adev,
                        >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
                        >> *flags |= (mapping->flags &
                        AMDGPU_PTE_MTYPE_NV10_MASK);
                        >>
                        >>     +  *flags &= ~AMDGPU_PTE_NOALLOC;
                        >>     +  *flags |= (mapping->flags &
                        AMDGPU_PTE_NOALLOC);
                        >>     +
                        >>             if (mapping->flags &
                        AMDGPU_PTE_PRT) {
                        >>     *flags |= AMDGPU_PTE_PRT;
                        >>     *flags |= AMDGPU_PTE_SNOOPED;
                        >>     diff --git
                        a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
                        >>  b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
                        >>     index 8d733eeac556..32ee56adb602 100644
                        >>     ---
                        a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
                        >>     +++
                        b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
                        >>     @@ -508,6 +508,9 @@ static void
                        gmc_v11_0_get_vm_pte(struct
                        >>  amdgpu_device *adev,
                        >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
                        >> *flags |= (mapping->flags &
                        AMDGPU_PTE_MTYPE_NV10_MASK);
                        >>
                        >>     +  *flags &= ~AMDGPU_PTE_NOALLOC;
                        >>     +  *flags |= (mapping->flags &
                        AMDGPU_PTE_NOALLOC);
                        >>     +
                        >>             if (mapping->flags &
                        AMDGPU_PTE_PRT) {
                        >>     *flags |= AMDGPU_PTE_PRT;
                        >>     *flags |= AMDGPU_PTE_SNOOPED;
                        >>     diff --git
                        a/include/uapi/drm/amdgpu_drm.h
                        >>  b/include/uapi/drm/amdgpu_drm.h
                        >>     index 57b9d8f0133a..9d71d6330687 100644
                        >>     --- a/include/uapi/drm/amdgpu_drm.h
                        >>     +++ b/include/uapi/drm/amdgpu_drm.h
                        >>     @@ -533,6 +533,8 @@ struct
                        drm_amdgpu_gem_op {
                        >>      #define AMDGPU_VM_MTYPE_UC      
                         (4 << 5)
                        >>      /* Use Read Write MTYPE instead of
                        default MTYPE */
                        >>      #define AMDGPU_VM_MTYPE_RW      
                         (5 << 5)
                        >>     +/* don't allocate MALL */
                        >>     +#define AMDGPU_VM_PAGE_NOALLOC    
                           (1 << 9)
                        >>
                        >>      struct drm_amdgpu_gem_va {
                        >>             /** GEM object handle */
                        >>     --
                        >>     2.25.1
                        >>
                        >





Reply via email to