On 2016-08-09 07:52, Christian König wrote:
From: Christian König <[email protected]>

We align to 64KB, but when userspace aligns even more we can easily use more.

Signed-off-by: Christian König <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index e6c030b..88f4109 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -817,13 +817,13 @@ static void amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
         * allocation size to the fragment size.
         */

-       /* SI and newer are optimized for 64KB */
-       uint64_t frag_flags = AMDGPU_PTE_FRAG(AMDGPU_LOG2_PAGES_PER_FRAG);
-       uint64_t frag_align = 1 << AMDGPU_LOG2_PAGES_PER_FRAG;
+       const uint64_t frag_align = 1 << AMDGPU_LOG2_PAGES_PER_FRAG;

        uint64_t frag_start = ALIGN(start, frag_align);
        uint64_t frag_end = end & ~(frag_align - 1);

+       uint32_t frag;
+
        /* system pages are non continuously */
        if (params->src || params->pages_addr || !(flags & AMDGPU_PTE_VALID) ||
            (frag_start >= frag_end)) {
@@ -832,6 +832,10 @@ static void amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
                return;
        }

+       /* use more than 64KB fragment size if possible */
+       frag = lower_32_bits(frag_start | frag_end);
+       frag = likely(frag) ? __ffs(frag) : 31;
+
        /* handle the 4K area at the beginning */
        if (start != frag_start) {
                amdgpu_vm_update_ptes(params, vm, start, frag_start,
@@ -841,7 +845,7 @@ static void amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,

        /* handle the area in the middle */
        amdgpu_vm_update_ptes(params, vm, frag_start, frag_end, dst,
-                             flags | frag_flags);
+                             flags | AMDGPU_PTE_FRAG(frag));

        /* handle the 4K area at the end */
        if (frag_end != end) {

Would this change not direct larger fragments away from the BigK TLB partition?

My understanding was that VM_L2_CNTL3.L2_CACHE_BIGK_FRAGMENT_SIZE is an exact match, not a minimum size, but I can't find any documentation on hand to confirm that.
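
To make the concern concrete, here is a rough userspace sketch of the fragment computation the patch introduces. It is not the kernel code: lowest_set_bit() is a hypothetical portable stand-in for __ffs(), the 4KB page size and LOG2_PAGES_PER_FRAG value of 4 are assumptions matching the "64KB" comments in the diff, and start/end are taken to be page numbers with end exclusive. It only shows which fragment values can reach AMDGPU_PTE_FRAG(); it says nothing about how the hardware treats them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4KB GPU pages and a 64KB base fragment, i.e. 2^4 pages. */
#define LOG2_PAGES_PER_FRAG 4

/* Hypothetical stand-in for the kernel's __ffs(): index of the lowest
 * set bit. Like __ffs(), it must not be called with x == 0. */
static unsigned lowest_set_bit(uint32_t x)
{
	unsigned n = 0;

	assert(x != 0);
	while (!(x & 1u)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Mirrors the arithmetic in the patch: returns log2 of the fragment size
 * in pages for the aligned middle of [start, end). The patch returns
 * early when frag_start >= frag_end; that check is omitted here. */
static unsigned frag_log2(uint64_t start, uint64_t end)
{
	const uint64_t frag_align = 1ull << LOG2_PAGES_PER_FRAG;
	uint64_t frag_start = (start + frag_align - 1) & ~(frag_align - 1);
	uint64_t frag_end = end & ~(frag_align - 1);
	/* Same effect as lower_32_bits(frag_start | frag_end). */
	uint32_t bits = (uint32_t)(frag_start | frag_end);

	return bits ? lowest_set_bit(bits) : 31;
}

/* Fragment size in bytes, assuming 4KB pages. */
static uint64_t frag_bytes(unsigned frag)
{
	return 4096ull << frag;
}
```

With these assumptions, a mapping covering pages 0x10 to 0x20 still gets fragment value 4 (64KB), while one covering pages 0x100 to 0x200 gets fragment value 8 (1MB). The second case is the one I am asking about, since that value no longer equals the 64KB setting.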

--
Jay Cornwall
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
