On 16/03/2026 13:44, Christian König wrote:
Hi,

On 3/12/26 15:32, Tvrtko Ursulin wrote:

On 10/03/2026 19:13, Christian König wrote:
Instead of comming up with more sophisticated names for states a VM BO

s/comming/coming/, and maybe a comma after "in" in the row below.

can be in group them by the type of BO first and then by the state.

So we end with BO type kernel, shared_resv and individual_resv and then
states evicted, moved and idle.

Not much functional change, except that evicted_user is moved back
together with the other BOs again which makes the handling in
amdgpu_vm_validate() a bit more complex. Also fixes a problem with user
queues and amdgpu_vm_ready().

It would be good to describe the problem at least a little bit, especially if 
it was a significant reason for the patch.

Thanks for the comments, fixed in the next version.

-static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
+/* Eventually unlock the status list lock again */
+static void amdgpu_vm_bo_unlock_lists(struct amdgpu_vm_bo_base *vm_bo)
   {
-    spin_lock(&vm_bo->vm->invalidated_lock);
-    list_move(&vm_bo->vm_status, &vm_bo->vm->invalidated);
-    spin_unlock(&vm_bo->vm->invalidated_lock);
+    if (!amdgpu_vm_is_bo_always_valid(vm_bo->vm, vm_bo->bo))
+        spin_unlock(&vm_bo->vm->invalidated_lock);

Worth putting an assert that the vm is locked on the else path? Could be, given 
the new API (amdgpu_vm_bo_lock_lists) is a bit odd in that for always-valid BOs 
it expects the vm to be locked already, while for others it takes a different lock.

Good point, fixed as well.

@@ -412,14 +384,16 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
       amdgpu_vm_update_stats_locked(base, bo->tbo.resource, +1);
       spin_unlock(&vm->stats_lock);
   -    if (!amdgpu_vm_is_bo_always_valid(vm, bo))
+    if (!amdgpu_vm_is_bo_always_valid(vm, bo)) {
+        amdgpu_vm_bo_idle(base);

Hm on what list is it today?

None, the status member was just initialized with INIT_LIST_HEAD().

Since it doesn't have any mappings yet, putting it on the idle list sounds 
perfectly valid to me.

Sounds plausible. I asked because I detected an unexpected change.


   int amdgpu_vm_lock_done_list(struct amdgpu_vm *vm, struct drm_exec *exec,
                    unsigned int num_fences)
   {
-    struct list_head *prev = &vm->done;
+    struct list_head *prev = &vm->individual_resv.idle;

Should this access be under the lock?

No, prev is a pointer which is always valid; in this case it points to the 
list head.

Ah okay, I see now that the prev local is overloaded, which I initially missed.


prev->next can only be trusted while holding the spinlock. That is also 
documented by the comment below.

       struct amdgpu_bo_va *bo_va;
       struct amdgpu_bo *bo;
       int ret;
         /* We can only trust prev->next while holding the lock */
       spin_lock(&vm->invalidated_lock);
-    while (!list_is_head(prev->next, &vm->done)) {
+    while (!list_is_head(prev->next, &vm->individual_resv.idle)) {
           bo_va = list_entry(prev->next, typeof(*bo_va), base.vm_status);
             bo = bo_va->base.bo;
@@ -584,7 +558,6 @@ int amdgpu_vm_validate(struct amdgpu_device *adev, struct amdgpu_vm *vm,
   {
       uint64_t new_vm_generation = amdgpu_vm_generation(adev, vm);
       struct amdgpu_vm_bo_base *bo_base, *tmp;
-    struct amdgpu_bo *bo;
       int r;
         if (vm->generation != new_vm_generation) {
@@ -596,38 +569,52 @@ int amdgpu_vm_validate(struct amdgpu_device *adev, struct amdgpu_vm *vm,
               return r;
       }
   -    list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
-        bo = bo_base->bo;
-
-        r = validate(param, bo);
+    list_for_each_entry_safe(bo_base, tmp, &vm->kernel.evicted, vm_status) {
+        r = validate(param, bo_base->bo);
           if (r)
               return r;
   -        if (bo->tbo.type != ttm_bo_type_kernel) {
-            amdgpu_vm_bo_moved(bo_base);
-        } else {
-            vm->update_funcs->map_table(to_amdgpu_bo_vm(bo));
-            amdgpu_vm_bo_relocated(bo_base);
-        }
+        vm->update_funcs->map_table(to_amdgpu_bo_vm(bo_base->bo));
+        amdgpu_vm_bo_moved(bo_base);
       }
   -    if (ticket) {
-        list_for_each_entry_safe(bo_base, tmp, &vm->evicted_user,
-                     vm_status) {
-            bo = bo_base->bo;
-            dma_resv_assert_held(bo->tbo.base.resv);
+    amdgpu_vm_eviction_lock(vm);
+    vm->evicting = false;
+    amdgpu_vm_eviction_unlock(vm);

Is there a specific reason this block is right here and not at the end as today?

The evicting status reflects whether the page tables are at the location where 
they should be for an update (VRAM/GTT, accessible by the GPU).

The status of the always_valid/per VM BOs doesn't matter for that.

Got it. Could be worth putting a comment to that effect above the lock-evicting=false-unlock block.
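Perhaps something like this (wording is only a suggestion, based on your 
explanation above):

```c
/*
 * The page tables are now back at a GPU-accessible location (VRAM/GTT),
 * so eviction of the VM is finished regardless of the state of the
 * per-VM BOs handled below.
 */
```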

This has caused a few problems for userqueues and is the issue I described in 
the commit message.


   -            r = validate(param, bo);
-            if (r)
-                return r;
+    list_for_each_entry_safe(bo_base, tmp, &vm->shared_resv.evicted,
+                 vm_status) {
+        r = validate(param, bo_base->bo);
+        if (r)
+            return r;
   -            amdgpu_vm_bo_invalidated(bo_base);
-        }
+        amdgpu_vm_bo_moved(bo_base);
       }
   -    amdgpu_vm_eviction_lock(vm);
-    vm->evicting = false;
-    amdgpu_vm_eviction_unlock(vm);
+    if (!ticket)
+        return 0;
+
+    spin_lock(&vm->invalidated_lock);
+    list_for_each_entry(bo_base, &vm->individual_resv.evicted, vm_status) {
+        struct amdgpu_bo *bo = bo_base->bo;
+
+        if (dma_resv_locking_ctx(bo->tbo.base.resv) != ticket)
+            continue;

What is this for?

Only certain elements on the list are locked; we skip the ones which aren't.

Also is probably worth a comment.

+
+        spin_unlock(&vm->invalidated_lock);
+
+        r = validate(param, bo);
+        if (r)
+            return r;
+
+        /* need to grab the invalidated lock to trust prev here */
+        spin_lock(&vm->invalidated_lock);
+        tmp = list_entry(bo_base->vm_status.prev, typeof(*tmp),
+                 vm_status);

Why is this safe? Lock was dropped while the current element was left in the 
list so anything could have happened. Even if current element was unlinked 
before dropping the lock, I don't see how it is safe to assume the previous 
element is still valid. Does it rely on dma-resve being held over the whole 
function? The current method of restarting from the head is certainly easier to 
understand.

The whole list is protected by the spinlock, but individual elements can only 
move while holding their resv lock.

What we do now is walk the list until we find an entry whose resv lock is 
individually held, so that we know it is safe to drop the spinlock.

We then validate the entry we found and restart from the entry before it.

Going over the list again from the head until we can't find any more entries 
is probably easier to understand, but also less efficient.

If prev wasn't locked with the same resv (so it would be skipped), couldn't it 
have been unlinked and freed in the meantime?


   -    seq_puts(m, "\tRelocated BOs:\n");
-    list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
-        if (!bo_va->base.bo)
-            continue;
-        total_relocated += amdgpu_bo_print_info(id++, bo_va->base.bo, m);
+        amdgpu_bo_print_info(id++, base->bo, m);

Probably just thinking out loud - given the format of the debugfs file is 
changing anyway, and that this id is both unstable cat-to-cat and has no 
relation to the user handle output by the only other caller (so could be 
confusing/misleading), I wonder if it is even worth keeping. For example you 
could pass zero, and optionally modify amdgpu_bo_print_info to skip it if the 
handle is zero.

Yeah I was thinking the same thing. The id is really completely pointless.

But I think that is for a different patch.

Yep.

        *
        * Lists are protected by the invalidated_lock.
        */
       spinlock_t        invalidated_lock;
   -    /* BOs for user mode queues that need a validation */
-    struct list_head    evicted_user;
-
-    /* regular invalidated BOs, but not yet updated in the PT */
-    struct list_head    invalidated;
-
-    /* BOs which are invalidated, has been updated in the PTs */
-    struct list_head        done;
+    /* Userspace BOs with individual resv object */
+    struct amdgpu_vm_bo_status    individual_resv;

Are all state transitions valid for all lists? If not it would be good to put 
that info in the respective comments.

Yes they are.

Okay. I think I had a specific reason why I asked this but it evaporated from my head since Thursday. Never mind.

Btw, what is the difference between vm_bo->moved == true and actually being on the moved list, or not?

Regards,

Tvrtko


Thanks for the review,
Christian.


         /*
        * This list contains amdgpu_bo_va_mapping objects which have been freed

Regards,

Tvrtko

