On 20/05/2026 11:58, Francois Dugast wrote:
On Mon, May 18, 2026 at 04:55:12PM +0100, Matthew Auld wrote:
On 18/05/2026 15:14, Francois Dugast wrote:
When split_block() fails it returns before calling mark_split(), leaving
the block in the FREE state and still linked in the rbtree.  The four
err_undo paths then call __gpu_buddy_free() without first removing the
block from the tree, which leads to two distinct bugs:

   - If the buddy is also free, __gpu_buddy_free() merges the two siblings
     by calling gpu_block_free(mm, block) while block->rb is still linked
     in the tree.  Any subsequent rbtree traversal will follow the now-
     dangling pointer, causing a use-after-free.

   - In alloc_from_freetree(), where there is no buddy guard,
     __gpu_buddy_free() always reaches mark_free() -> rbtree_insert() with
     block still in the tree, corrupting the rbtree.

The same pattern is already used correctly in __force_merge(): call
rbtree_remove() to unlink the block before handing it to
__gpu_buddy_free().  Apply the same fix to all four err_undo sites.
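
For illustration only, the second bug and the fix can be modelled with a toy
free tree that merely tracks whether a block is linked. Every name below
(toy_block, toy_mm, toy_tree_insert(), toy_run() and so on) is hypothetical
and not part of drivers/gpu/buddy.c. The sketch assumes the buddy is not
free, so the free path goes straight to the insert, and it counts the double
insert instead of actually corrupting anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buddy block, not the real kernel struct. */
struct toy_block {
	bool linked;	/* true while the block sits in the free tree */
	bool is_free;	/* models the FREE state */
};

struct toy_mm {
	size_t nr_linked;	/* blocks currently in the free tree */
	size_t nr_corrupt;	/* double inserts, standing in for rbtree corruption */
};

static void toy_tree_insert(struct toy_mm *mm, struct toy_block *b)
{
	if (b->linked) {
		/* Inserting an already linked node would corrupt a real rbtree. */
		mm->nr_corrupt++;
		return;
	}
	b->linked = true;
	mm->nr_linked++;
}

static void toy_tree_remove(struct toy_mm *mm, struct toy_block *b)
{
	assert(b->linked);
	b->linked = false;
	mm->nr_linked--;
}

/* Models the free path when the buddy is not free: mark free, then insert. */
static void toy_free(struct toy_mm *mm, struct toy_block *b)
{
	b->is_free = true;
	toy_tree_insert(mm, b);
}

/* Returns the number of corruption events seen on the error path. */
static size_t toy_run(bool unlink_first)
{
	struct toy_mm mm = { 0 };
	struct toy_block b = { .linked = false, .is_free = true };

	/* The split "failed" early: the block is still FREE and still linked. */
	toy_tree_insert(&mm, &b);

	if (unlink_first)
		toy_tree_remove(&mm, &b);	/* the pattern this patch applies */
	toy_free(&mm, &b);

	return mm.nr_corrupt;
}
```

In this model toy_run(false) records one double insert, while toy_run(true)
records none and leaves the block linked exactly once.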

Reported-by: Sashiko <[email protected]>
Signed-off-by: Francois Dugast <[email protected]>
Assisted-by: GitHub Copilot:claude-sonnet-4.6
---
   drivers/gpu/buddy.c | 16 ++++++++++++----
   1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index eb1457376307..dac2027bb64a 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -737,8 +737,10 @@ __alloc_range_bias(struct gpu_buddy *mm,
        buddy = __get_buddy(block);
        if (buddy &&
            (gpu_buddy_block_is_free(block) &&
-            gpu_buddy_block_is_free(buddy)))
+            gpu_buddy_block_is_free(buddy))) {
+               rbtree_remove(mm, block);
                __gpu_buddy_free(mm, block, false);
+       }
        return ERR_PTR(err);
   }
@@ -847,8 +849,10 @@ alloc_from_freetree(struct gpu_buddy *mm,
        return block;
   err_undo:
-       if (tmp != order)
+       if (tmp != order) {
+               rbtree_remove(mm, block);

Actually, I think this needs the same check as elsewhere? Say we fail
on the first split? Nothing was actually split, right?

I think this is unnecessary: for block, this is tested above with
BUG_ON(!gpu_buddy_block_is_free(block)). If split_block() fails, it does
so before mark_split(), so block remains free. If buddy is not free, the
merge loop in __gpu_buddy_free() is skipped but mark_free() is still
called, so the net effect is a remove followed by a re-insert.

Also, the checks are added with patch #3 and the introduction of
__gpu_buddy_undo_splits().
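
As a rough sketch of the two cases discussed above, the following toy model
walks through what happens after the block has been unlinked. It assumes the
free path merges upwards while the buddy is free and then inserts whatever
block it ends on, which is only a simplified reading of this thread. All
names (toy_block, toy_free(), toy_undo_is_noop()) are hypothetical and not
taken from drivers/gpu/buddy.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buddy block, not the real kernel struct. */
struct toy_block {
	struct toy_block *parent;
	struct toy_block *buddy;
	bool is_free;
	bool in_tree;
};

static void toy_insert(struct toy_block *b)
{
	assert(!b->in_tree);	/* a double insert would corrupt a real rbtree */
	b->in_tree = true;
}

static void toy_remove(struct toy_block *b)
{
	assert(b->in_tree);
	b->in_tree = false;
}

/* Assumed shape of the free path: merge while the buddy is free, then insert. */
static struct toy_block *toy_free(struct toy_block *block)
{
	while (block->parent) {
		struct toy_block *buddy = block->buddy;

		if (!buddy->is_free)
			break;			/* merge loop skipped */

		toy_remove(buddy);
		/* Both children go away; this stands in for freeing them. */
		buddy->is_free = false;
		block->is_free = false;
		block = block->parent;
	}

	block->is_free = true;			/* models mark_free() */
	toy_insert(block);
	return block;
}

/* True when the undo just puts the same block back into the tree. */
static bool toy_undo_is_noop(bool buddy_free)
{
	struct toy_block parent = { 0 };
	struct toy_block block = { .parent = &parent, .is_free = true,
				   .in_tree = true };
	struct toy_block buddy = { .parent = &parent, .buddy = &block,
				   .is_free = buddy_free, .in_tree = buddy_free };

	block.buddy = &buddy;

	toy_remove(&block);			/* unlink first, as in the patch */
	return toy_free(&block) == &block;
}
```

With a non-free buddy the model ends with the same block back in the tree
(remove + re-insert). With a free buddy it ends on the parent instead, and
neither insert trips the double-insert assertion because the block was
unlinked first.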

Right, makes sense.


Francois


                __gpu_buddy_free(mm, block, false);
+       }
        return ERR_PTR(err);
   }
@@ -968,8 +972,10 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
        buddy = __get_buddy(block);
        if (buddy &&
            (gpu_buddy_block_is_free(block) &&
-            gpu_buddy_block_is_free(buddy)))
+            gpu_buddy_block_is_free(buddy))) {
+               rbtree_remove(mm, block);
                __gpu_buddy_free(mm, block, false);
+       }
        return ERR_PTR(err);
   }
@@ -1054,8 +1060,10 @@ static int __alloc_range(struct gpu_buddy *mm,
        buddy = __get_buddy(block);
        if (buddy &&
            (gpu_buddy_block_is_free(block) &&
-            gpu_buddy_block_is_free(buddy)))
+            gpu_buddy_block_is_free(buddy))) {
+               rbtree_remove(mm, block);
                __gpu_buddy_free(mm, block, false);
+       }
   err_free:
        if (err == -ENOSPC && total_allocated_on_err) {

