On 20/05/2026 11:58, Francois Dugast wrote:
On Mon, May 18, 2026 at 04:55:12PM +0100, Matthew Auld wrote:
On 18/05/2026 15:14, Francois Dugast wrote:
When split_block() fails it returns before calling mark_split(), leaving
the block in the FREE state and still linked in the rbtree. The four
err_undo paths then call __gpu_buddy_free() without first removing the
block from the tree, which leads to two distinct bugs:
- If the buddy is also free, __gpu_buddy_free() merges the two siblings
by calling gpu_block_free(mm, block) while block->rb is still linked
in the tree. Any subsequent rbtree traversal will follow the now-
dangling pointer, causing a use-after-free.
- In alloc_from_freetree(), where there is no buddy guard,
__gpu_buddy_free() always reaches mark_free() -> rbtree_insert() with
block still in the tree, corrupting the rbtree.
__force_merge() already handles this correctly: it calls rbtree_remove()
to unlink the block before handing it to __gpu_buddy_free(). Apply the
same fix to all four err_undo sites.
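
For illustration only, the double-insert hazard and the remove-first fix
can be sketched with a toy model. This is not the buddy.c code: a circular
doubly linked list stands in for the rbtree, and every toy_* name below is
hypothetical. The toy insert refuses a block that is still linked, where
the real rbtree insert would silently corrupt the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a free block; NOT the real struct gpu_buddy_block. */
struct toy_block {
	struct toy_block *prev, *next;
	bool linked;	/* true while the block sits in the free structure */
};

/* Toy stand-in for the free tree: a circular doubly linked list. */
struct toy_free_list {
	struct toy_block head;
	size_t count;
};

static void toy_init(struct toy_free_list *fl)
{
	fl->head.prev = fl->head.next = &fl->head;
	fl->count = 0;
}

/* Refuses (returns false) rather than corrupting the list on a double insert. */
static bool toy_insert(struct toy_free_list *fl, struct toy_block *b)
{
	if (b->linked)
		return false;
	b->next = fl->head.next;
	b->prev = &fl->head;
	fl->head.next->prev = b;
	fl->head.next = b;
	b->linked = true;
	fl->count++;
	return true;
}

/* Unlinks a block that is currently in the list. */
static void toy_remove(struct toy_free_list *fl, struct toy_block *b)
{
	assert(b->linked);
	b->prev->next = b->next;
	b->next->prev = b->prev;
	b->prev = b->next = NULL;
	b->linked = false;
	fl->count--;
}

/* Old err_undo shape: free path re-inserts a block that is still linked. */
static bool toy_undo_without_remove(struct toy_free_list *fl, struct toy_block *b)
{
	return toy_insert(fl, b);
}

/* Fixed err_undo shape: unlink first, then let the free path re-insert. */
static bool toy_undo_with_remove(struct toy_free_list *fl, struct toy_block *b)
{
	toy_remove(fl, b);
	return toy_insert(fl, b);
}

/* Walks the list and counts the nodes actually reachable from the head. */
static size_t toy_walk(const struct toy_free_list *fl)
{
	size_t n = 0;
	const struct toy_block *it;

	for (it = fl->head.next; it != &fl->head; it = it->next)
		n++;
	return n;
}
```

In this sketch toy_undo_without_remove() fails on a block that is already
linked, while toy_undo_with_remove() succeeds and leaves exactly one
reachable node, which mirrors the remove + re-insert sequence the patch
adds at each err_undo site.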
Reported-by: Sashiko <[email protected]>
Signed-off-by: Francois Dugast <[email protected]>
Assisted-by: GitHub Copilot:claude-sonnet-4.6
---
drivers/gpu/buddy.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index eb1457376307..dac2027bb64a 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -737,8 +737,10 @@ __alloc_range_bias(struct gpu_buddy *mm,
buddy = __get_buddy(block);
if (buddy &&
(gpu_buddy_block_is_free(block) &&
- gpu_buddy_block_is_free(buddy)))
+ gpu_buddy_block_is_free(buddy))) {
+ rbtree_remove(mm, block);
__gpu_buddy_free(mm, block, false);
+ }
return ERR_PTR(err);
}
@@ -847,8 +849,10 @@ alloc_from_freetree(struct gpu_buddy *mm,
return block;
err_undo:
- if (tmp != order)
+ if (tmp != order) {
+ rbtree_remove(mm, block);
Actually, I think this needs the same check as elsewhere? Say we fail
on the first split: nothing was actually split, right?
I think this is unnecessary: block is already checked above with
BUG_ON(!gpu_buddy_block_is_free(block)). If split_block() fails, it fails
before mark_split(), so block remains free. If buddy is not free, the
merge loop in __gpu_buddy_free() is skipped but mark_free() is still
called, so we end up doing a remove + re-insert.
Also, the checks are added in patch #3, together with the introduction
of __gpu_buddy_undo_splits().
Right, makes sense.
Francois
__gpu_buddy_free(mm, block, false);
+ }
return ERR_PTR(err);
}
@@ -968,8 +972,10 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
buddy = __get_buddy(block);
if (buddy &&
(gpu_buddy_block_is_free(block) &&
- gpu_buddy_block_is_free(buddy)))
+ gpu_buddy_block_is_free(buddy))) {
+ rbtree_remove(mm, block);
__gpu_buddy_free(mm, block, false);
+ }
return ERR_PTR(err);
}
@@ -1054,8 +1060,10 @@ static int __alloc_range(struct gpu_buddy *mm,
buddy = __get_buddy(block);
if (buddy &&
(gpu_buddy_block_is_free(block) &&
- gpu_buddy_block_is_free(buddy)))
+ gpu_buddy_block_is_free(buddy))) {
+ rbtree_remove(mm, block);
__gpu_buddy_free(mm, block, false);
+ }
err_free:
if (err == -ENOSPC && total_allocated_on_err) {