Dmitry reported syzcaller tripped a use-after-free in perf_release().
After much puzzlement Oleg spotted the below scenario:
Task1 Task2
fork()
perf_event_init_task()
/* ... */
goto bad_fork_$foo;
/* ... */
perf_event_free_task()
mutex_lock(ctx->lock)
perf_free_event(B)
perf_event_release_kernel(A)
mutex_lock(A->child_mutex)
list_for_each_entry(child, ...) {
/* child == B */
ctx = B->ctx;
get_ctx(ctx);
mutex_unlock(A->child_mutex);
mutex_lock(A->child_mutex)
list_del_init(B->child_list)
mutex_unlock(A->child_mutex)
/* ... */
mutex_unlock(ctx->lock);
put_ctx() /* >0 */
free_task();
mutex_lock(ctx->lock);
mutex_lock(A->child_mutex);
/* ... */
mutex_unlock(A->child_mutex);
mutex_unlock(ctx->lock)
put_ctx() /* 0 */
ctx->task && !TOMBSTONE
put_task_struct() /* UAF */
This patch closes the hole by making perf_event_free_task() destroy the
task <-> ctx relation such that perf_event_release_kernel() will no longer
observe the now dead task.
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Reported-by: Dmitry Vyukov <[email protected]>
Spotted-by: Oleg Nesterov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/20170314155949.GE32474@worktop
---
kernel/events/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10556,6 +10556,17 @@ void perf_event_free_task(struct task_st
continue;
mutex_lock(&ctx->mutex);
+ raw_spin_lock_irq(&ctx->lock);
+ /*
+ * Destroy the task <-> ctx relation and mark the context dead.
+ *
+ * This is important because even though the task hasn't been
+ * exposed yet the context has been (through child_list).
+ */
+ RCU_INIT_POINTER(task->perf_event_ctxp[ctxn], NULL);
+ WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
+ put_task_struct(task); /* cannot be last */
+ raw_spin_unlock_irq(&ctx->lock);
again:
list_for_each_entry_safe(event, tmp, &ctx->pinned_groups,
group_entry)