http://marc.info/?l=linux-kernel&m=123378709730700&w=2

List:       linux-kernel
Subject:    2.6.29-rc1: i915 lockdep warning
From:       "Brandeburg, Jesse" <[email protected]>
Date:       2009-01-13 23:17:06
Message-ID: [email protected]

[drm] Initialized i915 1.6.0 20080730 on minor 0

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.29-rc1 #1
-------------------------------------------------------
Xorg/2810 is trying to acquire lock:
 (&mm->mmap_sem){----}, at: [<c047d508>] might_fault+0x40/0x7c

but task is already holding lock:
 (&dev->struct_mutex){--..}, at: [<f7fd1aac>] i915_gem_execbuffer+0x124/0xa48 [i915]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&dev->struct_mutex){--..}:
       [<c04493d2>] __lock_acquire+0xfd7/0x1305
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<c0449749>] lock_acquire+0x49/0x61
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<c066fa32>] mutex_lock_nested+0xeb/0x260
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<f7fa9e53>] drm_vm_open+0x22/0x32 [drm]
       [<c04291bb>] dup_mm+0x26c/0x32a
       [<c0429c42>] copy_process+0x99f/0xfff
       [<c042a3b3>] do_fork+0x111/0x29c
       [<c045ca0d>] audit_filter_syscall+0xda/0xe7
       [<c043376a>] sys_rt_sigaction+0x66/0x79
       [<c04017be>] sys_clone+0x22/0x26
       [<c0402d51>] sysenter_do_call+0x12/0x31
       [<ffffffff>] 0xffffffff

-> #1 (&mm->mmap_sem/1){--..}:
       [<c04493d2>] __lock_acquire+0xfd7/0x1305
       [<c0429000>] dup_mm+0xb1/0x32a
       [<c0429000>] dup_mm+0xb1/0x32a
       [<c0449749>] lock_acquire+0x49/0x61
       [<c0429000>] dup_mm+0xb1/0x32a
       [<c043dd03>] down_write_nested+0x30/0x4a
       [<c0429000>] dup_mm+0xb1/0x32a
       [<c0429000>] dup_mm+0xb1/0x32a
       [<c0670e04>] _spin_unlock_irq+0x20/0x23
       [<c0447c80>] trace_hardirqs_on_caller+0x106/0x126
       [<c0429c42>] copy_process+0x99f/0xfff
       [<c04418ec>] getnstimeofday+0x51/0xd9
       [<c042a3b3>] do_fork+0x111/0x29c
       [<c044456a>] tick_program_event+0x1f/0x23
       [<c043dc02>] hrtimer_interrupt+0x100/0x110
       [<c0402e74>] restore_nocheck_notrace+0x0/0xe
       [<c04017be>] sys_clone+0x22/0x26
       [<c0402d51>] sysenter_do_call+0x12/0x31
       [<ffffffff>] 0xffffffff

-> #0 (&mm->mmap_sem){----}:
       [<c04490f5>] __lock_acquire+0xcfa/0x1305
       [<c0447adc>] mark_held_locks+0x50/0x66
       [<c0670e3b>] _spin_unlock_irqrestore+0x34/0x39
       [<c0447adc>] mark_held_locks+0x50/0x66
       [<c0449749>] lock_acquire+0x49/0x61
       [<c047d508>] might_fault+0x40/0x7c
       [<c047d525>] might_fault+0x5d/0x7c
       [<c047d508>] might_fault+0x40/0x7c
       [<c05105df>] copy_to_user+0x29/0xf8
       [<f7fd230c>] i915_gem_execbuffer+0x984/0xa48 [i915]
       [<c047d508>] might_fault+0x40/0x7c
       [<f7fa553e>] drm_ioctl+0x1a9/0x221 [drm]
       [<f7fd1988>] i915_gem_execbuffer+0x0/0xa48 [i915]
       [<c049cf82>] vfs_ioctl+0x49/0x5f
       [<c049d4e0>] do_vfs_ioctl+0x476/0x4b1
       [<c045c95e>] audit_filter_syscall+0x2b/0xe7
       [<c045ca0d>] audit_filter_syscall+0xda/0xe7
       [<c049d55c>] sys_ioctl+0x41/0x58
       [<c0402d51>] sysenter_do_call+0x12/0x31
       [<ffffffff>] 0xffffffff

other info that might help us debug this:

1 lock held by Xorg/2810:
 #0:  (&dev->struct_mutex){--..}, at: [<f7fd1aac>] i915_gem_execbuffer+0x124/0xa48 [i915]

stack backtrace:
Pid: 2810, comm: Xorg Not tainted 2.6.29-rc1 #1
Call Trace:
 [<c04480c9>] print_circular_bug_tail+0xa6/0xb0
 [<c04490f5>] __lock_acquire+0xcfa/0x1305
 [<c0447adc>] mark_held_locks+0x50/0x66
 [<c0670e3b>] _spin_unlock_irqrestore+0x34/0x39
 [<c0447adc>] mark_held_locks+0x50/0x66
 [<c0449749>] lock_acquire+0x49/0x61
 [<c047d508>] might_fault+0x40/0x7c
 [<c047d525>] might_fault+0x5d/0x7c
 [<c047d508>] might_fault+0x40/0x7c
 [<c05105df>] copy_to_user+0x29/0xf8
 [<f7fd230c>] i915_gem_execbuffer+0x984/0xa48 [i915]
 [<c047d508>] might_fault+0x40/0x7c
 [<f7fa553e>] drm_ioctl+0x1a9/0x221 [drm]
 [<f7fd1988>] i915_gem_execbuffer+0x0/0xa48 [i915]
 [<c049cf82>] vfs_ioctl+0x49/0x5f
 [<c049d4e0>] do_vfs_ioctl+0x476/0x4b1
 [<c045c95e>] audit_filter_syscall+0x2b/0xe7
 [<c045ca0d>] audit_filter_syscall+0xda/0xe7
 [<c049d55c>] sys_ioctl+0x41/0x58
 [<c0402d51>] sysenter_do_call+0x12/0x31

List:       linux-kernel
Subject:    Re: [Bug #12491] i915 lockdep warning
From:       Roland Dreier <[email protected]>
Date:       2009-02-04 22:37:34
Message-ID: [email protected]

 > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=12491
 > Subject		: i915 lockdep warning
 > Submitter	: Brandeburg, Jesse <[email protected]>
 > Date		: 2009-01-13 23:17 (23 days old)
 > References	: http://marc.info/?l=linux-kernel&m=123188898423532&w=4

Looking at the code, the issue seems to be that the DRM struct_mutex must
be taken inside mmap_sem (because struct_mutex is taken in drm_vm_open(),
which is called with mmap_sem already held), while i915_gem_execbuffer()
does a copy_to_user() while holding struct_mutex; if that copy faults,
the VM tries to acquire mmap_sem -- i.e., lockdep correctly identifies a
potential AB/BA deadlock.

I don't pretend to fully understand the DRM or GEM, but a possible fix
is below -- it would be worth testing and reviewing, and getting it into
2.6.29 if it is correct:

---
i915: Fix potential AB-BA deadlock in i915_gem_execbuffer()

Lockdep warns that i915_gem_execbuffer() can trigger a page fault (which
takes mmap_sem) while holding dev->struct_mutex, while drm_vm_open()
(which is called with mmap_sem already held) takes dev->struct_mutex.
So this is a potential AB-BA deadlock.

The way that i915_gem_execbuffer() triggers a page fault is by doing
copy_to_user() when returning new buffer offsets back to userspace;
however there is no reason to hold the struct_mutex when doing this
copy, since what is being copied is a private array anyway.  So we can
fix the potential deadlock (and get rid of the lockdep warning) by
simply moving the copy_to_user() outside of where struct_mutex is held.

This fixes <http://bugzilla.kernel.org/show_bug.cgi?id=12491>.

Reported-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Roland Dreier <[email protected]>
---
 drivers/gpu/drm/i915/i915_gem.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index debad5c..23aad8c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2610,15 +2610,6 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 
 	i915_verify_inactive(dev, __FILE__, __LINE__);
 
-	/* Copy the new buffer offsets back to the user's exec list. */
-	ret = copy_to_user((struct drm_i915_relocation_entry __user *)
-			   (uintptr_t) args->buffers_ptr,
-			   exec_list,
-			   sizeof(*exec_list) * args->buffer_count);
-	if (ret)
-		DRM_ERROR("failed to copy %d exec entries "
-			  "back to user (%d)\n",
-			   args->buffer_count, ret);
 err:
 	for (i = 0; i < pinned; i++)
 		i915_gem_object_unpin(object_list[i]);
@@ -2628,6 +2619,18 @@ err:
 
 	mutex_unlock(&dev->struct_mutex);
 
+	if (!ret) {
+		/* Copy the new buffer offsets back to the user's exec list. */
+		ret = copy_to_user((struct drm_i915_relocation_entry __user *)
+				   (uintptr_t) args->buffers_ptr,
+				   exec_list,
+				   sizeof(*exec_list) * args->buffer_count);
+		if (ret)
+			DRM_ERROR("failed to copy %d exec entries "
+				  "back to user (%d)\n",
+				  args->buffer_count, ret);
+	}
+
 pre_mutex_err:
 	drm_free(object_list, sizeof(*object_list) * args->buffer_count,
 		 DRM_MEM_DRIVER);
--
