[Q] How does Linux kernel lockdep record lock-class dependencies?

2018-03-15 Thread Du, Changbin

Hello everyone,
I got the warning below, which reports an AB-BA deadlock. But I don't understand
how the 'existing dependency' happened.

It looks like this: kvm_read_guest() held mmap_sem, then reading userspace
memory (which was not yet faulted in) invoked page_fault(), and then
i915_gem_fault() tried to take struct_mutex.

But this sequence cannot have happened. Otherwise a double lock would already
have occurred, since intel_vgpu_create_workload() already holds struct_mutex:

  struct_mutex -> mmap_sem -> struct_mutex

So how could lockdep find such 'existing dependency'? Thanks!

[  163.179109] ==
[  163.185306] WARNING: possible circular locking dependency detected
[  163.191504] 4.16.0-rc5+ #44 Tainted: G U
[  163.196655] --
[  163.202854] qemu-system-x86/4514 is trying to acquire lock:
[  163.208443]  (>mmap_sem){}, at: [] __might_fault+0x36/0x80
[  163.216230]
   but task is already holding lock:
[  163.222090]  (>struct_mutex){+.+.}, at: [] copy_gma_to_hva+0xe5/0x140 [i915]
[  163.231205]
   which lock already depends on the new lock.

[  163.239421]
   the existing dependency chain (in reverse order) is:
[  163.246925]
   -> #1 (>struct_mutex){+.+.}:
[  163.252792]        i915_mutex_lock_interruptible+0x66/0x170 [i915]
[  163.259005]        i915_gem_fault+0x1e0/0x630 [i915]
[  163.263985]        __do_fault+0x19/0xed
[  163.267830]        __handle_mm_fault+0x9fa/0x1140
[  163.272550]        handle_mm_fault+0x1a7/0x390
[  163.277006]        __do_page_fault+0x286/0x530
[  163.281462]        page_fault+0x45/0x50
[  163.285307]
   -> #0 (>mmap_sem){}:
[  163.290722]        __might_fault+0x60/0x80
[  163.294839]        __kvm_read_guest_page+0x3d/0x80 [kvm]
[  163.300173]        kvm_read_guest+0x47/0x80 [kvm]
[  163.304891]        kvmgt_rw_gpa+0x9d/0x110 [kvmgt]
[  163.309714]        intel_gvt_scan_and_shadow_workload+0x1be/0x480 [i915]
[  163.316448]        intel_vgpu_create_workload+0x3d9/0x550 [i915]
[  163.322488]        intel_vgpu_submit_execlist+0xc0/0x2a0 [i915]
[  163.328440]        elsp_mmio_write+0xcb/0x140 [i915]
[  163.333448]        intel_vgpu_mmio_reg_rw+0x250/0x4f0 [i915]
[  163.339138]        intel_vgpu_emulate_mmio_write+0xaa/0x240 [i915]
[  163.345337]        intel_vgpu_rw+0x200/0x250 [kvmgt]
[  163.350319]        intel_vgpu_write+0x164/0x1f0 [kvmgt]
[  163.38]        __vfs_write+0x33/0x170
[  163.359580]        vfs_write+0xc5/0x1c0
[  163.363427]        SyS_pwrite64+0x90/0xb0
[  163.367447]        do_syscall_64+0x70/0x1c0
[  163.371642]        entry_SYSCALL_64_after_hwframe+0x42/0xb7
[  163.377230]
   other info that might help us debug this:

[  163.385258]  Possible unsafe locking scenario:

[  163.391196]        CPU0                    CPU1
[  163.395737]        ----                    ----
[  163.400280]   lock(>struct_mutex);
[  163.404125]                                lock(>mmap_sem);
[  163.410062]                                lock(>struct_mutex);
[  163.416436]   lock(>mmap_sem);
[  163.419846]
*** DEADLOCK ***

[  163.425780] 3 locks held by qemu-system-x86/4514:
[  163.430496]  #0:  (>lock){+.+.}, at: [] intel_vgpu_emulate_mmio_write+0x64/0x240 [i915]
[  163.440544]  #1:  (>struct_mutex){+.+.}, at: [] copy_gma_to_hva+0xe5/0x140 [i915]
[  163.450068]  #2:  (>srcu){}, at: [] kvmgt_rw_gpa+0x4c/0x110 [kvmgt]
[  163.458721]
   stack backtrace:
[  163.463097] CPU: 0 PID: 4514 Comm: qemu-system-x86 Tainted: G U   4.16.0-rc5+ #44
[  163.471663] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[  163.479093] Call Trace:
[  163.481547]  dump_stack+0x7c/0xbe
[  163.484872]  print_circular_bug.isra.33+0x21b/0x228
[  163.489765]  __lock_acquire+0xf7d/0x1470
[  163.493700]  ? lock_acquire+0xec/0x1e0
[  163.497459]  lock_acquire+0xec/0x1e0
[  163.501046]  ? __might_fault+0x36/0x80
[  163.504805]  __might_fault+0x60/0x80
[  163.508389]  ? __might_fault+0x36/0x80
[  163.512155]  __kvm_read_guest_page+0x3d/0x80 [kvm]
[  163.516966]  kvm_read_guest+0x47/0x80 [kvm]
[  163.521161]  kvmgt_rw_gpa+0x9d/0x110 [kvmgt]
[  163.525459]  intel_gvt_scan_and_shadow_workload+0x1be/0x480 [i915]
[  163.531675]  intel_vgpu_create_workload+0x3d9/0x550 [i915]
[  163.537192]  intel_vgpu_submit_execlist+0xc0/0x2a0 [i915]
[  163.542621]  elsp_mmio_write+0xcb/0x140 [i915]
[  163.547093]  intel_vgpu_mmio_reg_rw+0x250/0x4f0 [i915]
[  163.552261]  intel_vgpu_emulate_mmio_write+0xaa/0x240 [i915]
[  163.557938]  intel_vgpu_rw+0x200/0x250 [kvmgt]
[  163.562396]  intel_vgpu_write+0x164/0x1f0 [kvmgt]
[  163.567114]  __vfs_write+0x33/0x170
[  163.570614]  ? common_file_perm+0x68/0x250
[  163.574723]  ? security_file_permission+0x36/0xb0
[  163.579440]  
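
A note on the question above. Lockdep's documented design is to track ordering
between lock classes rather than lock instances, and to record a dependency
whenever any task is seen acquiring one class while holding another. The two
halves of a reported cycle therefore do not have to come from the same task or
the same moment. The Python sketch below is only a toy model of that
bookkeeping, not the kernel's implementation: ToyLockdep, its methods, and the
task names are hypothetical, and the lock-class names are shortened to mmap_sem
and struct_mutex. In the log the mmap_sem acquisition is reported from
__might_fault, so it may be an annotation of a possible fault rather than a
fault that actually occurred; the model treats it as an ordinary acquisition.

```python
from collections import defaultdict


class ToyLockdep:
    """Toy model of class-level lock-order tracking (not the kernel's code)."""

    def __init__(self):
        # edges[a] = set of classes that were acquired while class a was held
        self.edges = defaultdict(set)
        # held[task] = list of lock classes the task currently holds
        self.held = defaultdict(list)

    def _reachable(self, src, dst):
        """Depth-first search: is dst reachable from src along recorded edges?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, task, lock_class):
        """Record task taking lock_class; return a warning string or None."""
        warning = None
        for prev in self.held[task]:
            # The new edge prev -> lock_class closes a cycle if prev is already
            # reachable from lock_class through earlier observations.
            if self._reachable(lock_class, prev):
                warning = f"possible circular dependency: {prev} -> {lock_class}"
            self.edges[prev].add(lock_class)
        self.held[task].append(lock_class)
        return warning

    def release(self, task, lock_class):
        self.held[task].remove(lock_class)


dep = ToyLockdep()

# Earlier, in some other task: a page fault on a GEM mapping takes
# struct_mutex while mmap_sem is held, recording mmap_sem -> struct_mutex.
dep.acquire("task_a", "mmap_sem")
dep.acquire("task_a", "struct_mutex")
dep.release("task_a", "struct_mutex")
dep.release("task_a", "mmap_sem")

# Later: the workload path holds struct_mutex and then may touch user memory.
dep.acquire("qemu", "struct_mutex")
print(dep.acquire("qemu", "mmap_sem"))
```

In this model the warning fires on the last call even though no single task
ever held the locks in the order struct_mutex -> mmap_sem -> struct_mutex: the
first edge was recorded by task_a and stays in the graph after the locks are
released.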
