Re: [Xen-devel] balloon_mutex lockdep complaint at HVM domain destroy

2016-05-26 Thread Ed Swierk
On Wed, May 25, 2016 at 9:58 AM, David Vrabel wrote:
> This occurs in dom0?  Or the guest that's being destroyed?

The lockdep warning comes from dom0 when the HVM guest is being destroyed.

> It's a bug but...
>
>> ==
>> [ INFO: RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected ]
>> 4.4.11-grsec #1 Not tainted
>   
> ...this isn't a vanilla kernel?  Can you try vanilla 4.6?

I tried vanilla 4.4.11 and got the same result. I'm having trouble
booting 4.6.0 at all; it must be another regression in the early Xen
boot code.

> Because:
>
>>IN-RECLAIM_FS-W at:
>>[<__lock_acquire at lockdep.c:2839>] 810becc5
>>[] 810c0ac9
>>[] 816d1b4c
>>[] 8143c3d4
>>[] 8143c450
>>[<__mmu_notifier_invalidate_page at mmu_notifier.c:183>] 8119de42
>>[] 811840c2
>>[] 81185051
>>[] 81185497
>>[] 811599b7
>>[] 8115a489
>>[] 8115af3a
>>[] 8115b1bb
>>[] 8115c1e4
>>[] 8108eccc
>>[] 816d706e
>
> We should not be reclaiming pages from a gntdev VMA since it's special
> (marked as VM_IO).

Can you suggest any printks for me to add that might help isolate the issue?
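
For instance, would something along these lines be a reasonable starting
point? (A rough sketch only; I'm guessing the right spot is gntdev's
mmu-notifier callbacks, mn_invl_page() and mn_invl_range_start() in the
4.4 tree, based on my reading of the stripped traces.)

	/* Hypothetical diagnostic: flag the first time the gntdev
	 * notifier is reached from reclaim context and dump the stack.
	 */
	WARN_ONCE(current->flags & (PF_MEMALLOC | PF_KSWAPD),
		  "gntdev: mmu notifier invoked from reclaim context\n");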

--Ed

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] balloon_mutex lockdep complaint at HVM domain destroy

2016-05-25 Thread David Vrabel
On 25/05/16 15:30, Ed Swierk wrote:
> The following lockdep dump occurs whenever I destroy an HVM domain, on
> Linux 4.4 Dom0 with CONFIG_XEN_BALLOON=n on recent stable Xen 4.5.

This occurs in dom0?  Or the guest that's being destroyed?

> Any clues whether this is a real potential deadlock, or how to silence
> it if not?

It's a bug but...

> ==
> [ INFO: RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected ]
> 4.4.11-grsec #1 Not tainted
  
...this isn't a vanilla kernel?  Can you try vanilla 4.6?

Because:

>IN-RECLAIM_FS-W at:
>[<__lock_acquire at lockdep.c:2839>] 810becc5
>[] 810c0ac9
>[] 816d1b4c
>[] 8143c3d4
>[] 8143c450
>[<__mmu_notifier_invalidate_page at mmu_notifier.c:183>] 8119de42
>[] 811840c2
>[] 81185051
>[] 81185497
>[] 811599b7
>[] 8115a489
>[] 8115af3a
>[] 8115b1bb
>[] 8115c1e4
>[] 8108eccc
>[] 816d706e

We should not be reclaiming pages from a gntdev VMA since it's special
(marked as VM_IO).
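
That is, the expectation (from memory, so worth double-checking what
gntdev_mmap() actually sets in the 4.4 tree) is that the mmap handler
marks the VMA as a special device mapping, roughly:

	/* Illustrative only: the usual flags for a device mapping that
	 * page reclaim is expected to leave alone.
	 */
	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;

which is why reclaim should not be finding pages to unmap there in the
first place.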

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] balloon_mutex lockdep complaint at HVM domain destroy

2016-05-25 Thread Ed Swierk
The following lockdep dump occurs whenever I destroy an HVM domain, on
Linux 4.4 Dom0 with CONFIG_XEN_BALLOON=n on recent stable Xen 4.5.

Any clues whether this is a real potential deadlock, or how to silence
it if not?
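
As far as I can tell, the inversion being reported boils down to
something like the following (a sketch only; the symbol names are
stripped from the trace below, so the exact call sites, and the
priv->lock naming for gntdev's per-file lock, are my guesses):

	/* Allocation side (RECLAIM_FS-unsafe): pages are allocated with
	 * a reclaim-capable GFP mask while balloon_mutex is held, so
	 * direct reclaim can run under balloon_mutex.
	 */
	mutex_lock(&balloon_mutex);
	page = alloc_page(GFP_KERNEL);	/* may enter direct reclaim */
	mutex_unlock(&balloon_mutex);

	/* Teardown side (at domain destroy): gntdev tears down grant
	 * mappings under its per-file lock and then frees the ballooned
	 * pages, which takes balloon_mutex, creating the new
	 * priv->lock -> balloon_mutex dependency.
	 */
	mutex_lock(&priv->lock);
	free_xenballooned_pages(nr_pages, pages);  /* takes balloon_mutex */
	mutex_unlock(&priv->lock);

	/* Reclaim side (RECLAIM_FS-safe): the rmap walk during reclaim
	 * invalidates a page in the gntdev VMA and the mmu notifier
	 * callback takes priv->lock, closing the cycle.
	 */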

==
[ INFO: RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected ]
4.4.11-grsec #1 Not tainted
--
qemu-system-i38/3338 [HC0[0]:SC0[0]:HE1:SE1] is trying to acquire:
 (balloon_mutex){+.+.+.}, at: [] 81430ac3

and this task is already holding:
 (>lock){+.+.-.}, at: [] 8143c77f
which would create a new lock dependency:
 (>lock){+.+.-.} -> (balloon_mutex){+.+.+.}

but this new dependency connects a RECLAIM_FS-irq-safe lock:
 (>lock){+.+.-.}
... which became RECLAIM_FS-irq-safe at:
  [<__lock_acquire at lockdep.c:2839>] 810becc5
  [] 810c0ac9
  [] 816d1b4c
  [] 8143c3d4
  [] 8143c450
  [<__mmu_notifier_invalidate_page at mmu_notifier.c:183>] 8119de42
  [] 811840c2
  [] 81185051
  [] 81185497
  [] 811599b7
  [] 8115a489
  [] 8115af3a
  [] 8115b1bb
  [] 8115c1e4
  [] 8108eccc
  [] 816d706e

to a RECLAIM_FS-irq-unsafe lock:
 (balloon_mutex){+.+.+.}
... which became RECLAIM_FS-irq-unsafe at:
...  [] 810bdd69
  [] 810c12f9
  [<__alloc_pages_nodemask at page_alloc.c:3248>] 8114e0d1
  [] 81199b36
  [] 8143030e
  [] 81430c94
  [] 8142f362
  [] 8143d208
  [] 811da630
  [] 811daa64
  [] 816d6cba

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(balloon_mutex);
                               local_irq_disable();
                               lock(>lock);
                               lock(balloon_mutex);
  <Interrupt>
    lock(>lock);

 *** DEADLOCK ***

1 lock held by qemu-system-i38/3338:
 #0:  (>lock){+.+.-.}, at: [] 8143c77f

the dependencies between RECLAIM_FS-irq-safe lock and the holding lock:
-> (>lock){+.+.-.} ops: 8996 {
   HARDIRQ-ON-W at:
[<__lock_acquire at lockdep.c:2818>] 810bec71
[] 810c0ac9
[] 816d1b4c
[] 8143c3d4
[<__mmu_notifier_invalidate_range_start at mmu_notifier.c:197>] 8119d72a
[] 81172051
[] 81173e55
[] 81175f26
[<__do_page_fault at fault.c:1491>] 8105676f
[] 81056a49
[] 816d87e8
[] 811c4074
[] 816d6cba
   SOFTIRQ-ON-W at:
[<__lock_acquire at lockdep.c:2822>] 810bec9e
[] 810c0ac9
[] 816d1b4c
[] 8143c3d4
[<__mmu_notifier_invalidate_range_start at mmu_notifier.c:197>] 8119d72a
[] 81172051
[] 81173e55
[] 81175f26
[<__do_page_fault at fault.c:1491>] 8105676f
[] 81056a49
[] 816d87e8
[] 811c4074
[] 816d6cba
   IN-RECLAIM_FS-W at:
   [<__lock_acquire at lockdep.c:2839>] 810becc5
   [] 810c0ac9
   [] 816d1b4c
   [] 8143c3d4
   [] 8143c450
   [<__mmu_notifier_invalidate_page at mmu_notifier.c:183>] 8119de42
   [] 811840c2
   [] 81185051
   [] 81185497
   [] 811599b7
   [] 8115a489
   [] 8115af3a
   [] 8115b1bb
   [] 8115c1e4
   [] 8108eccc
   [] 816d706e
   INITIAL USE at:
   [<__lock_acquire at lockdep.c:3171>] 810be85c
   [] 810c0ac9
   [] 816d1b4c
   [] 8143c3d4
   [<__mmu_notifier_invalidate_range_start at mmu_notifier.c:197>] 8119d72a
   [] 81172051
   [] 81173e55
   [] 81175f26
   [<__do_page_fault at fault.c:1491>] 8105676f
   [] 81056a49
   [] 816d87e8
   [] 811c4074