Re: [vma caching] BUG: unable to handle kernel paging request at ffff880008142f40
Hi Davidlohr, On Mon, Mar 10, 2014 at 02:08:08PM -0700, Davidlohr Bueso wrote: > On Mon, 2014-03-10 at 10:43 +0800, Fengguang Wu wrote: > > Hi Davidlohr, > > > > I got the below dmesg and the first bad commit is > > > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master > > commit 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 > > Author: Davidlohr Bueso > > AuthorDate: Thu Mar 6 11:01:48 2014 +1100 > > Commit: Stephen Rothwell > > CommitDate: Thu Mar 6 11:01:48 2014 +1100 > > > > mm: per-thread vma caching > > > > hwclock: can't open '/dev/misc/rtc': No such file or directory > > Running postinst /etc/rpm-postinsts/100... > > Hmm this kind of errors strike me as dealing with some bogus vma from a > stale cache. > > > [3.658976] BUG: unable to handle kernel paging request at > > 880008142f40 > > [3.661422] IP: [] vmacache_find+0x78/0x90 > > [3.662223] PGD 2542067 PUD 2543067 PMD fba5067 PTE 88142060 > > [3.662223] Oops: [#1] DEBUG_PAGEALLOC > > [3.662223] Modules linked in: > > [3.662223] CPU: 0 PID: 326 Comm: 90-trinity Not tainted > > 3.14.0-rc5-next-20140307 #1 > > Have you only seen this through DEBUG_PAGEALLOC + trinity? Yes. > > [3.662223] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > [3.662223] task: 8800083020d0 ti: 8800082a task.ti: > > 8800082a > > [3.662223] RIP: 0010:[] [] > > vmacache_find+0x78/0x90 > > [3.662223] RSP: :8800082a1e00 EFLAGS: 00010282 > > [3.662223] RAX: 880008142f40 RBX: 00a9 RCX: > > 8800083020d0 > > [3.662223] RDX: 0002 RSI: 7fff8a141698 RDI: > > 880008124bc0 > > [3.662223] RBP: 8800082a1e00 R08: R09: > > 0001 > > [3.662223] R10: 8800083020d0 R11: R12: > > 7fff8a141698 > > [3.662223] R13: 880008124bc0 R14: 8800082a1f58 R15: > > 8800083020d0 > > [3.662223] FS: 7fe3ca364700() GS:81a06000() > > knlGS: > > [3.662223] CS: 0010 DS: ES: CR0: 80050033 > > [3.662223] CR2: 880008142f40 CR3: 0824c000 CR4: > > 06b0 > > [3.662223] Stack: > > [3.662223] 8800082a1e28 81125219 00a9 > > 7fff8a141698 > > [3.662223] 880008124bc0 8800082a1f28 816d71fe > > 0246 > > [3.662223] 0002 880008124c58 0006 > > 0010 > > [3.662223] Call Trace: > > [3.662223] [] find_vma+0x19/0x70 > > [3.662223] [] __do_page_fault+0x29e/0x560 > > [3.662223] [] ? mntput_no_expire+0x6f/0x1a0 > > [3.662223] [] ? mntput_no_expire+0x11/0x1a0 > > [3.662223] [] ? mntput+0x35/0x40 > > [3.662223] [] ? __fput+0x24f/0x290 > > [3.662223] [] ? trace_hardirqs_off_thunk+0x3a/0x3c > > [3.662223] [] do_page_fault+0xe/0x10 > > [3.662223] [] do_async_page_fault+0x35/0x90 > > [3.662223] [] async_page_fault+0x25/0x30 > > [3.662223] Code: c7 81 b0 02 00 00 00 00 00 00 eb 32 0f 1f 80 00 00 00 > > 00 31 d2 66 0f 1f 44 00 00 48 63 c2 48 8b 84 c1 98 02 00 00 48 85 c0 74 0b > > <48> 39 30 77 06 48 3b 70 08 72 0a 83 c2 01 83 fa 04 75 dd 31 c0 > > [3.662223] RIP [] vmacache_find+0x78/0x90 > > So this is: >0: c7 81 b0 02 00 00 00movl $0x0,0x2b0(%rcx) >7: 00 00 00 >a: eb 32 jmp0x3e >c: 0f 1f 80 00 00 00 00nopl 0x0(%rax) > 13: 31 d2 xor%edx,%edx > 15: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) > 1b: 48 63 c2movslq %edx,%rax > 1e: 48 8b 84 c1 98 02 00mov0x298(%rcx,%rax,8),%rax > 25: 00 > 26: 48 85 c0test %rax,%rax > 29: 74 0b je 0x36 > 2b:* 48 39 30cmp%rsi,(%rax) <-- trapping > instruction > > > which seems to be the following, where vma is stale: > if (vma && vma->vm_start <= addr && vma->vm_end > addr) > return vma; > > > 2e: 77 06 ja 0x36 > 30: 48 3b 70 08 cmp0x8(%rax),%rsi > 34: 72 0a jb 0x40 > 36: 83 c2 01add$0x1,%edx > 39: 83 fa 04cmp$0x4,%edx > 3c: 75 dd jne0x1b > 3e: 31 c0 xor%eax,%eax > > Could you please try this fix: https://lkml.org/lkml/2014/3/10/505 - it > is fix for a race Oleg found that can cause us to keep bogus vmas in the > cache under certain VM_CLONE scenarios. With the patch applied on top of 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 I still get this: sed: /lib/modules/3.14.0-rc5-00226-g0d9ad422-dirty/modules.dep: No such file or directory xargs: modprobe: No such file or directory run-parts: /etc/kernel-tests/01-modprobe exited with code 127 [3.807108] BUG:
Re: [vma caching] BUG: unable to handle kernel paging request at ffff880008142f40
Hi Davidlohr, On Mon, Mar 10, 2014 at 02:08:08PM -0700, Davidlohr Bueso wrote: On Mon, 2014-03-10 at 10:43 +0800, Fengguang Wu wrote: Hi Davidlohr, I got the below dmesg and the first bad commit is git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master commit 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 Author: Davidlohr Bueso davidl...@hp.com AuthorDate: Thu Mar 6 11:01:48 2014 +1100 Commit: Stephen Rothwell s...@canb.auug.org.au CommitDate: Thu Mar 6 11:01:48 2014 +1100 mm: per-thread vma caching hwclock: can't open '/dev/misc/rtc': No such file or directory Running postinst /etc/rpm-postinsts/100... Hmm this kind of errors strike me as dealing with some bogus vma from a stale cache. [3.658976] BUG: unable to handle kernel paging request at 880008142f40 [3.661422] IP: [8111a1d8] vmacache_find+0x78/0x90 [3.662223] PGD 2542067 PUD 2543067 PMD fba5067 PTE 88142060 [3.662223] Oops: [#1] DEBUG_PAGEALLOC [3.662223] Modules linked in: [3.662223] CPU: 0 PID: 326 Comm: 90-trinity Not tainted 3.14.0-rc5-next-20140307 #1 Have you only seen this through DEBUG_PAGEALLOC + trinity? Yes. [3.662223] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [3.662223] task: 8800083020d0 ti: 8800082a task.ti: 8800082a [3.662223] RIP: 0010:[8111a1d8] [8111a1d8] vmacache_find+0x78/0x90 [3.662223] RSP: :8800082a1e00 EFLAGS: 00010282 [3.662223] RAX: 880008142f40 RBX: 00a9 RCX: 8800083020d0 [3.662223] RDX: 0002 RSI: 7fff8a141698 RDI: 880008124bc0 [3.662223] RBP: 8800082a1e00 R08: R09: 0001 [3.662223] R10: 8800083020d0 R11: R12: 7fff8a141698 [3.662223] R13: 880008124bc0 R14: 8800082a1f58 R15: 8800083020d0 [3.662223] FS: 7fe3ca364700() GS:81a06000() knlGS: [3.662223] CS: 0010 DS: ES: CR0: 80050033 [3.662223] CR2: 880008142f40 CR3: 0824c000 CR4: 06b0 [3.662223] Stack: [3.662223] 8800082a1e28 81125219 00a9 7fff8a141698 [3.662223] 880008124bc0 8800082a1f28 816d71fe 0246 [3.662223] 0002 880008124c58 0006 0010 [3.662223] Call Trace: [3.662223] [81125219] find_vma+0x19/0x70 [3.662223] [816d71fe] __do_page_fault+0x29e/0x560 [3.662223] [8116cc6f] ? mntput_no_expire+0x6f/0x1a0 [3.662223] [8116cc11] ? mntput_no_expire+0x11/0x1a0 [3.662223] [8116cdd5] ? mntput+0x35/0x40 [3.662223] [8114f51f] ? __fput+0x24f/0x290 [3.662223] [812794ca] ? trace_hardirqs_off_thunk+0x3a/0x3c [3.662223] [816d74ce] do_page_fault+0xe/0x10 [3.662223] [816d6ad5] do_async_page_fault+0x35/0x90 [3.662223] [816d3b05] async_page_fault+0x25/0x30 [3.662223] Code: c7 81 b0 02 00 00 00 00 00 00 eb 32 0f 1f 80 00 00 00 00 31 d2 66 0f 1f 44 00 00 48 63 c2 48 8b 84 c1 98 02 00 00 48 85 c0 74 0b 48 39 30 77 06 48 3b 70 08 72 0a 83 c2 01 83 fa 04 75 dd 31 c0 [3.662223] RIP [8111a1d8] vmacache_find+0x78/0x90 So this is: 0: c7 81 b0 02 00 00 00movl $0x0,0x2b0(%rcx) 7: 00 00 00 a: eb 32 jmp0x3e c: 0f 1f 80 00 00 00 00nopl 0x0(%rax) 13: 31 d2 xor%edx,%edx 15: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 1b: 48 63 c2movslq %edx,%rax 1e: 48 8b 84 c1 98 02 00mov0x298(%rcx,%rax,8),%rax 25: 00 26: 48 85 c0test %rax,%rax 29: 74 0b je 0x36 2b:* 48 39 30cmp%rsi,(%rax) -- trapping instruction which seems to be the following, where vma is stale: if (vma vma-vm_start = addr vma-vm_end addr) return vma; 2e: 77 06 ja 0x36 30: 48 3b 70 08 cmp0x8(%rax),%rsi 34: 72 0a jb 0x40 36: 83 c2 01add$0x1,%edx 39: 83 fa 04cmp$0x4,%edx 3c: 75 dd jne0x1b 3e: 31 c0 xor%eax,%eax Could you please try this fix: https://lkml.org/lkml/2014/3/10/505 - it is fix for a race Oleg found that can cause us to keep bogus vmas in the cache under certain VM_CLONE scenarios. With the patch applied on top of 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 I still get this: sed: /lib/modules/3.14.0-rc5-00226-g0d9ad422-dirty/modules.dep: No such file or directory xargs: modprobe: No such file or directory
Re: [vma caching] BUG: unable to handle kernel paging request at ffff880008142f40
On Mon, 2014-03-10 at 10:43 +0800, Fengguang Wu wrote: > Hi Davidlohr, > > I got the below dmesg and the first bad commit is > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master > commit 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 > Author: Davidlohr Bueso > AuthorDate: Thu Mar 6 11:01:48 2014 +1100 > Commit: Stephen Rothwell > CommitDate: Thu Mar 6 11:01:48 2014 +1100 > > mm: per-thread vma caching > > hwclock: can't open '/dev/misc/rtc': No such file or directory > Running postinst /etc/rpm-postinsts/100... Hmm this kind of errors strike me as dealing with some bogus vma from a stale cache. > [3.658976] BUG: unable to handle kernel paging request at 880008142f40 > [3.661422] IP: [] vmacache_find+0x78/0x90 > [3.662223] PGD 2542067 PUD 2543067 PMD fba5067 PTE 88142060 > [3.662223] Oops: [#1] DEBUG_PAGEALLOC > [3.662223] Modules linked in: > [3.662223] CPU: 0 PID: 326 Comm: 90-trinity Not tainted > 3.14.0-rc5-next-20140307 #1 Have you only seen this through DEBUG_PAGEALLOC + trinity? > [3.662223] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [3.662223] task: 8800083020d0 ti: 8800082a task.ti: > 8800082a > [3.662223] RIP: 0010:[] [] > vmacache_find+0x78/0x90 > [3.662223] RSP: :8800082a1e00 EFLAGS: 00010282 > [3.662223] RAX: 880008142f40 RBX: 00a9 RCX: > 8800083020d0 > [3.662223] RDX: 0002 RSI: 7fff8a141698 RDI: > 880008124bc0 > [3.662223] RBP: 8800082a1e00 R08: R09: > 0001 > [3.662223] R10: 8800083020d0 R11: R12: > 7fff8a141698 > [3.662223] R13: 880008124bc0 R14: 8800082a1f58 R15: > 8800083020d0 > [3.662223] FS: 7fe3ca364700() GS:81a06000() > knlGS: > [3.662223] CS: 0010 DS: ES: CR0: 80050033 > [3.662223] CR2: 880008142f40 CR3: 0824c000 CR4: > 06b0 > [3.662223] Stack: > [3.662223] 8800082a1e28 81125219 00a9 > 7fff8a141698 > [3.662223] 880008124bc0 8800082a1f28 816d71fe > 0246 > [3.662223] 0002 880008124c58 0006 > 0010 > [3.662223] Call Trace: > [3.662223] [] find_vma+0x19/0x70 > [3.662223] [] __do_page_fault+0x29e/0x560 > [3.662223] [] ? mntput_no_expire+0x6f/0x1a0 > [3.662223] [] ? mntput_no_expire+0x11/0x1a0 > [3.662223] [] ? mntput+0x35/0x40 > [3.662223] [] ? __fput+0x24f/0x290 > [3.662223] [] ? trace_hardirqs_off_thunk+0x3a/0x3c > [3.662223] [] do_page_fault+0xe/0x10 > [3.662223] [] do_async_page_fault+0x35/0x90 > [3.662223] [] async_page_fault+0x25/0x30 > [3.662223] Code: c7 81 b0 02 00 00 00 00 00 00 eb 32 0f 1f 80 00 00 00 00 > 31 d2 66 0f 1f 44 00 00 48 63 c2 48 8b 84 c1 98 02 00 00 48 85 c0 74 0b <48> > 39 30 77 06 48 3b 70 08 72 0a 83 c2 01 83 fa 04 75 dd 31 c0 > [3.662223] RIP [] vmacache_find+0x78/0x90 So this is: 0: c7 81 b0 02 00 00 00movl $0x0,0x2b0(%rcx) 7: 00 00 00 a: eb 32 jmp0x3e c: 0f 1f 80 00 00 00 00nopl 0x0(%rax) 13: 31 d2 xor%edx,%edx 15: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 1b: 48 63 c2movslq %edx,%rax 1e: 48 8b 84 c1 98 02 00mov0x298(%rcx,%rax,8),%rax 25: 00 26: 48 85 c0test %rax,%rax 29: 74 0b je 0x36 2b:* 48 39 30cmp%rsi,(%rax) <-- trapping instruction which seems to be the following, where vma is stale: if (vma && vma->vm_start <= addr && vma->vm_end > addr) return vma; 2e: 77 06 ja 0x36 30: 48 3b 70 08 cmp0x8(%rax),%rsi 34: 72 0a jb 0x40 36: 83 c2 01add$0x1,%edx 39: 83 fa 04cmp$0x4,%edx 3c: 75 dd jne0x1b 3e: 31 c0 xor%eax,%eax Could you please try this fix: https://lkml.org/lkml/2014/3/10/505 - it is fix for a race Oleg found that can cause us to keep bogus vmas in the cache under certain VM_CLONE scenarios. Also, how frequently are you able to trigger this? Thanks for the report. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [vma caching] BUG: unable to handle kernel paging request at ffff880008142f40
On Mon, 2014-03-10 at 10:43 +0800, Fengguang Wu wrote: Hi Davidlohr, I got the below dmesg and the first bad commit is git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master commit 0d9ad4220e6d73f63a9eeeaac031b92838f75bb3 Author: Davidlohr Bueso davidl...@hp.com AuthorDate: Thu Mar 6 11:01:48 2014 +1100 Commit: Stephen Rothwell s...@canb.auug.org.au CommitDate: Thu Mar 6 11:01:48 2014 +1100 mm: per-thread vma caching hwclock: can't open '/dev/misc/rtc': No such file or directory Running postinst /etc/rpm-postinsts/100... Hmm this kind of errors strike me as dealing with some bogus vma from a stale cache. [3.658976] BUG: unable to handle kernel paging request at 880008142f40 [3.661422] IP: [8111a1d8] vmacache_find+0x78/0x90 [3.662223] PGD 2542067 PUD 2543067 PMD fba5067 PTE 88142060 [3.662223] Oops: [#1] DEBUG_PAGEALLOC [3.662223] Modules linked in: [3.662223] CPU: 0 PID: 326 Comm: 90-trinity Not tainted 3.14.0-rc5-next-20140307 #1 Have you only seen this through DEBUG_PAGEALLOC + trinity? [3.662223] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [3.662223] task: 8800083020d0 ti: 8800082a task.ti: 8800082a [3.662223] RIP: 0010:[8111a1d8] [8111a1d8] vmacache_find+0x78/0x90 [3.662223] RSP: :8800082a1e00 EFLAGS: 00010282 [3.662223] RAX: 880008142f40 RBX: 00a9 RCX: 8800083020d0 [3.662223] RDX: 0002 RSI: 7fff8a141698 RDI: 880008124bc0 [3.662223] RBP: 8800082a1e00 R08: R09: 0001 [3.662223] R10: 8800083020d0 R11: R12: 7fff8a141698 [3.662223] R13: 880008124bc0 R14: 8800082a1f58 R15: 8800083020d0 [3.662223] FS: 7fe3ca364700() GS:81a06000() knlGS: [3.662223] CS: 0010 DS: ES: CR0: 80050033 [3.662223] CR2: 880008142f40 CR3: 0824c000 CR4: 06b0 [3.662223] Stack: [3.662223] 8800082a1e28 81125219 00a9 7fff8a141698 [3.662223] 880008124bc0 8800082a1f28 816d71fe 0246 [3.662223] 0002 880008124c58 0006 0010 [3.662223] Call Trace: [3.662223] [81125219] find_vma+0x19/0x70 [3.662223] [816d71fe] __do_page_fault+0x29e/0x560 [3.662223] [8116cc6f] ? mntput_no_expire+0x6f/0x1a0 [3.662223] [8116cc11] ? mntput_no_expire+0x11/0x1a0 [3.662223] [8116cdd5] ? mntput+0x35/0x40 [3.662223] [8114f51f] ? __fput+0x24f/0x290 [3.662223] [812794ca] ? trace_hardirqs_off_thunk+0x3a/0x3c [3.662223] [816d74ce] do_page_fault+0xe/0x10 [3.662223] [816d6ad5] do_async_page_fault+0x35/0x90 [3.662223] [816d3b05] async_page_fault+0x25/0x30 [3.662223] Code: c7 81 b0 02 00 00 00 00 00 00 eb 32 0f 1f 80 00 00 00 00 31 d2 66 0f 1f 44 00 00 48 63 c2 48 8b 84 c1 98 02 00 00 48 85 c0 74 0b 48 39 30 77 06 48 3b 70 08 72 0a 83 c2 01 83 fa 04 75 dd 31 c0 [3.662223] RIP [8111a1d8] vmacache_find+0x78/0x90 So this is: 0: c7 81 b0 02 00 00 00movl $0x0,0x2b0(%rcx) 7: 00 00 00 a: eb 32 jmp0x3e c: 0f 1f 80 00 00 00 00nopl 0x0(%rax) 13: 31 d2 xor%edx,%edx 15: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 1b: 48 63 c2movslq %edx,%rax 1e: 48 8b 84 c1 98 02 00mov0x298(%rcx,%rax,8),%rax 25: 00 26: 48 85 c0test %rax,%rax 29: 74 0b je 0x36 2b:* 48 39 30cmp%rsi,(%rax) -- trapping instruction which seems to be the following, where vma is stale: if (vma vma-vm_start = addr vma-vm_end addr) return vma; 2e: 77 06 ja 0x36 30: 48 3b 70 08 cmp0x8(%rax),%rsi 34: 72 0a jb 0x40 36: 83 c2 01add$0x1,%edx 39: 83 fa 04cmp$0x4,%edx 3c: 75 dd jne0x1b 3e: 31 c0 xor%eax,%eax Could you please try this fix: https://lkml.org/lkml/2014/3/10/505 - it is fix for a race Oleg found that can cause us to keep bogus vmas in the cache under certain VM_CLONE scenarios. Also, how frequently are you able to trigger this? Thanks for the report. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/