Re: oops in kernel ( 3.4.x -> 3.5rc )
Hello,
the latest git master checkout seems to correct this problem.
Thanks,
Nicolas Prochazka.

2012/7/28 nicolas prochazka <prochazka.nico...@gmail.com>:
> Hello again,
> git bisect gives (after 13 steps):
> 58bca4a8fa90fcf9069379653b396b2cec642f7f is the first bad commit
>
> Regards,
> Nicolas Prochazka
>
> 2012/7/24 Thadeu Lima de Souza Cascardo <casca...@linux.vnet.ibm.com>:
>> On Mon, Jul 23, 2012 at 11:15:09PM +0200, nicolas prochazka wrote:
>>> Hello,
>>> I'm trying different versions with different results:
>>>
>>> - commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215 : bug is not reproducible
>>> - 3.5rc7 : bug is reproducible (cf. new dump)
>>> - master branch : bug is reproducible
>>>
>>> Regards,
>>> Nicolas Prochazka
>>
>> Hi, Nicolas.
>>
>> Can you try a bisect? I am also copying David Howells and Bobby Powers,
>> even though their set of patches don't seem to be the culprit.
>>
>> Regards.
>> Cascardo.
>>
>>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001002f
>>> Jul 23 21:10:05 positronic18696 IP: [81156900] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
>>> Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
>>> Jul 23 21:10:05 positronic18696 CPU 1
>>> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
>>> Jul 23 21:10:05 positronic18696
>>> Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
>>> Jul 23 21:10:05 positronic18696 RIP: 0010:[81156900] [81156900] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90 EFLAGS: 00010206
>>> Jul 23 21:10:05 positronic18696 RAX: 0038 RBX: 8807ed95eec0 RCX: 0007
>>> Jul 23 21:10:05 positronic18696 RDX: RSI: 0800 RDI: 8805a4d01d40
>>> Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08: 0020 R09: 81156694
>>> Jul 23 21:10:05 positronic18696 R10: 0001 R11: R12: 8807ecf25000
>>> Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14: 0100 R15: 8807d3a61800
>>> Jul 23 21:10:05 positronic18696 FS: 7f1a8e719700() GS:88083fc4() knlGS:
>>> Jul 23 21:10:05 positronic18696 CS: 0010 DS: ES: CR0: 8005003b
>>> Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3: 00066ef9 CR4: 27e0
>>> Jul 23 21:10:05 positronic18696 DR0: 0001 DR1: 0002 DR2: 0001
>>> Jul 23 21:10:05 positronic18696 DR3: 000a DR6: 0ff0 DR7: 0400
>>> Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596, threadinfo 880669ebc000, task 8806625ab000)
>>> Jul 23 21:10:05 positronic18696 Stack:
>>> Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49 0020 8806627ee8c0
>>> Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680 7f1a8e7199d0 8807bf41d000
>>> Jul 23 21:10:05 positronic18696 01200011 7f1a8e7199d0
>>> Jul 23 21:10:05 positronic18696 Call Trace:
>>> Jul 23 21:10:05 positronic18696 [81040441] copy_process+0x931/0x13c0
>>> Jul 23 21:10:05 positronic18696 [81041024] do_fork+0x54/0x360
>>> Jul 23 21:10:05 positronic18696 [81ac3b7e] ? _raw_spin_lock+0xe/0x20
>>> Jul 23 21:10:05 positronic18696 [810559e7] ? __set_task_blocked+0x37/0x80
>>> Jul 23 21:10:05 positronic18696 [81055a83] ? __set_current_blocked+0x53/0x70
>>> Jul 23 21:10:05 positronic18696 [8100c098] sys_clone+0x28/0x30
>>> Jul 23 21:10:05 positronic18696 [81ac4bb3] stub_clone+0x13/0x20
>>> Jul 23 21:10:05 positronic18696 [81ac4929] ? system_call_fastpath+0x16/0x1b
>>> Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31 c0 eb 15 90 f0 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74 24 49
>>> Jul 23 21:10:05 positronic18696 RIP [81156900] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 RSP 880669ebdd90
>>> Jul 23 21:10:05 positronic18696 CR2: 0001002f
>>> Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
>>> Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmtap38 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 device vmEtap38 left promiscuous mode
>>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001003b
>>> Jul 23 21:10:05 positronic18696 IP: [8119f684] tid_fd_revalidate+0x84/0x1a0
>>> Jul 23 21:10:05
Re: oops in kernel ( 3.4.x -> 3.5rc )
Hello again,
git bisect gives (after 13 steps):
58bca4a8fa90fcf9069379653b396b2cec642f7f is the first bad commit

Regards,
Nicolas Prochazka

2012/7/24 Thadeu Lima de Souza Cascardo <casca...@linux.vnet.ibm.com>:
> On Mon, Jul 23, 2012 at 11:15:09PM +0200, nicolas prochazka wrote:
>> Hello,
>> I'm trying different versions with different results:
>>
>> - commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215 : bug is not reproducible
>> - 3.5rc7 : bug is reproducible (cf. new dump)
>> - master branch : bug is reproducible
>>
>> Regards,
>> Nicolas Prochazka
>
> Hi, Nicolas.
>
> Can you try a bisect? I am also copying David Howells and Bobby Powers,
> even though their set of patches don't seem to be the culprit.
>
> Regards.
> Cascardo.
>
>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001002f
>> Jul 23 21:10:05 positronic18696 IP: [81156900] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
>> Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
>> Jul 23 21:10:05 positronic18696 CPU 1
>> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
>> Jul 23 21:10:05 positronic18696
>> Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
>> Jul 23 21:10:05 positronic18696 RIP: 0010:[81156900] [81156900] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90 EFLAGS: 00010206
>> Jul 23 21:10:05 positronic18696 RAX: 0038 RBX: 8807ed95eec0 RCX: 0007
>> Jul 23 21:10:05 positronic18696 RDX: RSI: 0800 RDI: 8805a4d01d40
>> Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08: 0020 R09: 81156694
>> Jul 23 21:10:05 positronic18696 R10: 0001 R11: R12: 8807ecf25000
>> Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14: 0100 R15: 8807d3a61800
>> Jul 23 21:10:05 positronic18696 FS: 7f1a8e719700() GS:88083fc4() knlGS:
>> Jul 23 21:10:05 positronic18696 CS: 0010 DS: ES: CR0: 8005003b
>> Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3: 00066ef9 CR4: 27e0
>> Jul 23 21:10:05 positronic18696 DR0: 0001 DR1: 0002 DR2: 0001
>> Jul 23 21:10:05 positronic18696 DR3: 000a DR6: 0ff0 DR7: 0400
>> Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596, threadinfo 880669ebc000, task 8806625ab000)
>> Jul 23 21:10:05 positronic18696 Stack:
>> Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49 0020 8806627ee8c0
>> Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680 7f1a8e7199d0 8807bf41d000
>> Jul 23 21:10:05 positronic18696 01200011 7f1a8e7199d0
>> Jul 23 21:10:05 positronic18696 Call Trace:
>> Jul 23 21:10:05 positronic18696 [81040441] copy_process+0x931/0x13c0
>> Jul 23 21:10:05 positronic18696 [81041024] do_fork+0x54/0x360
>> Jul 23 21:10:05 positronic18696 [81ac3b7e] ? _raw_spin_lock+0xe/0x20
>> Jul 23 21:10:05 positronic18696 [810559e7] ? __set_task_blocked+0x37/0x80
>> Jul 23 21:10:05 positronic18696 [81055a83] ? __set_current_blocked+0x53/0x70
>> Jul 23 21:10:05 positronic18696 [8100c098] sys_clone+0x28/0x30
>> Jul 23 21:10:05 positronic18696 [81ac4bb3] stub_clone+0x13/0x20
>> Jul 23 21:10:05 positronic18696 [81ac4929] ? system_call_fastpath+0x16/0x1b
>> Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31 c0 eb 15 90 f0 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74 24 49
>> Jul 23 21:10:05 positronic18696 RIP [81156900] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 RSP 880669ebdd90
>> Jul 23 21:10:05 positronic18696 CR2: 0001002f
>> Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
>> Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmtap38 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap38 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001003b
>> Jul 23 21:10:05 positronic18696 IP: [8119f684] tid_fd_revalidate+0x84/0x1a0
>> Jul 23 21:10:05 positronic18696 PGD 598c9e067 PUD 0
>> Jul 23 21:10:05 positronic18696 Oops: [#2] SMP
>> Jul 23 21:10:05 positronic18696 CPU 0
>> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
>> Jul 23 21:10:05
Re: oops in kernel ( 3.4.x -> 3.5rc )
On Mon, Jul 23, 2012 at 11:15:09PM +0200, nicolas prochazka wrote:
> Hello,
> I'm trying different versions with different results:
>
> - commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215 : bug is not reproducible
> - 3.5rc7 : bug is reproducible (cf. new dump)
> - master branch : bug is reproducible
>
> Regards,
> Nicolas Prochazka

Hi, Nicolas.

Can you try a bisect? I am also copying David Howells and Bobby Powers,
even though their set of patches don't seem to be the culprit.

Regards.
Cascardo.

> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001002f
> Jul 23 21:10:05 positronic18696 IP: [81156900] dup_fd+0x160/0x2e0
> Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
> Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
> Jul 23 21:10:05 positronic18696 CPU 1
> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
> Jul 23 21:10:05 positronic18696
> Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
> Jul 23 21:10:05 positronic18696 RIP: 0010:[81156900] [81156900] dup_fd+0x160/0x2e0
> Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90 EFLAGS: 00010206
> Jul 23 21:10:05 positronic18696 RAX: 0038 RBX: 8807ed95eec0 RCX: 0007
> Jul 23 21:10:05 positronic18696 RDX: RSI: 0800 RDI: 8805a4d01d40
> Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08: 0020 R09: 81156694
> Jul 23 21:10:05 positronic18696 R10: 0001 R11: R12: 8807ecf25000
> Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14: 0100 R15: 8807d3a61800
> Jul 23 21:10:05 positronic18696 FS: 7f1a8e719700() GS:88083fc4() knlGS:
> Jul 23 21:10:05 positronic18696 CS: 0010 DS: ES: CR0: 8005003b
> Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3: 00066ef9 CR4: 27e0
> Jul 23 21:10:05 positronic18696 DR0: 0001 DR1: 0002 DR2: 0001
> Jul 23 21:10:05 positronic18696 DR3: 000a DR6: 0ff0 DR7: 0400
> Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596, threadinfo 880669ebc000, task 8806625ab000)
> Jul 23 21:10:05 positronic18696 Stack:
> Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49 0020 8806627ee8c0
> Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680 7f1a8e7199d0 8807bf41d000
> Jul 23 21:10:05 positronic18696 01200011 7f1a8e7199d0
> Jul 23 21:10:05 positronic18696 Call Trace:
> Jul 23 21:10:05 positronic18696 [81040441] copy_process+0x931/0x13c0
> Jul 23 21:10:05 positronic18696 [81041024] do_fork+0x54/0x360
> Jul 23 21:10:05 positronic18696 [81ac3b7e] ? _raw_spin_lock+0xe/0x20
> Jul 23 21:10:05 positronic18696 [810559e7] ? __set_task_blocked+0x37/0x80
> Jul 23 21:10:05 positronic18696 [81055a83] ? __set_current_blocked+0x53/0x70
> Jul 23 21:10:05 positronic18696 [8100c098] sys_clone+0x28/0x30
> Jul 23 21:10:05 positronic18696 [81ac4bb3] stub_clone+0x13/0x20
> Jul 23 21:10:05 positronic18696 [81ac4929] ? system_call_fastpath+0x16/0x1b
> Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31 c0 eb 15 90 f0 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74 24 49
> Jul 23 21:10:05 positronic18696 RIP [81156900] dup_fd+0x160/0x2e0
> Jul 23 21:10:05 positronic18696 RSP 880669ebdd90
> Jul 23 21:10:05 positronic18696 CR2: 0001002f
> Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
> Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmtap38 left promiscuous mode
> Jul 23 21:10:05 positronic18696 device vmEtap38 left promiscuous mode
> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001003b
> Jul 23 21:10:05 positronic18696 IP: [8119f684] tid_fd_revalidate+0x84/0x1a0
> Jul 23 21:10:05 positronic18696 PGD 598c9e067 PUD 0
> Jul 23 21:10:05 positronic18696 Oops: [#2] SMP
> Jul 23 21:10:05 positronic18696 CPU 0
> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
> Jul 23 21:10:05 positronic18696
> Jul 23 21:10:05 positronic18696 Pid: 21815, comm: netstat Tainted: G D 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
> Jul 23 21:10:05 positronic18696 RIP: 0010:[8119f684] [8119f684] tid_fd_revalidate+0x84/0x1a0
> Jul 23 21:10:05 positronic18696 RSP: 0018:88053f965d78 EFLAGS:
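
For readers following the quoted dump: a trace entry such as `dup_fd+0x160/0x2e0` means the fault hit 0x160 bytes into `dup_fd`, a function 0x2e0 bytes long, and on x86 the `Oops:` value is the page-fault error code, whose low bits say whether the page was present, whether the access was a write, and whether it came from user mode. The following is a minimal sketch for pulling those fields out of log lines like the ones above. The helper names `parse_symbol` and `decode_x86_fault_code` are made up for this example and are not part of any kernel tool.

```python
import re

# symbol+0xOFFSET/0xSIZE, as printed in the IP line and the call trace
SYMBOL_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_symbol(line):
    """Return (symbol, offset, size) from an oops line, or None if absent."""
    m = SYMBOL_RE.search(line)
    if m is None:
        return None
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)


def decode_x86_fault_code(code):
    """Decode the three low bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not present
        "write": bool(code & 0x2),         # 1 means a write access
        "user_mode": bool(code & 0x4),     # 1 means the fault came from user mode
    }


sym = parse_symbol("IP: [81156900] dup_fd+0x160/0x2e0")
fault = decode_x86_fault_code(int("0002", 16))
print(sym)
print(fault)
```

Applied to the first oops, this reads as a kernel-mode write to a page that was not present, raised inside `dup_fd`.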
Re: oops in kernel ( 3.4.x -> 3.5rc )
Hello,
I'm trying different versions with different results:

- commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215 : bug is not reproducible
- 3.5rc7 : bug is reproducible (cf. new dump)
- master branch : bug is reproducible

Regards,
Nicolas Prochazka

Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001002f
Jul 23 21:10:05 positronic18696 IP: [81156900] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
Jul 23 21:10:05 positronic18696 CPU 1
Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
Jul 23 21:10:05 positronic18696
Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
Jul 23 21:10:05 positronic18696 RIP: 0010:[81156900] [81156900] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90 EFLAGS: 00010206
Jul 23 21:10:05 positronic18696 RAX: 0038 RBX: 8807ed95eec0 RCX: 0007
Jul 23 21:10:05 positronic18696 RDX: RSI: 0800 RDI: 8805a4d01d40
Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08: 0020 R09: 81156694
Jul 23 21:10:05 positronic18696 R10: 0001 R11: R12: 8807ecf25000
Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14: 0100 R15: 8807d3a61800
Jul 23 21:10:05 positronic18696 FS: 7f1a8e719700() GS:88083fc4() knlGS:
Jul 23 21:10:05 positronic18696 CS: 0010 DS: ES: CR0: 8005003b
Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3: 00066ef9 CR4: 27e0
Jul 23 21:10:05 positronic18696 DR0: 0001 DR1: 0002 DR2: 0001
Jul 23 21:10:05 positronic18696 DR3: 000a DR6: 0ff0 DR7: 0400
Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596, threadinfo 880669ebc000, task 8806625ab000)
Jul 23 21:10:05 positronic18696 Stack:
Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49 0020 8806627ee8c0
Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680 7f1a8e7199d0 8807bf41d000
Jul 23 21:10:05 positronic18696 01200011 7f1a8e7199d0
Jul 23 21:10:05 positronic18696 Call Trace:
Jul 23 21:10:05 positronic18696 [81040441] copy_process+0x931/0x13c0
Jul 23 21:10:05 positronic18696 [81041024] do_fork+0x54/0x360
Jul 23 21:10:05 positronic18696 [81ac3b7e] ? _raw_spin_lock+0xe/0x20
Jul 23 21:10:05 positronic18696 [810559e7] ? __set_task_blocked+0x37/0x80
Jul 23 21:10:05 positronic18696 [81055a83] ? __set_current_blocked+0x53/0x70
Jul 23 21:10:05 positronic18696 [8100c098] sys_clone+0x28/0x30
Jul 23 21:10:05 positronic18696 [81ac4bb3] stub_clone+0x13/0x20
Jul 23 21:10:05 positronic18696 [81ac4929] ? system_call_fastpath+0x16/0x1b
Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31 c0 eb 15 90 f0 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74 24 49
Jul 23 21:10:05 positronic18696 RIP [81156900] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 RSP 880669ebdd90
Jul 23 21:10:05 positronic18696 CR2: 0001002f
Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap38 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap38 left promiscuous mode
Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging request at 0001003b
Jul 23 21:10:05 positronic18696 IP: [8119f684] tid_fd_revalidate+0x84/0x1a0
Jul 23 21:10:05 positronic18696 PGD 598c9e067 PUD 0
Jul 23 21:10:05 positronic18696 Oops: [#2] SMP
Jul 23 21:10:05 positronic18696 CPU 0
Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
Jul 23 21:10:05 positronic18696
Jul 23 21:10:05 positronic18696 Pid: 21815, comm: netstat Tainted: G D 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
Jul 23 21:10:05 positronic18696 RIP: 0010:[8119f684] [8119f684] tid_fd_revalidate+0x84/0x1a0
Jul 23 21:10:05 positronic18696 RSP: 0018:88053f965d78 EFLAGS: 00010206
Jul 23 21:10:05 positronic18696 RAX: 8807ee3bf700 RBX: 8807ed1d9118 RCX: 007e
Jul 23 21:10:05 positronic18696 RDX: RSI: RDI: 8807ee3bf700
Jul 23 21:10:05 positronic18696 RBP: 88053f965d98 R08: 88083fc16a10 R09: 8119bfe0
Jul 23 21:10:05 positronic18696 R10: R11: 0206 R12: 8807d8afaa80
Jul 23 21:10:05
Re: oops in kernel ( 3.4.x -> 3.5rc )
Well done 1fd36adcd98c14d2fd97f545293c488775cb2823 : the bug occurs ( cf dump ) 1dce27c5aa6770e9d195f2bb7db1db3d4dde5591 : the bug not occurs Regards, Nicolas Prochazka. dump / 1fd36adcd98c14d2fd97f545293c488775cb2823 lloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! VMtap: no IPv6 routers present alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! alloc_fd: slot 121 not NULL! alloc_fd: slot 96 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 110 not NULL! alloc_fd: slot 121 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! brE: no IPv6 routers present alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! 
alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 121 not NULL! alloc_fd: slot 142 not NULL! alloc_fd: slot 153 not NULL! alloc_fd: slot 153 not NULL! alloc_fd: slot 153 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 70 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 102 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 106 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 106 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 36 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 100 not NULL! 
alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 68 not NULL! alloc_fd: slot 100 not NULL! alloc_fd: slot 100 not NULL! [ cut here ] kernel BUG at fs/open.c:873! invalid opcode: [#1] SMP CPU 0 Modules linked in: kvm_intel kvm then BUG paging request as usual 2012/7/20 Thadeu Lima de Souza Cascardo : > On Fri, Jul 20, 2012 at 10:52:40PM +0200, nicolas prochazka wrote: >> Hello >> the problem is occured with : >> - linux kernel 3.4.5i do not test with 3.4.0 / 1 / 2 / 3 / 4, >> but i can if you want >> - linux kernel 3.5rc6 rc7 / do not test with other rc. >> >> the problem is not occured with : >> linux kernel 3.3.4 / 3.3.8 >> >> These servers are used for : >> - starting a lot of virtual machine with qemu-kvm ( ~ 40 ) ( lot of >> select i think) >> - do a lot of network tests with openvswitch >> >> I can test a kernel 3.4.x before and after a commit id (?) to find a >> regression.
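The flood of "alloc_fd: slot N not NULL!" warnings above is easier to read once it is tallied per slot. The following is only a minimal sketch, assuming the log has been saved as plain text; `tally_slots` is a hypothetical helper written for illustration and is not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches the warning text exactly as it appears in the dump above.
SLOT_RE = re.compile(r"alloc_fd: slot (\d+) not NULL!")


def tally_slots(log_text: str) -> Counter:
    """Count how many times each fd slot number is reported."""
    return Counter(int(m.group(1)) for m in SLOT_RE.finditer(log_text))


if __name__ == "__main__":
    # Short illustrative excerpt, not the full dump.
    sample = (
        "alloc_fd: slot 71 not NULL! alloc_fd: slot 71 not NULL! "
        "VMtap: no IPv6 routers present alloc_fd: slot 121 not NULL!"
    )
    for slot, count in sorted(tally_slots(sample).items()):
        print(f"slot {slot}: {count}")
```

Unrelated lines such as "VMtap: no IPv6 routers present" are skipped because they do not match the pattern.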
Re: oops in kernel ( 3.4.x -> 3.5rc )
On Fri, Jul 20, 2012 at 10:52:40PM +0200, nicolas prochazka wrote:
> Hello
> the problem is occured with :
> - linux kernel 3.4.5i do not test with 3.4.0 / 1 / 2 / 3 / 4,
> but i can if you want
> - linux kernel 3.5rc6 rc7 / do not test with other rc.
>
> the problem is not occured with :
> linux kernel 3.3.4 / 3.3.8
>
> These servers are used for :
> - starting a lot of virtual machine with qemu-kvm ( ~ 40 ) ( lot of
> select i think)
> - do a lot of network tests with openvswitch
>
> I can test a kernel 3.4.x before and after a commit id (?) to find a
> regression.
>
> Regards,
> Nicolas.
>

Can you try this commit 1fd36adcd98c14d2fd97f545293c488775cb2823? And
the commit before it?

>
> 2012/7/20 Thadeu Lima de Souza Cascardo :
> > On Fri, Jul 20, 2012 at 09:21:53AM -0400, Dave Jones wrote:
> >> On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:
> >>
> >> > [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
> >>
> >> That '1' looks like a random bit flip. Try running memtest86.
> >>
> >
> > Looks more a 32-bit value of 1 followed by a 32-bit value of 0x2f. Most
> > likely a pointer to some other piece of a struct. However, taking a look
> > at fs/files.c code, nothing seems suspicious.
> >
> > Nicolas, it wasn't clear to me if you had problems with 3.4 too. There
> > has been some changes in fs/files.c on 3.4-rc1 in the piece of code
> > where you hit the problem.
> >
> > What does your system exercise? Any chance you are using a lot of
> > select, which has also been changed in those same patches to fs/files.c?
> >
> > Regards.
> > Cascardo.
> >
> >> > [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G D W
> >>
> >> This wasn't the first problem either.
> >>
> >> > [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
> >>
> >> Looks like the same flipped bit.
> >>
> >> Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
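Testing one commit and the commit before it is the last step of a bisection. The search itself can be sketched as a binary search over a linear list of commits. This is a simplified illustration only: real `git bisect` works on the commit graph rather than a flat list, and `first_bad` and `is_bad` are hypothetical names, with `is_bad` standing in for "build this kernel, boot it, and see whether the oops shows up".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Placeholder history: c0..c9, with the regression introduced at c6.
    history = [f"c{i}" for i in range(10)]
    print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

Each test halves the remaining range, which is why a few thousand candidate commits need only a dozen or so kernel builds.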
Re: oops in kernel ( 3.4.x -> 3.5rc )
Hello,
the problem occurs with :
- linux kernel 3.4.5 (I did not test 3.4.0 / 1 / 2 / 3 / 4, but I can if you want)
- linux kernel 3.5rc6 and rc7 (I did not test the other rc)

the problem does not occur with :
- linux kernel 3.3.4 / 3.3.8

These servers are used for :
- starting a lot of virtual machines with qemu-kvm ( ~ 40 ) (a lot of select, I think)
- doing a lot of network tests with openvswitch

I can test a 3.4.x kernel before and after a commit id (?) to find the regression.

Regards,
Nicolas.

2012/7/20 Thadeu Lima de Souza Cascardo :
> On Fri, Jul 20, 2012 at 09:21:53AM -0400, Dave Jones wrote:
>> On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:
>>
>> > [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
>>
>> That '1' looks like a random bit flip. Try running memtest86.
>>
>
> Looks more a 32-bit value of 1 followed by a 32-bit value of 0x2f. Most
> likely a pointer to some other piece of a struct. However, taking a look
> at fs/files.c code, nothing seems suspicious.
>
> Nicolas, it wasn't clear to me if you had problems with 3.4 too. There
> has been some changes in fs/files.c on 3.4-rc1 in the piece of code
> where you hit the problem.
>
> What does your system exercise? Any chance you are using a lot of
> select, which has also been changed in those same patches to fs/files.c?
>
> Regards.
> Cascardo.
>
>> > [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G D W
>>
>> This wasn't the first problem either.
>>
>> > [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
>>
>> Looks like the same flipped bit.
>>
>> Dave
Re: oops in kernel ( 3.4.x -> 3.5rc )
On Fri, Jul 20, 2012 at 09:21:53AM -0400, Dave Jones wrote:
> On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:
>
> > [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
>
> That '1' looks like a random bit flip. Try running memtest86.
>

Looks more like a 32-bit value of 1 followed by a 32-bit value of 0x2f. Most
likely a pointer to some other piece of a struct. However, taking a look
at the fs/files.c code, nothing seems suspicious.

Nicolas, it wasn't clear to me if you had problems with 3.4 too. There
have been some changes in fs/files.c in 3.4-rc1 in the piece of code
where you hit the problem.

What does your system exercise? Any chance you are using a lot of
select, which has also been changed in those same patches to fs/files.c?

Regards.
Cascardo.

> > [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G D W
>
> This wasn't the first problem either.
>
> > [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
>
> Looks like the same flipped bit.
>
> Dave
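The two readings of the faulting address (a single flipped bit versus two adjacent 32-bit fields) can be checked with a little arithmetic. The sketch below is illustrative only. The archive shows the addresses as 0001002f and 0001003b, so the full 64-bit values used here are an assumption chosen to match the "32-bit value of 1 followed by a 32-bit value of 0x2f" description. `split_halves` and `differing_bits` are hypothetical helper names.

```python
from typing import Tuple

# ASSUMPTION: full 64-bit forms of the addresses quoted in the thread as
# 0001002f and 0001003b; the archived text does not show all the digits.
ADDR_DUP_FD = 0x000000010000002F
ADDR_REVALIDATE = 0x000000010000003B


def split_halves(value: int) -> Tuple[int, int]:
    """Split a 64-bit value into (high 32 bits, low 32 bits)."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


def differing_bits(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    for addr in (ADDR_DUP_FD, ADDR_REVALIDATE):
        high, low = split_halves(addr)
        # Two-field reading: one 32-bit value next to another.
        print(f"{addr:#018x}: high={high:#x} low={low:#x}")
        # Bit-flip reading: distance from the same address without the '1'.
        print(f"  bits differing from {low:#x}: {differing_bits(addr, low)}")
```

Under this assumption both readings fit the numbers, so the arithmetic alone cannot tell them apart.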
Re: oops in kernel ( 3.4.x -> 3.5rc )
Hello,
I can reproduce this problem on five different servers. I can try memtest86.

Regards,
Nicolas Prochazka.

complete dump :

[ 596.322369] BUG: unable to handle kernel paging request at 0001003b
[ 596.322622] IP: [811a3654] tid_fd_revalidate+0x84/0x1a0
[ 596.322828] PGD 7d6c20067 PUD 0
[ 596.322972] Oops: [#1] SMP
[ 596.323115] CPU 3
[ 596.323181] Modules linked in: kvm_intel kvm
[ 596.323422]
[ 596.323435] Pid: 28353, comm: netstat Not tainted 3.5.0-rc7 #4 Dell Inc. PowerEdge M600/0MY736
[ 596.323745] RIP: 0010:[811a3654] [811a3654] tid_fd_revalidate+0x84/0x1a0
[ 596.329940] RSP: 0018:880658ab9d78 EFLAGS: 00010206
[ 596.330010] RAX: 8807f1195340 RBX: 8807d91bdd20 RCX: 007d
[ 596.330010] RDX: RSI: RDI: 8807f1195340
[ 596.330010] RBP: 880658ab9d98 R08: 88083fcd6b30 R09: 8119fef0
[ 596.330010] R10: R11: 0202 R12: 8805eea4d480
[ 596.330010] R13: 8807d6f06000 R14: 8807f0fa1038 R15: 880658ab9e08
[ 596.330010] FS: 7f3414807700() GS:88083fcc() knlGS:
[ 596.330010] CS: 0010 DS: ES: CR0: 80050033
[ 596.330010] CR2: 0001003b CR3: 00069bd85000 CR4: 07e0
[ 596.330010] DR0: 0001 DR1: 0002 DR2: 0001
[ 596.330010] DR3: 000a DR6: 0ff0 DR7: 0400
[ 596.330010] Process netstat (pid: 28353, threadinfo 880658ab8000, task 88074a897000)
[ 596.330010] Stack:
[ 596.330010] 8805eea4d480 0007 8807d91bdd20 8807dbbbac00
[ 596.330010] 880658ab9dc8 811a3880 ff0a0210 0001
[ 596.330010] 880658ab9e98 880593e3ac00 880658ab9e48 811a4cd6
[ 596.330010] Call Trace:
[ 596.330010] [811a3880] proc_fd_instantiate+0x80/0xa0
[ 596.330010] [811a4cd6] proc_fill_cache+0x126/0x150
[ 596.330010] [811a3800] ? proc_fdinfo_instantiate+0x90/0x90
[ 596.330010] [811505a0] ? filldir64+0xe0/0xe0
[ 596.330010] [811a5006] proc_readfd_common+0xf6/0x1c0
[ 596.330010] [811a3800] ? proc_fdinfo_instantiate+0x90/0x90
[ 596.330010] [811505a0] ? filldir64+0xe0/0xe0
[ 596.330010] [811a5105] proc_readfd+0x15/0x20
[ 596.330010] [811507c0] vfs_readdir+0xa0/0xc0
[ 596.330010] [811505a0] ? filldir64+0xe0/0xe0
[ 596.330010] [8115096d] sys_getdents+0x8d/0x100
[ 596.330010] [81ae9c29] system_call_fastpath+0x16/0x1b
[ 596.531217] Code: b8 00 00 00 48 8b 50 08 44 3b 32 0f 83 9e 00 00 00 45 89 f6 49 c1 e6 03 4c 03 72 08 49 8b 16 48 85 d2 0f 84 87 00 00 00 48 89 c7 <44> 8b 62 3c e8 13 29 ea ff 4c 89 ef e8 4b df ff ff 85 c0 0f 84
[ 596.531217] RIP [811a3654] tid_fd_revalidate+0x84/0x1a0
[ 596.531217] RSP 880658ab9d78
[ 596.531217] CR2: 0001003b
[ 596.533373] ---[ end trace 12628ad63724505a ]---
[ 620.908188] device vmEtap5 entered promiscuous mode
[ 632.625058] device vmEtap22 entered promiscuous mode
[ 637.628184] device vmEtap4 entered promiscuous mode
[ 647.651842] device vmEtap6 entered promiscuous mode
[ 869.373622] device vmEtap7 entered promiscuous mode
[ 879.418886] device vmEtap8 entered promiscuous mode
[ 884.422364] device vmEtap9 entered promiscuous mode
[ 889.487014] device vmEtap10 entered promiscuous mode
[ 898.926970] device vmEtap11 entered promiscuous mode
[ 902.600030] hrtimer: interrupt took 23049 ns
[ 909.244532] device vmEtap12 entered promiscuous mode
[ 919.208239] device vmEtap13 entered promiscuous mode
[ 929.798012] device vmEtap14 entered promiscuous mode
[ 939.575998] device vmEtap15 entered promiscuous mode
[ 949.673050] device vmEtap16 entered promiscuous mode
[ 959.879484] device vmEtap17 entered promiscuous mode
[ 970.117849] device vmEtap18 entered promiscuous mode
[ 980.157065] device vmEtap19 entered promiscuous mode
[ 990.493721] device vmEtap20 entered promiscuous mode
[ 1000.683323] device vmEtap21 entered promiscuous mode
[ 1010.820146] device vmEtap23 entered promiscuous mode
[ 1179.360788] device vmEtap4 left promiscuous mode
[ 1179.801638] device vmEtap5 left promiscuous mode
[ 1180.297567] device vmEtap6 left promiscuous mode
[ 1180.774054] device vmEtap7 left promiscuous mode
[ 1181.170919] device vmEtap8 left promiscuous mode
[ 1181.631908] device vmEtap9 left promiscuous mode
[ 1182.116042] device vmEtap10 left promiscuous mode
[ 1182.511330] device vmEtap11 left promiscuous mode
[ 1182.929594] device vmEtap12 left promiscuous mode
[ 1183.329183] device vmEtap13 left promiscuous mode
[ 1183.720130] device vmEtap14 left promiscuous mode
[ 1184.288507] device vmEtap15 left promiscuous mode
[ 1184.679455] device vmEtap16 left promiscuous mode
[ 1185.045020] device vmEtap17 left promiscuous mode
[ 1185.410966] device vmEtap18 left promiscuous mode
[ 1185.685902] BUG: unable to handle kernel paging request at 0001003b
[ 1185.690492] IP: [] tid_fd_revalidate+0x84/0x1a0
[ 1185.690492] PGD 4d103d067 PUD 0
[ 1185.690492] Oops: [#2] SMP
[ 1185.690492] CPU 2 Modules linked in: kvm_intel kvm
[ 1185.690492] [
Re: oops in kernel ( 3.4.x -> 3.5rc )
On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:

> [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f

That '1' looks like a random bit flip. Try running memtest86.

> [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G D W

This wasn't the first problem either.

> [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b

Looks like the same flipped bit.

Dave