Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-08-08 Thread nicolas prochazka
Hello
The latest git master checkout seems to fix this problem.
Thanks
Nicolas Prochazka.

2012/7/28 nicolas prochazka :
> hello again,
> git bisect gives (after 13 steps):
> 58bca4a8fa90fcf9069379653b396b2cec642f7f is the first bad commit
>
> Regards,
> Nicolas Prochazka
>
> 2012/7/24 Thadeu Lima de Souza Cascardo :
>> On Mon, Jul 23, 2012 at 11:15:09PM +0200, nicolas prochazka wrote:
>>> Hello,
>>> I'm trying different versions, with different results:
>>>
>>> - commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215: bug is not reproducible
>>> - 3.5-rc7: bug is reproducible (cf. new dump)
>>> - master branch: bug is reproducible
>>>
>>>
>>>
>>> Regards,
>>> Nicolas Prochazka
>>>
>>>
>>
>> Hi, Nicolas.
>>
>> Can you try a bisect? I am also copying David Howells and Bobby Powers,
>> even though their set of patches don't seem to be the culprit.
>>
>> Regards.
>> Cascardo.
>>
>>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging
>>> request at 0001002f
>>> Jul 23 21:10:05 positronic18696 IP: [] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
>>> Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
>>> Jul 23 21:10:05 positronic18696 CPU 1
>>> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
>>> Jul 23 21:10:05 positronic18696
>>> Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted
>>> 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
>>> Jul 23 21:10:05 positronic18696 RIP: 0010:[]
>>> [] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90  EFLAGS: 00010206
>>> Jul 23 21:10:05 positronic18696 RAX: 0038 RBX:
>>> 8807ed95eec0 RCX: 0007
>>> Jul 23 21:10:05 positronic18696 RDX:  RSI:
>>> 0800 RDI: 8805a4d01d40
>>> Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08:
>>> 0020 R09: 81156694
>>> Jul 23 21:10:05 positronic18696 R10: 0001 R11:
>>>  R12: 8807ecf25000
>>> Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14:
>>> 0100 R15: 8807d3a61800
>>> Jul 23 21:10:05 positronic18696 FS:  7f1a8e719700()
>>> GS:88083fc4() knlGS:
>>> Jul 23 21:10:05 positronic18696 CS:  0010 DS:  ES:  CR0:
>>> 8005003b
>>> Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3:
>>> 00066ef9 CR4: 27e0
>>> Jul 23 21:10:05 positronic18696 DR0: 0001 DR1:
>>> 0002 DR2: 0001
>>> Jul 23 21:10:05 positronic18696 DR3: 000a DR6:
>>> 0ff0 DR7: 0400
>>> Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596,
>>> threadinfo 880669ebc000, task 8806625ab000)
>>> Jul 23 21:10:05 positronic18696 Stack:
>>> Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49
>>> 0020 8806627ee8c0
>>> Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680
>>> 7f1a8e7199d0 8807bf41d000
>>> Jul 23 21:10:05 positronic18696  01200011
>>> 7f1a8e7199d0 
>>> Jul 23 21:10:05 positronic18696 Call Trace:
>>> Jul 23 21:10:05 positronic18696 [] 
>>> copy_process+0x931/0x13c0
>>> Jul 23 21:10:05 positronic18696 [] do_fork+0x54/0x360
>>> Jul 23 21:10:05 positronic18696 [] ? 
>>> _raw_spin_lock+0xe/0x20
>>> Jul 23 21:10:05 positronic18696 [] ?
>>> __set_task_blocked+0x37/0x80
>>> Jul 23 21:10:05 positronic18696 [] ?
>>> __set_current_blocked+0x53/0x70
>>> Jul 23 21:10:05 positronic18696 [] sys_clone+0x28/0x30
>>> Jul 23 21:10:05 positronic18696 [] stub_clone+0x13/0x20
>>> Jul 23 21:10:05 positronic18696 [] ?
>>> system_call_fastpath+0x16/0x1b
>>> Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10
>>> 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5
>>> 08 00 00 00 31 c0 eb 15 90  48 ff 42 30 49 89 14 04 ff c1 48 83 c0
>>> 08 48 39 f0 74 24 49
>>> Jul 23 21:10:05 positronic18696 RIP  [] dup_fd+0x160/0x2e0
>>> Jul 23 21:10:05 positronic18696 RSP 
>>> Jul 23 21:10:05 positronic18696 CR2: 0001002f
>>> Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d

Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-07-28 Thread nicolas prochazka
Hello again,
git bisect gives (after 13 steps):
58bca4a8fa90fcf9069379653b396b2cec642f7f is the first bad commit

Regards,
Nicolas Prochazka
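
For context on the "13 steps" above: a bisection halves the suspect range with every kernel that is built and tested, so 13 steps correspond to a range of several thousand commits. Below is a minimal sketch of the idea in Python. It assumes a linear history with a known-good first entry and a known-bad last entry; real git bisect works on a commit graph and chooses its midpoints differently. The names first_bad, commits and is_bad, and the made-up history, are hypothetical and not taken from this thread.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumes commits[0] is known good, commits[-1] is known bad,
    and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1                # one build-and-test cycle
        if is_bad(commits[mid]):
            hi = mid              # culprit is at mid or earlier
        else:
            lo = mid              # culprit is after mid
    return commits[hi], steps


# Hypothetical history: 8193 commits, bug introduced by commit number 5000.
commits = [f"c{i}" for i in range(8193)]
culprit, steps = first_bad(commits, lambda c: int(c[1:]) >= 5000)
print(culprit, steps)
```

With 8192 commits between the good and the bad end, the sketch needs exactly 13 tests, which matches the order of magnitude of a bisect across one kernel release cycle.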

2012/7/24 Thadeu Lima de Souza Cascardo :
> On Mon, Jul 23, 2012 at 11:15:09PM +0200, nicolas prochazka wrote:
>> Hello,
>> I'm trying different versions, with different results:
>>
>> - commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215: bug is not reproducible
>> - 3.5-rc7: bug is reproducible (cf. new dump)
>> - master branch: bug is reproducible
>>
>>
>>
>> Regards,
>> Nicolas Prochazka
>>
>>
>
> Hi, Nicolas.
>
> Can you try a bisect? I am also copying David Howells and Bobby Powers,
> even though their set of patches don't seem to be the culprit.
>
> Regards.
> Cascardo.
>
>> Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging
>> request at 0001002f
>> Jul 23 21:10:05 positronic18696 IP: [] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
>> Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
>> Jul 23 21:10:05 positronic18696 CPU 1
>> Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
>> Jul 23 21:10:05 positronic18696
>> Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted
>> 3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
>> Jul 23 21:10:05 positronic18696 RIP: 0010:[]
>> [] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90  EFLAGS: 00010206
>> Jul 23 21:10:05 positronic18696 RAX: 0038 RBX:
>> 8807ed95eec0 RCX: 0007
>> Jul 23 21:10:05 positronic18696 RDX:  RSI:
>> 0800 RDI: 8805a4d01d40
>> Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08:
>> 0020 R09: 81156694
>> Jul 23 21:10:05 positronic18696 R10: 0001 R11:
>>  R12: 8807ecf25000
>> Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14:
>> 0100 R15: 8807d3a61800
>> Jul 23 21:10:05 positronic18696 FS:  7f1a8e719700()
>> GS:88083fc4() knlGS:
>> Jul 23 21:10:05 positronic18696 CS:  0010 DS:  ES:  CR0:
>> 8005003b
>> Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3:
>> 00066ef9 CR4: 27e0
>> Jul 23 21:10:05 positronic18696 DR0: 0001 DR1:
>> 0002 DR2: 0001
>> Jul 23 21:10:05 positronic18696 DR3: 000a DR6:
>> 0ff0 DR7: 0400
>> Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596,
>> threadinfo 880669ebc000, task 8806625ab000)
>> Jul 23 21:10:05 positronic18696 Stack:
>> Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49
>> 0020 8806627ee8c0
>> Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680
>> 7f1a8e7199d0 8807bf41d000
>> Jul 23 21:10:05 positronic18696  01200011
>> 7f1a8e7199d0 
>> Jul 23 21:10:05 positronic18696 Call Trace:
>> Jul 23 21:10:05 positronic18696 [] 
>> copy_process+0x931/0x13c0
>> Jul 23 21:10:05 positronic18696 [] do_fork+0x54/0x360
>> Jul 23 21:10:05 positronic18696 [] ? 
>> _raw_spin_lock+0xe/0x20
>> Jul 23 21:10:05 positronic18696 [] ?
>> __set_task_blocked+0x37/0x80
>> Jul 23 21:10:05 positronic18696 [] ?
>> __set_current_blocked+0x53/0x70
>> Jul 23 21:10:05 positronic18696 [] sys_clone+0x28/0x30
>> Jul 23 21:10:05 positronic18696 [] stub_clone+0x13/0x20
>> Jul 23 21:10:05 positronic18696 [] ?
>> system_call_fastpath+0x16/0x1b
>> Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10
>> 4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5
>> 08 00 00 00 31 c0 eb 15 90  48 ff 42 30 49 89 14 04 ff c1 48 83 c0
>> 08 48 39 f0 74 24 49
>> Jul 23 21:10:05 positronic18696 RIP  [] dup_fd+0x160/0x2e0
>> Jul 23 21:10:05 positronic18696 RSP 
>> Jul 23 21:10:05 positronic18696 CR2: 0001002f
>> Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
>> Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
>> Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
>> Jul 23 21:10:05 posit

Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-07-23 Thread nicolas prochazka
Hello,
I'm trying different versions, with different results:

- commit f044db4cb4bf16893812d35b5fbeaaf3e30c9215: bug is not reproducible
- 3.5-rc7: bug is reproducible (cf. new dump)
- master branch: bug is reproducible



Regards,
Nicolas Prochazka


Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging
request at 0001002f
Jul 23 21:10:05 positronic18696 IP: [] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 PGD 6627ea067 PUD 0
Jul 23 21:10:05 positronic18696 Oops: 0002 [#1] SMP
Jul 23 21:10:05 positronic18696 CPU 1
Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
Jul 23 21:10:05 positronic18696
Jul 23 21:10:05 positronic18696 Pid: 17596, comm: queue.sh Not tainted
3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
Jul 23 21:10:05 positronic18696 RIP: 0010:[]
[] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 RSP: 0018:880669ebdd90  EFLAGS: 00010206
Jul 23 21:10:05 positronic18696 RAX: 0038 RBX:
8807ed95eec0 RCX: 0007
Jul 23 21:10:05 positronic18696 RDX:  RSI:
0800 RDI: 8805a4d01d40
Jul 23 21:10:05 positronic18696 RBP: 880669ebddf0 R08:
0020 R09: 81156694
Jul 23 21:10:05 positronic18696 R10: 0001 R11:
 R12: 8807ecf25000
Jul 23 21:10:05 positronic18696 R13: 8805a4d01d80 R14:
0100 R15: 8807d3a61800
Jul 23 21:10:05 positronic18696 FS:  7f1a8e719700()
GS:88083fc4() knlGS:
Jul 23 21:10:05 positronic18696 CS:  0010 DS:  ES:  CR0:
8005003b
Jul 23 21:10:05 positronic18696 CR2: 0001002f CR3:
00066ef9 CR4: 27e0
Jul 23 21:10:05 positronic18696 DR0: 0001 DR1:
0002 DR2: 0001
Jul 23 21:10:05 positronic18696 DR3: 000a DR6:
0ff0 DR7: 0400
Jul 23 21:10:05 positronic18696 Process queue.sh (pid: 17596,
threadinfo 880669ebc000, task 8806625ab000)
Jul 23 21:10:05 positronic18696 Stack:
Jul 23 21:10:05 positronic18696 880669ebdda0 00018102db49
0020 8806627ee8c0
Jul 23 21:10:05 positronic18696 8807eefc1608 8807eefc1680
7f1a8e7199d0 8807bf41d000
Jul 23 21:10:05 positronic18696  01200011
7f1a8e7199d0 
Jul 23 21:10:05 positronic18696 Call Trace:
Jul 23 21:10:05 positronic18696 [] copy_process+0x931/0x13c0
Jul 23 21:10:05 positronic18696 [] do_fork+0x54/0x360
Jul 23 21:10:05 positronic18696 [] ? _raw_spin_lock+0xe/0x20
Jul 23 21:10:05 positronic18696 [] ?
__set_task_blocked+0x37/0x80
Jul 23 21:10:05 positronic18696 [] ?
__set_current_blocked+0x53/0x70
Jul 23 21:10:05 positronic18696 [] sys_clone+0x28/0x30
Jul 23 21:10:05 positronic18696 [] stub_clone+0x13/0x20
Jul 23 21:10:05 positronic18696 [] ?
system_call_fastpath+0x16/0x1b
Jul 23 21:10:05 positronic18696 Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10
4c 89 c2 e8 08 82 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5
08 00 00 00 31 c0 eb 15 90  48 ff 42 30 49 89 14 04 ff c1 48 83 c0
08 48 39 f0 74 24 49
Jul 23 21:10:05 positronic18696 RIP  [] dup_fd+0x160/0x2e0
Jul 23 21:10:05 positronic18696 RSP 
Jul 23 21:10:05 positronic18696 CR2: 0001002f
Jul 23 21:10:05 positronic18696 ---[ end trace ccf5b66c39d92756 ]---
Jul 23 21:10:05 positronic18696 device vmtap35 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap35 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap36 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap36 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap37 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap37 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmtap38 left promiscuous mode
Jul 23 21:10:05 positronic18696 device vmEtap38 left promiscuous mode
Jul 23 21:10:05 positronic18696 BUG: unable to handle kernel paging
request at 0001003b
Jul 23 21:10:05 positronic18696 IP: []
tid_fd_revalidate+0x84/0x1a0
Jul 23 21:10:05 positronic18696 PGD 598c9e067 PUD 0
Jul 23 21:10:05 positronic18696 Oops:  [#2] SMP
Jul 23 21:10:05 positronic18696 CPU 0
Jul 23 21:10:05 positronic18696 Modules linked in: kvm_intel kvm
Jul 23 21:10:05 positronic18696
Jul 23 21:10:05 positronic18696 Pid: 21815, comm: netstat Tainted: G
   D  3.5.0-rc7-dirty #6 Dell Inc. PowerEdge M600/0MY736
Jul 23 21:10:05 positronic18696 RIP: 0010:[]
[] tid_fd_revalidate+0x84/0x1a0
Jul 23 21:10:05 positronic18696 RSP: 0018:88053f965d78  EFLAGS: 00010206
Jul 23 21:10:05 positronic18696 RAX: 8807ee3bf700 RBX:
8807ed1d9118 RCX: 007e
Jul 23 21:10:05 positronic18696 RDX:  RSI:
 RDI: 8807ee3bf700
Jul 23 21:10:05 positronic18696 RBP: 88053f965d98 R08:
88083fc16a10 R09: 8119bfe0
Jul 23 21:10:05 positronic18696 R10:  R11:
0206 R12: 8807d8afaa80
Jul 23 21:10:05
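
For readers less used to oops output: an entry such as dup_fd+0x160/0x2e0 names the function, the offset of the instruction inside it, and the total size of the function, in that order. The sketch below pulls those three fields out of a log line. FRAME_RE and parse_frame are hypothetical helper names, not part of any kernel tool.

```python
import re

# symbol+0xOFFSET/0xSIZE, as printed in the IP/RIP and Call Trace lines
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(line):
    """Return (symbol, offset, size) for the first frame in line, or None."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


print(parse_frame("Jul 23 21:10:05 positronic18696 IP: [] dup_fd+0x160/0x2e0"))
print(parse_frame("Jul 23 21:10:05 positronic18696 [] ? _raw_spin_lock+0xe/0x20"))
```

Both oopses in this dump can be summarised this way: the first faults 0x160 bytes into dup_fd, the second 0x84 bytes into tid_fd_revalidate.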

Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-07-20 Thread nicolas prochazka
Well done
1fd36adcd98c14d2fd97f545293c488775cb2823: the bug occurs (cf. dump)
1dce27c5aa6770e9d195f2bb7db1db3d4dde5591: the bug does not occur

Regards,
Nicolas Prochazka.

Dump for 1fd36adcd98c14d2fd97f545293c488775cb2823:
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
VMtap: no IPv6 routers present
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 96 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 110 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
brE: no IPv6 routers present
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 70 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 102 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 106 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 106 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
[ cut here ]
kernel BUG at fs/open.c:873!
invalid opcode:  [#1] SMP
CPU 0
Modules linked in: kvm_intel kvm

Then the usual "BUG: unable to handle kernel paging request" follows.


2012/7/20 Thadeu Lima de Souza Cascardo :
> On Fri, Jul 20, 2012 at 10:52:40PM +0200, nicolas prochazka wrote:
>> Hello,
>> the problem occurs with:
>> - Linux kernel 3.4.5 (I have not tested 3.4.0 / 1 / 2 / 3 / 4,
>> but I can if you want)
>> - Linux kernel 3.5-rc6 and -rc7 (I have not tested other rcs)
>>
>> The problem does not occur with:
>> - Linux kernel 3.3.4 / 3.3.8
>>
>> These servers are used for:
>> - starting a lot of virtual machines with qemu-kvm (~40)
>> (a lot of select, I think)
>> - running a lot of network tests with openvswitch
>>

Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
the problem occurs with:
- Linux kernel 3.4.5 (I have not tested 3.4.0 / 1 / 2 / 3 / 4,
but I can if you want)
- Linux kernel 3.5-rc6 and -rc7 (I have not tested other rcs)

The problem does not occur with:
- Linux kernel 3.3.4 / 3.3.8

These servers are used for:
- starting a lot of virtual machines with qemu-kvm (~40)
(a lot of select, I think)
- running a lot of network tests with openvswitch

I can test a kernel 3.4.x before and after a commit id (?) to find a regression.

Regards,
Nicolas.


2012/7/20 Thadeu Lima de Souza Cascardo :
> On Fri, Jul 20, 2012 at 09:21:53AM -0400, Dave Jones wrote:
>> On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:
>>
>>  > [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
>>
>> That '1' looks like a random bit flip. Try running memtest86.
>>
>
> Looks more like a 32-bit value of 1 followed by a 32-bit value of 0x2f. Most
> likely a pointer to some other piece of a struct. However, taking a look
> at the fs/files.c code, nothing seems suspicious.
>
> Nicolas, it wasn't clear to me if you had problems with 3.4 too. There
> have been some changes in fs/files.c in 3.4-rc1 in the piece of code
> where you hit the problem.
>
> What does your system exercise? Any chance you are using a lot of
> select, which has also been changed in those same patches to fs/files.c?
>
> Regards.
> Cascardo.
>
>
>>  > [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G  D W
>>
>> This wasn't the first problem either.
>>
>>  > [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
>>
>> Looks like the same flipped bit.
>>
>>   Dave
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
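
Note: the two readings of the faulting address in the exchange above (a single flipped bit versus a 32-bit 1 followed by a 32-bit 0x2f) can be compared numerically. The Python below is an illustration added for that purpose and is not part of the thread. It assumes the addresses printed as 0001002f and 0001003b were 0x000000010000002f and 0x000000010000003b before the archive dropped groups of zeros, which is what Cascardo's reading implies. The function names are hypothetical.

```python
def split_halves(addr):
    """Return the (high, low) 32-bit halves of a 64-bit value."""
    return (addr >> 32) & 0xFFFFFFFF, addr & 0xFFFFFFFF

def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]

# Assumed full-width forms of the two addresses reported in the thread.
FAULT_DUP_FD = 0x000000010000002F
FAULT_TID_FD = 0x000000010000003B

for addr in (FAULT_DUP_FD, FAULT_TID_FD):
    high, low = split_halves(addr)
    print(f"{addr:#018x}: high={high:#x} low={low:#x} bits={set_bits(addr)}")
```

Both addresses have bit 32 set and differ only in the low byte, so the numbers alone fit either reading and do not settle the question.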


Re: oops in kernel ( 3.4.x -> 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
I can reproduce this problem on five different servers.
I can try memtest86.
Regards,
Nicolas Prochazka.
Complete dump:

[  596.322369] BUG: unable to handle kernel paging request at 0001003b
[  596.322622] IP: [] tid_fd_revalidate+0x84/0x1a0
[  596.322828] PGD 7d6c20067 PUD 0
[  596.322972] Oops:  [#1] SMP
[  596.323115] CPU 3
[  596.323181] Modules linked in: kvm_intel kvm
[  596.323422]
[  596.323435] Pid: 28353, comm: netstat Not tainted 3.5.0-rc7 #4 Dell
Inc. PowerEdge M600/0MY736
[  596.323745] RIP: 0010:[]  []
tid_fd_revalidate+0x84/0x1a0
[  596.329940] RSP: 0018:880658ab9d78  EFLAGS: 00010206
[  596.330010] RAX: 8807f1195340 RBX: 8807d91bdd20 RCX: 007d
[  596.330010] RDX:  RSI:  RDI: 8807f1195340
[  596.330010] RBP: 880658ab9d98 R08: 88083fcd6b30 R09: 8119fef0
[  596.330010] R10:  R11: 0202 R12: 8805eea4d480
[  596.330010] R13: 8807d6f06000 R14: 8807f0fa1038 R15: 880658ab9e08
[  596.330010] FS:  7f3414807700() GS:88083fcc()
knlGS:
[  596.330010] CS:  0010 DS:  ES:  CR0: 80050033
[  596.330010] CR2: 0001003b CR3: 00069bd85000 CR4: 07e0
[  596.330010] DR0: 0001 DR1: 0002 DR2: 0001
[  596.330010] DR3: 000a DR6: 0ff0 DR7: 0400
[  596.330010] Process netstat (pid: 28353, threadinfo
880658ab8000, task 88074a897000)
[  596.330010] Stack:
[  596.330010]  8805eea4d480 0007 8807d91bdd20
8807dbbbac00
[  596.330010]  880658ab9dc8 811a3880 ff0a0210
0001
[  596.330010]  880658ab9e98 880593e3ac00 880658ab9e48
811a4cd6
[  596.330010] Call Trace:
[  596.330010]  [] proc_fd_instantiate+0x80/0xa0
[  596.330010]  [] proc_fill_cache+0x126/0x150
[  596.330010]  [] ? proc_fdinfo_instantiate+0x90/0x90
[  596.330010]  [] ? filldir64+0xe0/0xe0
[  596.330010]  [] proc_readfd_common+0xf6/0x1c0
[  596.330010]  [] ? proc_fdinfo_instantiate+0x90/0x90
[  596.330010]  [] ? filldir64+0xe0/0xe0
[  596.330010]  [] proc_readfd+0x15/0x20
[  596.330010]  [] vfs_readdir+0xa0/0xc0
[  596.330010]  [] ? filldir64+0xe0/0xe0
[  596.330010]  [] sys_getdents+0x8d/0x100
[  596.330010]  [] system_call_fastpath+0x16/0x1b
[  596.531217] Code: b8 00 00 00 48 8b 50 08 44 3b 32 0f 83 9e 00 00
00 45 89 f6 49 c1 e6 03 4c 03 72 08 49 8b 16 48 85 d2 0f 84 87 00 00
00 48 89 c7 <44> 8b 62 3c e8 13 29 ea ff 4c 89 ef e8 4b df ff ff 85 c0
0f 84
[  596.531217] RIP  [] tid_fd_revalidate+0x84/0x1a0
[  596.531217]  RSP 
[  596.531217] CR2: 0001003b
[  596.533373] ---[ end trace 12628ad63724505a ]---
[  620.908188] device vmEtap5 entered promiscuous mode
[  632.625058] device vmEtap22 entered promiscuous mode
[  637.628184] device vmEtap4 entered promiscuous mode
[  647.651842] device vmEtap6 entered promiscuous mode
[  869.373622] device vmEtap7 entered promiscuous mode
[  879.418886] device vmEtap8 entered promiscuous mode
[  884.422364] device vmEtap9 entered promiscuous mode
[  889.487014] device vmEtap10 entered promiscuous mode
[  898.926970] device vmEtap11 entered promiscuous mode
[  902.600030] hrtimer: interrupt took 23049 ns
[  909.244532] device vmEtap12 entered promiscuous mode
[  919.208239] device vmEtap13 entered promiscuous mode
[  929.798012] device vmEtap14 entered promiscuous mode
[  939.575998] device vmEtap15 entered promiscuous mode
[  949.673050] device vmEtap16 entered promiscuous mode
[  959.879484] device vmEtap17 entered promiscuous mode
[  970.117849] device vmEtap18 entered promiscuous mode
[  980.157065] device vmEtap19 entered promiscuous mode
[  990.493721] device vmEtap20 entered promiscuous mode
[ 1000.683323] device vmEtap21 entered promiscuous mode
[ 1010.820146] device vmEtap23 entered promiscuous mode
[ 1179.360788] device vmEtap4 left promiscuous mode
[ 1179.801638] device vmEtap5 left promiscuous mode
[ 1180.297567] device vmEtap6 left promiscuous mode
[ 1180.774054] device vmEtap7 left promiscuous mode
[ 1181.170919] device vmEtap8 left promiscuous mode
[ 1181.631908] device vmEtap9 left promiscuous mode
[ 1182.116042] device vmEtap10 left promiscuous mode
[ 1182.511330] device vmEtap11 left promiscuous mode
[ 1182.929594] device vmEtap12 left promiscuous mode
[ 1183.329183] device vmEtap13 left promiscuous mode
[ 1183.720130] device vmEtap14 left promiscuous mode
[ 1184.288507] device vmEtap15 left promiscuous mode
[ 1184.679455] device vmEtap16 left promiscuous mode
[ 1185.045020] device vmEtap17 left promiscuous mode
[ 1185.410966] device vmEtap18 left promiscuous mode
[ 1185.685902] BUG: unable to handle kernel paging request at 0001003b
[ 1185.690492] IP: [] tid_fd_revalidate+0x84/0x1a0
[ 1185.690492] PGD 4d103d067 PUD 0
[ 1185.690492] Oops:  [#2] SMP
[ 1185.690492] CPU 2 Modules linked in: kvm_intel kvm
[ 1185.690492]
[ 1185.

oops in kernel ( 3.4.x -> 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
Since I updated our servers from Linux kernel 3.3.8 to Linux kernel
3.4.6 or 3.5-rc7,
we have been seeing a lot of oopses and a high load on the systems.

Example : Linux positronic836 3.5.0-rc7 #4 SMP Fri Jul 20 11:47:12 UTC
2012 x86_64 Intel(R) Xeon(R) CPU E5345 @ 2.33GHz GenuineIntel
GNU/Linux

Regards,
Nicolas Prochazka.

[ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
[ 2384.900307] IP: [] dup_fd+0x160/0x2e0
[ 2384.905492] PGD 4e9bab067 PUD 0
[ 2384.910010] Oops: 0002 [#4] SMP
[ 2384.910010] CPU 6
[ 2384.910010] Modules linked in: kvm_intel kvm
[ 2384.910010]
[ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G  D W
3.5.0-rc7 #4 Dell Inc. PowerEdge M600/0MY736
[ 2384.910010] RIP: 0010:[]  []
dup_fd+0x160/0x2e0
[ 2384.910010] RSP: 0018:88049fef1d90  EFLAGS: 00010206
[ 2384.910010] RAX: 0038 RBX: 8806a0ac4580 RCX: 0007
[ 2384.910010] RDX:  RSI: 0800 RDI: 880658a85400
[ 2384.910010] RBP: 88049fef1df0 R08: 0020 R09: 81159fe4
[ 2384.910010] R10: ea0020495fd8 R11:  R12: 8807f1eaa000
[ 2384.970401] R13: 880658a85380 R14: 0100 R15: 8807d882
[ 2384.970401] FS:  7f169ffdb700() GS:88083fd8()
knlGS:
[ 2384.970401] CS:  0010 DS:  ES:  CR0: 8005003b
[ 2384.970401] CR2: 0001002f CR3: 00059a96e000 CR4: 27e0
[ 2384.970401] DR0: 0001 DR1: 0002 DR2: 0001
[ 2384.970401] DR3: 000a DR6: 0ff0 DR7: 0400
[ 2384.970401] Process queue.sh (pid: 23838, threadinfo
88049fef, task 8805d707a000)
[ 2384.970401] Stack:
[ 2384.970401]  88049fef1da0 00018102db49 0020
88045da607c0
[ 2384.970401]  8805321a7c88 8805321a7d00 7f169ffdb9d0
8804cd075000
[ 2384.970401]   01200011 7f169ffdb9d0

[ 2384.970401] Call Trace:
[ 2384.970401]  [] copy_process+0x93c/0x13d0
[ 2384.970401]  [] do_fork+0x54/0x360
[ 2384.970401]  [] ? _raw_spin_lock+0xe/0x20
[ 2384.970401]  [] ? __set_task_blocked+0x37/0x80
[ 2384.970401]  [] sys_clone+0x28/0x30
[ 2384.970401]  [] stub_clone+0x13/0x20
[ 2384.970401]  [] ? system_call_fastpath+0x16/0x1b
[ 2384.970401] Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 a8
9c 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31
c0 eb 15 90 <f0> 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74
24 49
[ 2384.970401] RIP  [] dup_fd+0x160/0x2e0
[ 2384.970401]  RSP 
[ 2384.970401] CR2: 0001002f
[ 2385.131572] ---[ end trace 12628ad63724505e ]---
[ 2385.550858] device vmEtap18 left promiscuous mode
[ 2385.953882] device vmEtap19 left promiscuous mode
[ 2386.318553] device vmEtap20 left promiscuous mode
[ 2386.714127] device vmEtap21 left promiscuous mode
[ 2387.131308] device vmEtap22 left promiscuous mode
[ 2387.498609] device vmEtap23 left promiscuous mode
[ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
[ 2397.885596] IP: [] tid_fd_revalidate+0x84/0x1a0
[ 2397.885804] PGD 4ce59d067 PUD 0
[ 2397.885950] Oops:  [#5] SMP
[ 2397.886097] CPU 3
[ 2397.886163] Modules linked in: kvm_intel kvm
[ 2397.886408]
[ 2397.886425] Pid: 25760, comm: netstat Tainted: G  D W
3.5.0-rc7 #4 Dell Inc. PowerEdge M600/0MY736
[ 2397.886747] RIP: 0010:[]  []
tid_fd_revalidate+0x84/0x1a0
[ 2397.887019] RSP: 0018:88040ca4bd78  EFLAGS: 00010206
[ 2397.887187] RAX: 8807f1f26ec0 RBX: 8807d918e928 RCX: 007d
[ 2397.887392] RDX:  RSI:  RDI: 8807f1f26ec0
[ 2397.887596] RBP: 88040ca4bd98 R08: 88083fcd6b30 R09: 8119fef0
[ 2397.887801] R10:  R11: 0202 R12: 8807d93cd3c0
[ 2397.891771] R13: 8807c1366000 R14: 8807cf615038 R15: 88040ca4be08
[ 2397.891771] FS:  7f029b1b5700() GS:88083fcc()
knlGS:
[ 2397.891771] CS:  0010 DS:  ES:  CR0: 80050033
[ 2397.891771] CR2: 0001003b CR3: 0004d0fd9000 CR4: 07e0
[ 2397.891771] DR0: 0001 DR1: 0002 DR2: 0001
[ 2397.891771] DR3: 000a DR6: 0ff0 DR7: 0400
[ 2397.891771] Process netstat (pid: 25760, threadinfo
88040ca4a000, task 88065897f000)
[ 2397.891771] Stack:
[ 2397.891771]  8807d93cd3c0 0007 8807d918e928
8807db8db240
[ 2397.891771]  88040ca4bdc8 811a3880 ff0a0210
0001
[ 2397.963460]  88040ca4be98 8807d6dcb600 88040ca4be48
811a4cd6
[ 2397.963460] Call Trace:
[ 2397.963460]  [] proc_fd_instantiate+0x80/0xa0
[ 2397.963460]  [] proc_fill_cache+0x126/0x150
[ 2397.963460]  [] ? proc_fdinfo_instantiate+0x90/0x90
[ 2397.963460]  [] ? filldir64+0xe0/0xe0
[ 2397.963460]  [] proc_readfd_common+0xf6/0x1c0
[ 2397.963460

oops in kernel ( 3.4.x - 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
Since I updated our servers from Linux kernel 3.3.8 to Linux kernel
3.4.6 or 3.5-rc7,
we have been seeing a lot of oopses and a high load on the systems.

Example : Linux positronic836 3.5.0-rc7 #4 SMP Fri Jul 20 11:47:12 UTC
2012 x86_64 Intel(R) Xeon(R) CPU E5345 @ 2.33GHz GenuineIntel
GNU/Linux

Regards,
Nicolas Prochazka.

[ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
[ 2384.900307] IP: [8115a250] dup_fd+0x160/0x2e0
[ 2384.905492] PGD 4e9bab067 PUD 0
[ 2384.910010] Oops: 0002 [#4] SMP
[ 2384.910010] CPU 6
[ 2384.910010] Modules linked in: kvm_intel kvm
[ 2384.910010]
[ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G  D W
3.5.0-rc7 #4 Dell Inc. PowerEdge M600/0MY736
[ 2384.910010] RIP: 0010:[8115a250]  [8115a250]
dup_fd+0x160/0x2e0
[ 2384.910010] RSP: 0018:88049fef1d90  EFLAGS: 00010206
[ 2384.910010] RAX: 0038 RBX: 8806a0ac4580 RCX: 0007
[ 2384.910010] RDX:  RSI: 0800 RDI: 880658a85400
[ 2384.910010] RBP: 88049fef1df0 R08: 0020 R09: 81159fe4
[ 2384.910010] R10: ea0020495fd8 R11:  R12: 8807f1eaa000
[ 2384.970401] R13: 880658a85380 R14: 0100 R15: 8807d882
[ 2384.970401] FS:  7f169ffdb700() GS:88083fd8()
knlGS:
[ 2384.970401] CS:  0010 DS:  ES:  CR0: 8005003b
[ 2384.970401] CR2: 0001002f CR3: 00059a96e000 CR4: 27e0
[ 2384.970401] DR0: 0001 DR1: 0002 DR2: 0001
[ 2384.970401] DR3: 000a DR6: 0ff0 DR7: 0400
[ 2384.970401] Process queue.sh (pid: 23838, threadinfo
88049fef, task 8805d707a000)
[ 2384.970401] Stack:
[ 2384.970401]  88049fef1da0 00018102db49 0020
88045da607c0
[ 2384.970401]  8805321a7c88 8805321a7d00 7f169ffdb9d0
8804cd075000
[ 2384.970401]   01200011 7f169ffdb9d0

[ 2384.970401] Call Trace:
[ 2384.970401]  [8104056c] copy_process+0x93c/0x13d0
[ 2384.970401]  [81041154] do_fork+0x54/0x360
[ 2384.970401]  [81ae8eae] ? _raw_spin_lock+0xe/0x20
[ 2384.970401]  [81055c67] ? __set_task_blocked+0x37/0x80
[ 2384.970401]  [8100c1b8] sys_clone+0x28/0x30
[ 2384.970401]  [81ae9eb3] stub_clone+0x13/0x20
[ 2384.970401]  [81ae9c29] ? system_call_fastpath+0x16/0x1b
[ 2384.970401] Code: 8b 45 b0 49 8b 7d 10 48 8b 71 10 4c 89 c2 e8 a8
9c 23 00 45 85 f6 74 54 41 8d 46 ff 31 c9 48 8d 34 c5 08 00 00 00 31
c0 eb 15 90 <f0> 48 ff 42 30 49 89 14 04 ff c1 48 83 c0 08 48 39 f0 74
24 49
[ 2384.970401] RIP  [8115a250] dup_fd+0x160/0x2e0
[ 2384.970401]  RSP 88049fef1d90
[ 2384.970401] CR2: 0001002f
[ 2385.131572] ---[ end trace 12628ad63724505e ]---
[ 2385.550858] device vmEtap18 left promiscuous mode
[ 2385.953882] device vmEtap19 left promiscuous mode
[ 2386.318553] device vmEtap20 left promiscuous mode
[ 2386.714127] device vmEtap21 left promiscuous mode
[ 2387.131308] device vmEtap22 left promiscuous mode
[ 2387.498609] device vmEtap23 left promiscuous mode
[ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
[ 2397.885596] IP: [811a3654] tid_fd_revalidate+0x84/0x1a0
[ 2397.885804] PGD 4ce59d067 PUD 0
[ 2397.885950] Oops:  [#5] SMP
[ 2397.886097] CPU 3
[ 2397.886163] Modules linked in: kvm_intel kvm
[ 2397.886408]
[ 2397.886425] Pid: 25760, comm: netstat Tainted: G  D W
3.5.0-rc7 #4 Dell Inc. PowerEdge M600/0MY736
[ 2397.886747] RIP: 0010:[811a3654]  [811a3654]
tid_fd_revalidate+0x84/0x1a0
[ 2397.887019] RSP: 0018:88040ca4bd78  EFLAGS: 00010206
[ 2397.887187] RAX: 8807f1f26ec0 RBX: 8807d918e928 RCX: 007d
[ 2397.887392] RDX:  RSI:  RDI: 8807f1f26ec0
[ 2397.887596] RBP: 88040ca4bd98 R08: 88083fcd6b30 R09: 8119fef0
[ 2397.887801] R10:  R11: 0202 R12: 8807d93cd3c0
[ 2397.891771] R13: 8807c1366000 R14: 8807cf615038 R15: 88040ca4be08
[ 2397.891771] FS:  7f029b1b5700() GS:88083fcc()
knlGS:
[ 2397.891771] CS:  0010 DS:  ES:  CR0: 80050033
[ 2397.891771] CR2: 0001003b CR3: 0004d0fd9000 CR4: 07e0
[ 2397.891771] DR0: 0001 DR1: 0002 DR2: 0001
[ 2397.891771] DR3: 000a DR6: 0ff0 DR7: 0400
[ 2397.891771] Process netstat (pid: 25760, threadinfo
88040ca4a000, task 88065897f000)
[ 2397.891771] Stack:
[ 2397.891771]  8807d93cd3c0 0007 8807d918e928
8807db8db240
[ 2397.891771]  88040ca4bdc8 811a3880 ff0a0210
0001
[ 2397.963460]  88040ca4be98 8807d6dcb600 88040ca4be48
811a4cd6
[ 2397.963460] Call Trace:
[ 2397.963460

Re: oops in kernel ( 3.4.x - 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
I can reproduce this problem on five different servers.
I can try memtest86.
Regards,
Nicolas Prochazka.
Complete dump:

[  596.322369] BUG: unable to handle kernel paging request at 0001003b
[  596.322622] IP: [811a3654] tid_fd_revalidate+0x84/0x1a0
[  596.322828] PGD 7d6c20067 PUD 0
[  596.322972] Oops:  [#1] SMP
[  596.323115] CPU 3
[  596.323181] Modules linked in: kvm_intel kvm
[  596.323422]
[  596.323435] Pid: 28353, comm: netstat Not tainted 3.5.0-rc7 #4 Dell
Inc. PowerEdge M600/0MY736
[  596.323745] RIP: 0010:[811a3654]  [811a3654]
tid_fd_revalidate+0x84/0x1a0
[  596.329940] RSP: 0018:880658ab9d78  EFLAGS: 00010206
[  596.330010] RAX: 8807f1195340 RBX: 8807d91bdd20 RCX: 007d
[  596.330010] RDX:  RSI:  RDI: 8807f1195340
[  596.330010] RBP: 880658ab9d98 R08: 88083fcd6b30 R09: 8119fef0
[  596.330010] R10:  R11: 0202 R12: 8805eea4d480
[  596.330010] R13: 8807d6f06000 R14: 8807f0fa1038 R15: 880658ab9e08
[  596.330010] FS:  7f3414807700() GS:88083fcc()
knlGS:
[  596.330010] CS:  0010 DS:  ES:  CR0: 80050033
[  596.330010] CR2: 0001003b CR3: 00069bd85000 CR4: 07e0
[  596.330010] DR0: 0001 DR1: 0002 DR2: 0001
[  596.330010] DR3: 000a DR6: 0ff0 DR7: 0400
[  596.330010] Process netstat (pid: 28353, threadinfo
880658ab8000, task 88074a897000)
[  596.330010] Stack:
[  596.330010]  8805eea4d480 0007 8807d91bdd20
8807dbbbac00
[  596.330010]  880658ab9dc8 811a3880 ff0a0210
0001
[  596.330010]  880658ab9e98 880593e3ac00 880658ab9e48
811a4cd6
[  596.330010] Call Trace:
[  596.330010]  [811a3880] proc_fd_instantiate+0x80/0xa0
[  596.330010]  [811a4cd6] proc_fill_cache+0x126/0x150
[  596.330010]  [811a3800] ? proc_fdinfo_instantiate+0x90/0x90
[  596.330010]  [811505a0] ? filldir64+0xe0/0xe0
[  596.330010]  [811a5006] proc_readfd_common+0xf6/0x1c0
[  596.330010]  [811a3800] ? proc_fdinfo_instantiate+0x90/0x90
[  596.330010]  [811505a0] ? filldir64+0xe0/0xe0
[  596.330010]  [811a5105] proc_readfd+0x15/0x20
[  596.330010]  [811507c0] vfs_readdir+0xa0/0xc0
[  596.330010]  [811505a0] ? filldir64+0xe0/0xe0
[  596.330010]  [8115096d] sys_getdents+0x8d/0x100
[  596.330010]  [81ae9c29] system_call_fastpath+0x16/0x1b
[  596.531217] Code: b8 00 00 00 48 8b 50 08 44 3b 32 0f 83 9e 00 00
00 45 89 f6 49 c1 e6 03 4c 03 72 08 49 8b 16 48 85 d2 0f 84 87 00 00
00 48 89 c7 <44> 8b 62 3c e8 13 29 ea ff 4c 89 ef e8 4b df ff ff 85 c0
0f 84
[  596.531217] RIP  [811a3654] tid_fd_revalidate+0x84/0x1a0
[  596.531217]  RSP 880658ab9d78
[  596.531217] CR2: 0001003b
[  596.533373] ---[ end trace 12628ad63724505a ]---
[  620.908188] device vmEtap5 entered promiscuous mode
[  632.625058] device vmEtap22 entered promiscuous mode
[  637.628184] device vmEtap4 entered promiscuous mode
[  647.651842] device vmEtap6 entered promiscuous mode
[  869.373622] device vmEtap7 entered promiscuous mode
[  879.418886] device vmEtap8 entered promiscuous mode
[  884.422364] device vmEtap9 entered promiscuous mode
[  889.487014] device vmEtap10 entered promiscuous mode
[  898.926970] device vmEtap11 entered promiscuous mode
[  902.600030] hrtimer: interrupt took 23049 ns
[  909.244532] device vmEtap12 entered promiscuous mode
[  919.208239] device vmEtap13 entered promiscuous mode
[  929.798012] device vmEtap14 entered promiscuous mode
[  939.575998] device vmEtap15 entered promiscuous mode
[  949.673050] device vmEtap16 entered promiscuous mode
[  959.879484] device vmEtap17 entered promiscuous mode
[  970.117849] device vmEtap18 entered promiscuous mode
[  980.157065] device vmEtap19 entered promiscuous mode
[  990.493721] device vmEtap20 entered promiscuous mode
[ 1000.683323] device vmEtap21 entered promiscuous mode
[ 1010.820146] device vmEtap23 entered promiscuous mode
[ 1179.360788] device vmEtap4 left promiscuous mode
[ 1179.801638] device vmEtap5 left promiscuous mode
[ 1180.297567] device vmEtap6 left promiscuous mode
[ 1180.774054] device vmEtap7 left promiscuous mode
[ 1181.170919] device vmEtap8 left promiscuous mode
[ 1181.631908] device vmEtap9 left promiscuous mode
[ 1182.116042] device vmEtap10 left promiscuous mode
[ 1182.511330] device vmEtap11 left promiscuous mode
[ 1182.929594] device vmEtap12 left promiscuous mode
[ 1183.329183] device vmEtap13 left promiscuous mode
[ 1183.720130] device vmEtap14 left promiscuous mode
[ 1184.288507] device vmEtap15 left promiscuous mode
[ 1184.679455] device vmEtap16 left promiscuous mode
[ 1185.045020] device vmEtap17 left promiscuous mode
[ 1185.410966] device vmEtap18 left promiscuous mode
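
Note: the Code: lines in the dumps above allow a rough cross-check of the two faulting addresses. The faulting instruction bytes are 44 8b 62 3c in tid_fd_revalidate and f0 48 ff 42 30 in dup_fd. Read as x86-64 instructions (my decoding, not something stated in the thread), these are a 32-bit load from 0x3c(%rdx) and a locked 64-bit increment at 0x30(%rdx). The sketch below subtracts each displacement from the faulting address to recover the base register. It again assumes the addresses were 0x000000010000003b and 0x000000010000002f before zeros were dropped; the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1

def base_register(fault_addr, displacement):
    """Return the base value implied by fault_addr = base + displacement."""
    return (fault_addr - displacement) & MASK64

# name -> (assumed full-width CR2, displacement read from the Code: bytes)
CASES = {
    "tid_fd_revalidate": (0x000000010000003B, 0x3C),  # 44 8b 62 3c
    "dup_fd": (0x000000010000002F, 0x30),             # f0 48 ff 42 30
}

for name, (cr2, disp) in CASES.items():
    print(f"{name}: base = {base_register(cr2, disp):#x}")
```

Under these assumptions both faults imply the same base value, 0xffffffff, which would suggest that the pointer read from the fd table was itself bogus rather than off by one bit. Treat this as a hypothesis to check against a disassembly of the actual kernel build.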

Re: oops in kernel ( 3.4.x - 3.5rc )

2012-07-20 Thread nicolas prochazka
Hello,
The problem occurs with:
- Linux kernel 3.4.5 (I did not test 3.4.0 / 1 / 2 / 3 / 4,
but I can if you want)
- Linux kernel 3.5-rc6 and -rc7 (I did not test other rcs)

The problem does not occur with:
Linux kernel 3.3.4 / 3.3.8

These servers are used for:
- starting a lot of virtual machines with qemu-kvm (~40) (a lot of
select calls, I think)
- running a lot of network tests with openvswitch

I can test a 3.4.x kernel before and after a given commit id to find the regression.

Regards,
Nicolas.


2012/7/20 Thadeu Lima de Souza Cascardo <casca...@linux.vnet.ibm.com>:
> On Fri, Jul 20, 2012 at 09:21:53AM -0400, Dave Jones wrote:
>> On Fri, Jul 20, 2012 at 11:56:06AM +0200, nicolas prochazka wrote:
>>
>>  > [ 2384.900061] BUG: unable to handle kernel paging request at 0001002f
>>
>> That '1' looks like a random bit flip. Try running memtest86.
>>
>
> Looks more like a 32-bit value of 1 followed by a 32-bit value of 0x2f. Most
> likely a pointer to some other piece of a struct. However, taking a look
> at the fs/files.c code, nothing seems suspicious.
>
> Nicolas, it wasn't clear to me if you had problems with 3.4 too. There
> have been some changes in fs/files.c in 3.4-rc1 in the piece of code
> where you hit the problem.
>
> What does your system exercise? Any chance you are using a lot of
> select, which has also been changed in those same patches to fs/files.c?
>
> Regards.
> Cascardo.
>
>
>>  > [ 2384.910010] Pid: 23838, comm: queue.sh Tainted: G  D W
>>
>> This wasn't the first problem either.
>>
>>  > [ 2397.885344] BUG: unable to handle kernel paging request at 0001003b
>>
>> Looks like the same flipped bit.
>>
>>   Dave
>>


Re: oops in kernel ( 3.4.x - 3.5rc )

2012-07-20 Thread nicolas prochazka
Well done.
1fd36adcd98c14d2fd97f545293c488775cb2823: the bug occurs (cf. dump)
1dce27c5aa6770e9d195f2bb7db1db3d4dde5591: the bug does not occur

Regards,
Nicolas Prochazka.

Dump for 1fd36adcd98c14d2fd97f545293c488775cb2823:
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
VMtap: no IPv6 routers present
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 71 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 96 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 110 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
brE: no IPv6 routers present
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 121 not NULL!
alloc_fd: slot 142 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 153 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 70 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 102 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 106 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 106 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 36 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 68 not NULL!
alloc_fd: slot 100 not NULL!
alloc_fd: slot 100 not NULL!
------------[ cut here ]------------
kernel BUG at fs/open.c:873!
invalid opcode:  [#1] SMP
CPU 0
Modules linked in: kvm_intel kvm

followed by the usual "BUG: unable to handle kernel paging request" oops
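
Note: a known-bad and a known-good commit, as reported at the top of this message, are the two endpoints a bisection needs. The Python below sketches the idea on a simplified, linear history; real kernel history is a graph and git bisect handles that, so this is an illustration of the search rather than of the tool. The function name and the is_bad callback (standing in for "build, boot, run the workload, look for the oops") are hypothetical.

```python
def first_bad(commits, is_bad):
    """Binary-search an ordered history for the first bad commit.

    commits is ordered oldest to newest; commits[0] is assumed known
    good and commits[-1] known bad. Returns (culprit, tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests_run = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests_run += 1
        if is_bad(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return commits[bad], tests_run

# Hypothetical history of 8000 commits with the regression at index 5000.
history = list(range(8000))
culprit, tests_run = first_bad(history, lambda commit: commit >= 5000)
print(culprit, tests_run)
```

Each test halves the remaining range, so a few thousand commits need only about a dozen builds.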


2012/7/20 Thadeu Lima de Souza Cascardo <casca...@linux.vnet.ibm.com>:
> On Fri, Jul 20, 2012 at 10:52:40PM +0200, nicolas prochazka wrote:
>> Hello,
>> The problem occurs with:
>> - Linux kernel 3.4.5 (I did not test 3.4.0 / 1 / 2 / 3 / 4,
>> but I can if you want)
>> - Linux kernel 3.5-rc6 and -rc7 (I did not test other rcs)
>>
>> The problem does not occur with:
>> Linux kernel 3.3.4 / 3.3.8
>>
>> These servers are used for:
>> - starting a lot of virtual machines with qemu-kvm (~40) (a lot of
>> select calls, I think)
>> - running a lot of network tests with openvswitch
>>
>> I can test a 3.4.x kernel before and after a given commit id to find the
>> regression