Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-14 Thread Paul E. McKenney
On Thu, Mar 14, 2013 at 04:00:54PM -0400, Dave Jones wrote: > On Sat, Mar 09, 2013 at 07:51:46AM -0800, Paul E. McKenney wrote: > > On Sat, Mar 09, 2013 at 04:01:41PM +0800, Li Zefan wrote: > > > > [ . . . ] > > > > > > This way, "ptr" is executed exactly once, and the check and the > > > >

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-14 Thread Dave Jones
On Sat, Mar 09, 2013 at 07:51:46AM -0800, Paul E. McKenney wrote: > On Sat, Mar 09, 2013 at 04:01:41PM +0800, Li Zefan wrote: > > [ . . . ] > > > > This way, "ptr" is executed exactly once, and the check and the > > > hlist_entry() are both using the same value. > > > > I just played wit

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-09 Thread Paul E. McKenney
On Sat, Mar 09, 2013 at 04:01:41PM +0800, Li Zefan wrote: [ . . . ] > >> hlist_first_rcu() doesn't have any side-effects, it doesn't modify the > >> list whatsoever, > >> so the only thing that can change is 'head'. Why is it allowed to change > >> if the list > >> is protected by RCU? > > > >

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-09 Thread Li Zefan
>> Looks like the hlist change is probably the issue, though it specifically >> uses: >> >> #define hlist_entry_safe(ptr, type, member) \ >> (ptr) ? hlist_entry(ptr, type, member) : NULL >> >> I'm still looking at the code in question and its assembly, but I c

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Paul E. McKenney
On Thu, Mar 07, 2013 at 01:14:10PM -0500, Sasha Levin wrote: > On 03/07/2013 01:05 PM, ebied...@xmission.com wrote: > > Sasha Levin writes: > > > >> On 03/07/2013 12:46 PM, Eric Dumazet wrote: > >>> On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > >>> > Looks like the hlist change is

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Sasha Levin
On 03/07/2013 01:21 PM, ebied...@xmission.com wrote: > Sasha Levin writes: > >> On 03/07/2013 01:05 PM, ebied...@xmission.com wrote: >>> Sasha Levin writes: >>> On 03/07/2013 12:46 PM, Eric Dumazet wrote: > On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > >> Looks like th

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Eric W. Biederman
Sasha Levin writes: > On 03/07/2013 01:05 PM, ebied...@xmission.com wrote: >> Sasha Levin writes: >> >>> On 03/07/2013 12:46 PM, Eric Dumazet wrote: On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > Looks like the hlist change is probably the issue, though it specifically >>

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Eric Dumazet
On Thu, 2013-03-07 at 13:14 -0500, Sasha Levin wrote: > Okay, I'm even more confused now. > > The expression in question is: > > hlist_entry_safe(rcu_dereference_bh(hlist_first_rcu(head))) > > You're saying that "rcu_dereference_bh(hlist_first_rcu(head))" can change > between > the two e
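
For readers following the thread, the double-evaluation problem can be reproduced in user space. The sketch below is not the kernel code: `struct hnode`, `struct item`, `container_of_demo`, `entry_unsafe`, `entry_safe` and `read_first` are hypothetical stand-ins, and `read_first()` fakes a concurrent update by returning a non-NULL node on the first read and NULL afterwards. The single-evaluation variant assumes GNU C statement expressions and `__typeof__` (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins, not the kernel definitions. */
struct hnode { struct hnode *next; };
struct item  { int value; struct hnode node; };

/* Map a pointer to an embedded member back to its containing struct. */
#define container_of_demo(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Same shape as the macro quoted in the thread: "ptr" is expanded twice,
 * once for the NULL check and once for the conversion. */
#define entry_unsafe(ptr, type, member) \
    ((ptr) ? container_of_demo(ptr, type, member) : NULL)

/* Single-evaluation variant: "ptr" is read once into a local, so the
 * check and the conversion use the same value. */
#define entry_safe(ptr, type, member) \
    ({ __typeof__(ptr) p__ = (ptr); \
       p__ ? container_of_demo(p__, type, member) : NULL; })

static struct item demo_item = { 42, { NULL } };
static int reads; /* how many times the "list head" was read */

/* Simulated racy read: non-NULL the first time, NULL on later reads. */
static struct hnode *read_first(void)
{
    return reads++ == 0 ? &demo_item.node : NULL;
}

/* Check sees a node, conversion sees NULL: result is a bogus non-NULL
 * pointer just below address zero. */
static struct item *lookup_unsafe(void)
{
    reads = 0;
    return entry_unsafe(read_first(), struct item, node);
}

/* One read only: result is either a valid entry or NULL. */
static struct item *lookup_safe(void)
{
    reads = 0;
    return entry_safe(read_first(), struct item, node);
}
```

With the unsafe form, the caller receives a pointer that passes a NULL check but equals NULL minus the member offset, so the first dereference faults at an address close to the top of the address space. With the single-evaluation form, the value that was checked is the value that gets converted.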

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Paul E. McKenney
On Thu, Mar 07, 2013 at 10:05:34AM -0800, Eric W. Biederman wrote: > Sasha Levin writes: > > > On 03/07/2013 12:46 PM, Eric Dumazet wrote: > >> On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > >> > >>> Looks like the hlist change is probably the issue, though it specifically > >>> uses: >

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Sasha Levin
On 03/07/2013 01:05 PM, ebied...@xmission.com wrote: > Sasha Levin writes: > >> On 03/07/2013 12:46 PM, Eric Dumazet wrote: >>> On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: >>> Looks like the hlist change is probably the issue, though it specifically uses: #define

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Eric W. Biederman
Sasha Levin writes: > On 03/07/2013 12:46 PM, Eric Dumazet wrote: >> On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: >> >>> Looks like the hlist change is probably the issue, though it specifically >>> uses: >>> >>> #define hlist_entry_safe(ptr, type, member) \ >>> (ptr) ?

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Paul E. McKenney
On Thu, Mar 07, 2013 at 12:50:47PM -0500, Sasha Levin wrote: > On 03/07/2013 12:46 PM, Eric Dumazet wrote: > > On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > > > >> Looks like the hlist change is probably the issue, though it specifically > >> uses: > >> > >>#define hlist_entry_safe(p

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Sasha Levin
On 03/07/2013 12:46 PM, Eric Dumazet wrote: > On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > >> Looks like the hlist change is probably the issue, though it specifically >> uses: >> >> #define hlist_entry_safe(ptr, type, member) \ >> (ptr) ? hlist_entry(ptr, type, member

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Eric Dumazet
On Thu, 2013-03-07 at 12:36 -0500, Sasha Levin wrote: > Looks like the hlist change is probably the issue, though it specifically > uses: > > #define hlist_entry_safe(ptr, type, member) \ > (ptr) ? hlist_entry(ptr, type, member) : NULL > > I'm still looking at the code in que

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Sasha Levin
On 03/07/2013 04:59 AM, ebied...@xmission.com wrote: > Li Zefan writes: > >> Cc: sasha.le...@oracle.com >> Cc: "Eric W. Biederman" >> Cc: container >> >> This is a second report... and the same address: 0xfff0 > > Actually this is the third report I have seen with that address, and

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Eric W. Biederman
Li Zefan writes: > Cc: sasha.le...@oracle.com > Cc: "Eric W. Biederman" > Cc: container > > This is a second report... and the same address: 0xfff0 Actually this is the third report I have seen with that address, and the others were on x86_64. The obvious answer is that there is s

Re: 3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread Li Zefan
Cc: sasha.le...@oracle.com Cc: "Eric W. Biederman" Cc: container This is a second report... and the same address: 0xfff0 On 2013/3/7 17:37, CAI Qian wrote: > Just came across this during LTP run on a ppc64 system. Still trying to > reproduce and possibly bisect, but want to give an

3.9-rc1 NULL pointer crash at find_pid_ns

2013-03-07 Thread CAI Qian
Just came across this during LTP run on a ppc64 system. Still trying to reproduce and possibly bisect, but want to give an early heads-up to see if anyone sees anything obvious. CAI Qian [ 6476.040024] Unable to handle kernel paging request for data at address 0xfff0 [ 6476.040051] Fa