subject:"panic with tcp timers"

Re: panic with tcp timers

2016-07-21 Thread Hans Petter Selasky


On 07/21/16 09:54, Julien Charbon wrote:


 Hi,

On 7/14/16 11:02 PM, Larry Rosenman wrote:

On 2016-07-14 12:01, Julien Charbon wrote:

On 6/20/16 11:55 AM, Julien Charbon wrote:

On 6/20/16 9:39 AM, Gleb Smirnoff wrote:

On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
J> > Comparing stable/10 and head, I see two changes that could
J> > affect that:
J> >
J> > - callout_async_drain
J> > - switch to READ lock for inp info in tcp timers
J> >
J> > That's why you are in To, Julien and Hans :)
J> >
J> > We continue investigating, and I will keep you updated.
J> > However, any help is welcome. I can share cores.

Now, spending some time with cores and adding a bunch of
extra CTRs, I have a sequence of events that lead to the
panic. In short, the bug is in the callout system. It seems
to be not relevant to the callout_async_drain, at least for
now. The transition to READ lock unmasked the problem, that's
why NetflixBSD 10 doesn't panic.

The panic requires heavy contention on the TCP info lock.

[CPU 1] the callout fires, tcp_timer_keep entered
[CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
[CPU 2] schedules the callout
[CPU 2] tcp_discardcb called
[CPU 2] callout successfully canceled
[CPU 2] tcpcb freed
[CPU 1] unblocks... panic

When the lock was WLOCK, all contenders were resumed in a
sequence they came to the lock. Now, that they are readers,
once the lock is released, readers are resumed in a "random"
order, and this allows tcp_discardcb to go before the old
running callout, and this unmasks the panic.


 Highly interesting.  I should be able to reproduce that (will be useful
for testing the corresponding fix).


 Finally, I was able to reproduce it (without glebius fix).   The trick
was to really lower TCP keep timer expiration:

$ sysctl -a | grep tcp.keep
net.inet.tcp.keepidle: 720
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
$ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
Password:
net.inet.tcp.keepidle: 720 -> 10
net.inet.tcp.keepintvl: 75000 -> 50
net.inet.tcp.keepinit: 75000 -> 10

 Note: It will certainly close all your ssh connections to the tested
server.

 Now I will test in order:

#1. glebius fix
https://svnweb.freebsd.org/base?view=revision=302350

#2. rss extra fix
https://reviews.freebsd.org/D7135

#3. rrs TCP Timer cleanup
https://reviews.freebsd.org/D7136


please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884


 My tests result so far:

#1. r302350:  First glebius TCP timer fix:  No more TCP timer kernel
panic during 48h under 200k TCP query per second load.

 Sadly I was unable to reproduce the issue described here:

panic: bogus refcnt 0 on lle 0xf80004608c00
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884

#2. r303098:  Got all kernel callout changes since r302350, (updates on
callout code are indeed always full of surprises):
https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?view=log=303098

 No kernel panic either.

 Still to test:

#3. rss extra fix (if still relevant now)
https://reviews.freebsd.org/D7135

#4. rrs TCP Timer cleanup:
https://reviews.freebsd.org/D7136

 My 2 cents.



Hi,

You should also check for memory leaks using "vmstat -m" .

--HPS

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-07-21 Thread Julien Charbon

 Hi,

On 7/14/16 11:02 PM, Larry Rosenman wrote:
> On 2016-07-14 12:01, Julien Charbon wrote:
>> On 6/20/16 11:55 AM, Julien Charbon wrote:
>>> On 6/20/16 9:39 AM, Gleb Smirnoff wrote:
 On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
 J> > Comparing stable/10 and head, I see two changes that could
 J> > affect that:
 J> >
 J> > - callout_async_drain
 J> > - switch to READ lock for inp info in tcp timers
 J> >
 J> > That's why you are in To, Julien and Hans :)
 J> >
 J> > We continue investigating, and I will keep you updated.
 J> > However, any help is welcome. I can share cores.

 Now, spending some time with cores and adding a bunch of
 extra CTRs, I have a sequence of events that lead to the
 panic. In short, the bug is in the callout system. It seems
 to be not relevant to the callout_async_drain, at least for
 now. The transition to READ lock unmasked the problem, that's
 why NetflixBSD 10 doesn't panic.

 The panic requires heavy contention on the TCP info lock.

 [CPU 1] the callout fires, tcp_timer_keep entered
 [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
 [CPU 2] schedules the callout
 [CPU 2] tcp_discardcb called
 [CPU 2] callout successfully canceled
 [CPU 2] tcpcb freed
 [CPU 1] unblocks... panic

 When the lock was WLOCK, all contenders were resumed in a
 sequence they came to the lock. Now, that they are readers,
 once the lock is released, readers are resumed in a "random"
 order, and this allows tcp_discardcb to go before the old
 running callout, and this unmasks the panic.
>>>
>>>  Highly interesting.  I should be able to reproduce that (will be useful
>>> for testing the corresponding fix).
>>
>>  Finally, I was able to reproduce it (without glebius fix).   The trick
>> was to really lower TCP keep timer expiration:
>>
>> $ sysctl -a | grep tcp.keep
>> net.inet.tcp.keepidle: 720
>> net.inet.tcp.keepintvl: 75000
>> net.inet.tcp.keepinit: 75000
>> net.inet.tcp.keepcnt: 8
>> $ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
>> net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
>> Password:
>> net.inet.tcp.keepidle: 720 -> 10
>> net.inet.tcp.keepintvl: 75000 -> 50
>> net.inet.tcp.keepinit: 75000 -> 10
>>
>>  Note: It will certainly close all your ssh connections to the tested
>> server.
>>
>>  Now I will test in order:
>>
>> #1. glebius fix
>> https://svnweb.freebsd.org/base?view=revision=302350
>>
>> #2. rss extra fix
>> https://reviews.freebsd.org/D7135
>>
>> #3. rrs TCP Timer cleanup
>> https://reviews.freebsd.org/D7136
> 
> please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884

 My tests result so far:

#1. r302350:  First glebius TCP timer fix:  No more TCP timer kernel
panic during 48h under 200k TCP query per second load.

 Sadly I was unable to reproduce the issue described here:

panic: bogus refcnt 0 on lle 0xf80004608c00
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884

#2. r303098:  Got all kernel callout changes since r302350, (updates on
callout code are indeed always full of surprises):
https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?view=log=303098

 No kernel panic either.

 Still to test:

#3. rss extra fix (if still relevant now)
https://reviews.freebsd.org/D7135

#4. rrs TCP Timer cleanup:
https://reviews.freebsd.org/D7136

 My 2 cents.

--
Julien

signature.asc
Description: OpenPGP digital signature

Re: panic with tcp timers

2016-07-14 Thread Larry Rosenman


On 2016-07-14 12:01, Julien Charbon wrote:

Hi,

On 6/20/16 11:55 AM, Julien Charbon wrote:

On 6/20/16 9:39 AM, Gleb Smirnoff wrote:

On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
J> > Comparing stable/10 and head, I see two changes that could
J> > affect that:
J> >
J> > - callout_async_drain
J> > - switch to READ lock for inp info in tcp timers
J> >
J> > That's why you are in To, Julien and Hans :)
J> >
J> > We continue investigating, and I will keep you updated.
J> > However, any help is welcome. I can share cores.

Now, spending some time with cores and adding a bunch of
extra CTRs, I have a sequence of events that lead to the
panic. In short, the bug is in the callout system. It seems
to be not relevant to the callout_async_drain, at least for
now. The transition to READ lock unmasked the problem, that's
why NetflixBSD 10 doesn't panic.

The panic requires heavy contention on the TCP info lock.

[CPU 1] the callout fires, tcp_timer_keep entered
[CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
[CPU 2] schedules the callout
[CPU 2] tcp_discardcb called
[CPU 2] callout successfully canceled
[CPU 2] tcpcb freed
[CPU 1] unblocks... panic

When the lock was WLOCK, all contenders were resumed in a
sequence they came to the lock. Now, that they are readers,
once the lock is released, readers are resumed in a "random"
order, and this allows tcp_discardcb to go before the old
running callout, and this unmasks the panic.


 Highly interesting.  I should be able to reproduce that (will be 
useful

for testing the corresponding fix).


 Finally, I was able to reproduce it (without glebius fix).   The trick
was to really lower TCP keep timer expiration:

$ sysctl -a | grep tcp.keep
net.inet.tcp.keepidle: 720
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
$ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
Password:
net.inet.tcp.keepidle: 720 -> 10
net.inet.tcp.keepintvl: 75000 -> 50
net.inet.tcp.keepinit: 75000 -> 10

 Note: It will certainly close all your ssh connections to the tested
server.

 Now I will test in order:

#1. glebius fix
https://svnweb.freebsd.org/base?view=revision=302350

#2. rss extra fix
https://reviews.freebsd.org/D7135

#3. rrs TCP Timer cleanup
https://reviews.freebsd.org/D7136

 My panic for reference:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 10; apic id = 28
[root@atlas-dl360-4 ~]# instruction pointer = 
0x20:0x80c346f1

stack pointer   = 0x28:0xfe1f29b848b0
frame pointer   = 0x28:0xfe1f29b848e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (4))
trap number = 9
panic: general protection fault
cpuid = 10
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe1f29b844a0
vpanic() at vpanic+0x182/frame 0xfe1f29b84520
panic() at panic+0x43/frame 0xfe1f29b84580
trap_fatal() at trap_fatal+0x351/frame 0xfe1f29b845e0
trap() at trap+0x820/frame 0xfe1f29b847f0
calltrap() at calltrap+0x8/frame 0xfe1f29b847f0
--- trap 0x9, rip = 0x80c346f1, rsp = 0xfe1f29b848c0, rbp =
0xfe1f29b848e0 ---
tcp_timer_keep() at tcp_timer_keep+0x51/frame 0xfe1f29b848e0
softclock_call_cc() at softclock_call_cc+0x19c/frame 0xfe1f29b849c0
softclock() at softclock+0x47/frame 0xfe1f29b849e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame
0xfe1f29b84a20
ithread_loop() at ithread_loop+0xa6/frame 0xfe1f29b84a70
fork_exit() at fork_exit+0x84/frame 0xfe1f29b84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe1f29b84ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

--
Julien



please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-07-14 Thread Julien Charbon


 Hi,

On 6/20/16 11:55 AM, Julien Charbon wrote:
> On 6/20/16 9:39 AM, Gleb Smirnoff wrote:
>> On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
>> J> > Comparing stable/10 and head, I see two changes that could
>> J> > affect that:
>> J> > 
>> J> > - callout_async_drain
>> J> > - switch to READ lock for inp info in tcp timers
>> J> > 
>> J> > That's why you are in To, Julien and Hans :)
>> J> > 
>> J> > We continue investigating, and I will keep you updated.
>> J> > However, any help is welcome. I can share cores.
>>
>> Now, spending some time with cores and adding a bunch of
>> extra CTRs, I have a sequence of events that lead to the
>> panic. In short, the bug is in the callout system. It seems
>> to be not relevant to the callout_async_drain, at least for
>> now. The transition to READ lock unmasked the problem, that's
>> why NetflixBSD 10 doesn't panic.
>>
>> The panic requires heavy contention on the TCP info lock.
>>
>> [CPU 1] the callout fires, tcp_timer_keep entered
>> [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
>> [CPU 2] schedules the callout
>> [CPU 2] tcp_discardcb called
>> [CPU 2] callout successfully canceled
>> [CPU 2] tcpcb freed
>> [CPU 1] unblocks... panic
>>
>> When the lock was WLOCK, all contenders were resumed in a
>> sequence they came to the lock. Now, that they are readers,
>> once the lock is released, readers are resumed in a "random"
>> order, and this allows tcp_discardcb to go before the old
>> running callout, and this unmasks the panic.
> 
>  Highly interesting.  I should be able to reproduce that (will be useful
> for testing the corresponding fix).

 Finally, I was able to reproduce it (without glebius fix).   The trick
was to really lower TCP keep timer expiration:

$ sysctl -a | grep tcp.keep
net.inet.tcp.keepidle: 720
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
$ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
Password:
net.inet.tcp.keepidle: 720 -> 10
net.inet.tcp.keepintvl: 75000 -> 50
net.inet.tcp.keepinit: 75000 -> 10

 Note: It will certainly close all your ssh connections to the tested
server.

 Now I will test in order:

#1. glebius fix
https://svnweb.freebsd.org/base?view=revision=302350

#2. rss extra fix
https://reviews.freebsd.org/D7135

#3. rrs TCP Timer cleanup
https://reviews.freebsd.org/D7136

 My panic for reference:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 10; apic id = 28
[root@atlas-dl360-4 ~]# instruction pointer = 0x20:0x80c346f1
stack pointer   = 0x28:0xfe1f29b848b0
frame pointer   = 0x28:0xfe1f29b848e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (4))
trap number = 9
panic: general protection fault
cpuid = 10
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe1f29b844a0
vpanic() at vpanic+0x182/frame 0xfe1f29b84520
panic() at panic+0x43/frame 0xfe1f29b84580
trap_fatal() at trap_fatal+0x351/frame 0xfe1f29b845e0
trap() at trap+0x820/frame 0xfe1f29b847f0
calltrap() at calltrap+0x8/frame 0xfe1f29b847f0
--- trap 0x9, rip = 0x80c346f1, rsp = 0xfe1f29b848c0, rbp =
0xfe1f29b848e0 ---
tcp_timer_keep() at tcp_timer_keep+0x51/frame 0xfe1f29b848e0
softclock_call_cc() at softclock_call_cc+0x19c/frame 0xfe1f29b849c0
softclock() at softclock+0x47/frame 0xfe1f29b849e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame
0xfe1f29b84a20
ithread_loop() at ithread_loop+0xa6/frame 0xfe1f29b84a70
fork_exit() at fork_exit+0x84/frame 0xfe1f29b84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe1f29b84ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

--
Julien




signature.asc
Description: OpenPGP digital signature

Re: EBR fix for life cycle races was Re: panic with tcp timers

2016-06-30 Thread Matthew Macy


  On Tue, 28 Jun 2016 23:19:45 -0700 Matthew Macy  wrote 
 
 >  
 >  
 >  
 >   On Tue, 28 Jun 2016 15:51:57 -0700 K. Macy  wrote 
 >   
 >  > On Tue, Jun 28, 2016 at 10:51 AM, Matthew Macy  wrote: 
 >  
 >  > > You guys should really look at Samy Bahra's epoch based reclamation. I 
 > solved a similar problem in drm/linuxkpi using it.  
 >  >   
 >  > The point being that this is a bug in the TCP life cycle handling  
 >  > _not_ in callouts. Churning the callout interface is not the best /  
 >  > only solution.  
 >  > -M  
 >   
 > Please see see D7017/D7018 as an example for an ultimately more robust 
 > solution than continued fiddling with the callout interface. 
 >  
 > https://reviews.freebsd.org/D7018 
 
I realized that this shortens the race but still leaves one open from the time 
the callout lock is dropped to the time the epoch begins. I have a proposed fix 
to make read locks never block and to essentially close the race window. 

The next issue that comes up is that synchronize is called too often. I'll talk 
to Samy about it in a few hours and come up with a better design.

-M



 
 >  
 >  > >   On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon 
 >  wrote   
 >  > >  >  
 >  > >  >  Hi Randall,  
 >  > >  >  
 >  > >  > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote:  
 >  > >  > > Ok  
 >  > >  > >  
 >  > >  > > Lets try this again with my source changed to my @freebsd.net :-)  
 >  > >  > >  
 >  > >  > > Now I am also attaching a patch for you Gleb, this will take some 
 > poking to  
 >  > >  > > get in to your NF-head since it incorporates some changes we made 
 > earlier.  
 >  > >  > >  
 >  > >  > > I think this will fix the problem.. i.e. dealing with two locks in 
 > the callout system (which it was  
 >  > >  > > never meant to have done)..  
 >  > >  > >  
 >  > >  > > Note we probably can move the code to use the callout lock init 
 > now.. but lets see if this works  
 >  > >  > > on your setup on c096 and if so we can think about doing that.  
 >  > >  >  
 >  > >  >  Thanks for proposing a patch.  I believe your patch will work with  
 >  > >  > callout lock init, but not without:  You still have a use-after-free 
 >  
 >  > >  > issue on the tcpcb without callout lock init.  
 >  > >  >  
 >  > >  >  The case being subtle as usual, let me try to describe that could 
 > happen:  
 >  > >  >  
 >  > >  >  With your patch we have:  
 >  > >  >  
 >  > >  > void  
 >  > >  > tcp_timer_keep(void *xtp)  
 >  > >  > {  
 >  > >  > struct tcpcb *tp = xtp;  
 >  > >  > struct tcptemp *t_template;  
 >  > >  > struct inpcb *inp;  
 >  > >  > CURVNET_SET(tp->t_vnet);  
 >  > >  > #ifdef TCPDEBUG  
 >  > >  > int ostate;  
 >  > >  >  
 >  > >  > ostate = tp->t_state;  
 >  > >  > #endif  
 >  > >  > inp = tp->t_inpcb;  
 >  > >  > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", 
 > __func__,  
 >  > >  > tp));  
 >  > >  > INP_WLOCK(inp);  
 >  > >  > if (callout_pending(>t_timers->tt_keep) ### Use after 
 > free  
 >  > >  > of tp here  
 >  > >  > !callout_active(>t_timers->tt_keep)) {  
 >  > >  > INP_WUNLOCK(inp);  
 >  > >  > CURVNET_RESTORE();  
 >  > >  > return;  
 >  > >  > }  
 >  > >  > ...  
 >  > >  >  
 >  > >  >  The use-after-free scenario:  
 >  > >  >  
 >  > >  > [CPU 1] the callout fires, tcp_timer_keep entered  
 >  > >  > [CPU 1] blocks on INP_WLOCK(inp);  
 >  > >  > [CPU 2] schedules tcp_timer_keep with callout_reset()  
 >  > >  > [CPU 2] tcp_discardcb called  
 >  > >  > [CPU 2] tcp_timer_keep callout successfully canceled  
 >  > >  > [CPU 2] tcpcb freed  
 >  > >  > [CPU 1] unblocks, the tcpcb is used  
 >  > >  >  
 >  > >  >  Then the tcpcb will used just after being freed...  Might also 
 > crash or  
 >  > >  > not depending in the case.  
 >  > >  >  
 >  > >  >  Extra notes:  
 >  > >  >  
 >  > >  >  o The invariant I see here is:  The "callout successfully canceled" 
 >  
 >  > >  > step should never happen when "the callout is currently being 
 > executed".  
 >  > >  >  
 >  > >  >  o Solutions I see to enforce this invariant:  
 >  > >  >  
 >  > >  >  - First solution:  Use callout lock init with inp lock, your patch  
 >  > >  > seems to permit that now.  
 >  > >  >  
 >  > >  >  - Second solution:  Change callout_async_drain() behavior:  It can  
 >  > >  > return 0 (fail) when the callout is currently being executed (no 
 > matter  
 >  > >  > what).  
 >  > >  >  
 >  > >  >  - Third solution:  Don't trust callout_async_drain(callout) return  
 >  > >  > value of 1 (success) if the previous call of callout_reset(callout)  
 >  > >  > returned 0 (fail).  That was the exact purpose of r284261 change, 
 > but  
 >  > >  > this solution is also step backward in

EBR fix for life cycle races was Re: panic with tcp timers

2016-06-29 Thread Matthew Macy




  On Tue, 28 Jun 2016 15:51:57 -0700 K. Macy  wrote  
 > On Tue, Jun 28, 2016 at 10:51 AM, Matthew Macy  wrote: 
 > > You guys should really look at Samy Bahra's epoch based reclamation. I 
 > > solved a similar problem in drm/linuxkpi using it. 
 >  
 > The point being that this is a bug in the TCP life cycle handling 
 > _not_ in callouts. Churning the callout interface is not the best / 
 > only solution. 
 > -M 
 
Please see see D7017/D7018 as an example for an ultimately more robust solution 
than continued fiddling with the callout interface.

https://reviews.freebsd.org/D7018



Cheers.

-M

 > >   On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon  
 > > wrote  
 > >  > 
 > >  >  Hi Randall, 
 > >  > 
 > >  > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote: 
 > >  > > Ok 
 > >  > > 
 > >  > > Lets try this again with my source changed to my @freebsd.net :-) 
 > >  > > 
 > >  > > Now I am also attaching a patch for you Gleb, this will take some 
 > > poking to 
 > >  > > get in to your NF-head since it incorporates some changes we made 
 > > earlier. 
 > >  > > 
 > >  > > I think this will fix the problem.. i.e. dealing with two locks in 
 > > the callout system (which it was 
 > >  > > never meant to have done).. 
 > >  > > 
 > >  > > Note we probably can move the code to use the callout lock init now.. 
 > > but lets see if this works 
 > >  > > on your setup on c096 and if so we can think about doing that. 
 > >  > 
 > >  >  Thanks for proposing a patch.  I believe your patch will work with 
 > >  > callout lock init, but not without:  You still have a use-after-free 
 > >  > issue on the tcpcb without callout lock init. 
 > >  > 
 > >  >  The case being subtle as usual, let me try to describe that could 
 > > happen: 
 > >  > 
 > >  >  With your patch we have: 
 > >  > 
 > >  > void 
 > >  > tcp_timer_keep(void *xtp) 
 > >  > { 
 > >  > struct tcpcb *tp = xtp; 
 > >  > struct tcptemp *t_template; 
 > >  > struct inpcb *inp; 
 > >  > CURVNET_SET(tp->t_vnet); 
 > >  > #ifdef TCPDEBUG 
 > >  > int ostate; 
 > >  > 
 > >  > ostate = tp->t_state; 
 > >  > #endif 
 > >  > inp = tp->t_inpcb; 
 > >  > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", 
 > > __func__, 
 > >  > tp)); 
 > >  > INP_WLOCK(inp); 
 > >  > if (callout_pending(>t_timers->tt_keep) ### Use after free 
 > >  > of tp here 
 > >  > !callout_active(>t_timers->tt_keep)) { 
 > >  > INP_WUNLOCK(inp); 
 > >  > CURVNET_RESTORE(); 
 > >  > return; 
 > >  > } 
 > >  > ... 
 > >  > 
 > >  >  The use-after-free scenario: 
 > >  > 
 > >  > [CPU 1] the callout fires, tcp_timer_keep entered 
 > >  > [CPU 1] blocks on INP_WLOCK(inp); 
 > >  > [CPU 2] schedules tcp_timer_keep with callout_reset() 
 > >  > [CPU 2] tcp_discardcb called 
 > >  > [CPU 2] tcp_timer_keep callout successfully canceled 
 > >  > [CPU 2] tcpcb freed 
 > >  > [CPU 1] unblocks, the tcpcb is used 
 > >  > 
 > >  >  Then the tcpcb will used just after being freed...  Might also crash 
 > > or 
 > >  > not depending in the case. 
 > >  > 
 > >  >  Extra notes: 
 > >  > 
 > >  >  o The invariant I see here is:  The "callout successfully canceled" 
 > >  > step should never happen when "the callout is currently being 
 > > executed". 
 > >  > 
 > >  >  o Solutions I see to enforce this invariant: 
 > >  > 
 > >  >  - First solution:  Use callout lock init with inp lock, your patch 
 > >  > seems to permit that now. 
 > >  > 
 > >  >  - Second solution:  Change callout_async_drain() behavior:  It can 
 > >  > return 0 (fail) when the callout is currently being executed (no matter 
 > >  > what). 
 > >  > 
 > >  >  - Third solution:  Don't trust callout_async_drain(callout) return 
 > >  > value of 1 (success) if the previous call of callout_reset(callout) 
 > >  > returned 0 (fail).  That was the exact purpose of r284261 change, but 
 > >  > this solution is also step backward in modernization of TCP 
 > >  > timers/callout... 
 > >  > 
 > >  > 
 > > https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261=284260=284261
 > >  
 > >  > 
 > >  >  Hopefully my description is clear enough... 
 > >  > 
 > >  > -- 
 > >  > Julien 
 > >  > 
 > >  > 
 > > 
 > > ___ 
 > > freebsd-current@freebsd.org mailing list 
 > > https://lists.freebsd.org/mailman/listinfo/freebsd-current 
 > > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" 
 > ___ 
 > freebsd-current@freebsd.org mailing list 
 > https://lists.freebsd.org/mailman/listinfo/freebsd-current 
 > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" 
 > 

___
freebsd-current@freebsd.org mailing list

Re: panic with tcp timers

2016-06-28 Thread K. Macy

On Tue, Jun 28, 2016 at 10:51 AM, Matthew Macy  wrote:
> You guys should really look at Samy Bahra's epoch based reclamation. I solved 
> a similar problem in drm/linuxkpi using it.

The point being that this is a bug in the TCP life cycle handling
_not_ in callouts. Churning the callout interface is not the best /
only solution.
-M


>
>   On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon  
> wrote 
>  >
>  >  Hi Randall,
>  >
>  > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote:
>  > > Ok
>  > >
>  > > Lets try this again with my source changed to my @freebsd.net :-)
>  > >
>  > > Now I am also attaching a patch for you Gleb, this will take some poking 
> to
>  > > get in to your NF-head since it incorporates some changes we made 
> earlier.
>  > >
>  > > I think this will fix the problem.. i.e. dealing with two locks in the 
> callout system (which it was
>  > > never meant to have done)..
>  > >
>  > > Note we probably can move the code to use the callout lock init now.. 
> but lets see if this works
>  > > on your setup on c096 and if so we can think about doing that.
>  >
>  >  Thanks for proposing a patch.  I believe your patch will work with
>  > callout lock init, but not without:  You still have a use-after-free
>  > issue on the tcpcb without callout lock init.
>  >
>  >  The case being subtle as usual, let me try to describe that could happen:
>  >
>  >  With your patch we have:
>  >
>  > void
>  > tcp_timer_keep(void *xtp)
>  > {
>  > struct tcpcb *tp = xtp;
>  > struct tcptemp *t_template;
>  > struct inpcb *inp;
>  > CURVNET_SET(tp->t_vnet);
>  > #ifdef TCPDEBUG
>  > int ostate;
>  >
>  > ostate = tp->t_state;
>  > #endif
>  > inp = tp->t_inpcb;
>  > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__,
>  > tp));
>  > INP_WLOCK(inp);
>  > if (callout_pending(>t_timers->tt_keep) ### Use after free
>  > of tp here
>  > !callout_active(>t_timers->tt_keep)) {
>  > INP_WUNLOCK(inp);
>  > CURVNET_RESTORE();
>  > return;
>  > }
>  > ...
>  >
>  >  The use-after-free scenario:
>  >
>  > [CPU 1] the callout fires, tcp_timer_keep entered
>  > [CPU 1] blocks on INP_WLOCK(inp);
>  > [CPU 2] schedules tcp_timer_keep with callout_reset()
>  > [CPU 2] tcp_discardcb called
>  > [CPU 2] tcp_timer_keep callout successfully canceled
>  > [CPU 2] tcpcb freed
>  > [CPU 1] unblocks, the tcpcb is used
>  >
>  >  Then the tcpcb will used just after being freed...  Might also crash or
>  > not depending in the case.
>  >
>  >  Extra notes:
>  >
>  >  o The invariant I see here is:  The "callout successfully canceled"
>  > step should never happen when "the callout is currently being executed".
>  >
>  >  o Solutions I see to enforce this invariant:
>  >
>  >  - First solution:  Use callout lock init with inp lock, your patch
>  > seems to permit that now.
>  >
>  >  - Second solution:  Change callout_async_drain() behavior:  It can
>  > return 0 (fail) when the callout is currently being executed (no matter
>  > what).
>  >
>  >  - Third solution:  Don't trust callout_async_drain(callout) return
>  > value of 1 (success) if the previous call of callout_reset(callout)
>  > returned 0 (fail).  That was the exact purpose of r284261 change, but
>  > this solution is also step backward in modernization of TCP
>  > timers/callout...
>  >
>  > 
> https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261=284260=284261
>  >
>  >  Hopefully my description is clear enough...
>  >
>  > --
>  > Julien
>  >
>  >
>
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-28 Thread Matthew Macy

You guys should really look at Samy Bahra's epoch based reclamation. I solved a 
similar problem in drm/linuxkpi using it.


-M

  On Tue, 28 Jun 2016 02:58:56 -0700 Julien Charbon  
wrote  
 >  
 >  Hi Randall, 
 >  
 > On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote: 
 > > Ok 
 > >  
 > > Lets try this again with my source changed to my @freebsd.net :-) 
 > >  
 > > Now I am also attaching a patch for you Gleb, this will take some poking 
 > > to 
 > > get in to your NF-head since it incorporates some changes we made earlier. 
 > >  
 > > I think this will fix the problem.. i.e. dealing with two locks in the 
 > > callout system (which it was 
 > > never meant to have done).. 
 > >  
 > > Note we probably can move the code to use the callout lock init now.. but 
 > > lets see if this works 
 > > on your setup on c096 and if so we can think about doing that. 
 >  
 >  Thanks for proposing a patch.  I believe your patch will work with 
 > callout lock init, but not without:  You still have a use-after-free 
 > issue on the tcpcb without callout lock init. 
 >  
 >  The case being subtle as usual, let me try to describe that could happen: 
 >  
 >  With your patch we have: 
 >  
 > void 
 > tcp_timer_keep(void *xtp) 
 > { 
 > struct tcpcb *tp = xtp; 
 > struct tcptemp *t_template; 
 > struct inpcb *inp; 
 > CURVNET_SET(tp->t_vnet); 
 > #ifdef TCPDEBUG 
 > int ostate; 
 >  
 > ostate = tp->t_state; 
 > #endif 
 > inp = tp->t_inpcb; 
 > KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, 
 > tp)); 
 > INP_WLOCK(inp); 
 > if (callout_pending(>t_timers->tt_keep) ### Use after free 
 > of tp here 
 > !callout_active(>t_timers->tt_keep)) { 
 > INP_WUNLOCK(inp); 
 > CURVNET_RESTORE(); 
 > return; 
 > } 
 > ... 
 >  
 >  The use-after-free scenario: 
 >  
 > [CPU 1] the callout fires, tcp_timer_keep entered 
 > [CPU 1] blocks on INP_WLOCK(inp); 
 > [CPU 2] schedules tcp_timer_keep with callout_reset() 
 > [CPU 2] tcp_discardcb called 
 > [CPU 2] tcp_timer_keep callout successfully canceled 
 > [CPU 2] tcpcb freed 
 > [CPU 1] unblocks, the tcpcb is used 
 >  
 >  Then the tcpcb will used just after being freed...  Might also crash or 
 > not depending in the case. 
 >  
 >  Extra notes: 
 >  
 >  o The invariant I see here is:  The "callout successfully canceled" 
 > step should never happen when "the callout is currently being executed". 
 >  
 >  o Solutions I see to enforce this invariant: 
 >  
 >  - First solution:  Use callout lock init with inp lock, your patch 
 > seems to permit that now. 
 >  
 >  - Second solution:  Change callout_async_drain() behavior:  It can 
 > return 0 (fail) when the callout is currently being executed (no matter 
 > what). 
 >  
 >  - Third solution:  Don't trust callout_async_drain(callout) return 
 > value of 1 (success) if the previous call of callout_reset(callout) 
 > returned 0 (fail).  That was the exact purpose of r284261 change, but 
 > this solution is also step backward in modernization of TCP 
 > timers/callout... 
 >  
 > https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261=284260=284261
 >  
 >  
 >  Hopefully my description is clear enough... 
 >  
 > -- 
 > Julien 
 >  
 > 

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-28 Thread Julien Charbon


 Hi Randall,

On 6/25/16 4:41 PM, Randall Stewart via freebsd-net wrote:
> Ok
> 
> Lets try this again with my source changed to my @freebsd.net :-)
> 
> Now I am also attaching a patch for you Gleb, this will take some poking to
> get in to your NF-head since it incorporates some changes we made earlier.
> 
> I think this will fix the problem.. i.e. dealing with two locks in the 
> callout system (which it was
> never meant to have done)..
> 
> Note we probably can move the code to use the callout lock init now.. but 
> lets see if this works
> on your setup on c096 and if so we can think about doing that.

 Thanks for proposing a patch.  I believe your patch will work with
callout lock init, but not without:  You still have a use-after-free
issue on the tcpcb without callout lock init.

 The case being subtle as usual, let me try to describe that could happen:

 With your patch we have:

void
tcp_timer_keep(void *xtp)
{
struct tcpcb *tp = xtp;
struct tcptemp *t_template;
struct inpcb *inp;
CURVNET_SET(tp->t_vnet);
#ifdef TCPDEBUG
int ostate;

ostate = tp->t_state;
#endif
inp = tp->t_inpcb;
KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__,
tp));
INP_WLOCK(inp);
if (callout_pending(>t_timers->tt_keep) ### Use after free
of tp here
!callout_active(>t_timers->tt_keep)) {
INP_WUNLOCK(inp);
CURVNET_RESTORE();
return;
}
...

 The use-after-free scenario:

[CPU 1] the callout fires, tcp_timer_keep entered
[CPU 1] blocks on INP_WLOCK(inp);
[CPU 2] schedules tcp_timer_keep with callout_reset()
[CPU 2] tcp_discardcb called
[CPU 2] tcp_timer_keep callout successfully canceled
[CPU 2] tcpcb freed
[CPU 1] unblocks, the tcpcb is used

 Then the tcpcb will used just after being freed...  Might also crash or
not depending in the case.

 Extra notes:

 o The invariant I see here is:  The "callout successfully canceled"
step should never happen when "the callout is currently being executed".

 o Solutions I see to enforce this invariant:

 - First solution:  Use callout lock init with inp lock, your patch
seems to permit that now.

 - Second solution:  Change callout_async_drain() behavior:  It can
return 0 (fail) when the callout is currently being executed (no matter
what).

 - Third solution:  Don't trust callout_async_drain(callout) return
value of 1 (success) if the previous call of callout_reset(callout)
returned 0 (fail).  That was the exact purpose of r284261 change, but
this solution is also step backward in modernization of TCP
timers/callout...

https://svnweb.freebsd.org/base/stable/10/sys/netinet/tcp_timer.c?r1=284261=284260=284261

 Hopefully my description is clear enough...

--
Julien



signature.asc
Description: OpenPGP digital signature

Re: panic with tcp timers

2016-06-25 Thread Randall Stewart

Ok

Lets try this again with my source changed to my @freebsd.net :-)

Now I am also attaching a patch for you Gleb, this will take some poking to
get in to your NF-head since it incorporates some changes we made earlier.

I think this will fix the problem.. i.e. dealing with two locks in the callout 
system (which it was
never meant to have done)..

Note we probably can move the code to use the callout lock init now.. but lets 
see if this works
on your setup on c096 and if so we can think about doing that.

R



for_gleb
Description: Binary data

> On Jun 25, 2016, at 4:48 AM, Randall Stewart  wrote:
> 
> So
> 
> All of our timers in TCP do something like
> -
> INFO-LOCK
> INP_WLOCK
> 
> if (inp needs to be dropped) {
>drop-it
> }
> do other work
> 
> UNLOCK-INP
> UNLOCK-INFO
> --
> 
> And generally the path “inp needs to be dropped” is rarely taken.
> 
> So why don’t we change the procedure to something like:
> 
> INP_WLOCK
> if (inp needs to be dropped) {
> inp_ref()
> INP_WUNLOCK()
> INFO_LOCK()
> INP_WLOCK()
> if (inp_release_ref) {
>  /* victory its dead */
>   INFO_UNLOCK
>   return
>  }
>  do-the-drop
>  }
> 
> This way we would only go grab the INFO lock in those rarer cases
> when we *do* actually want to kill the tcb and at the same time
> it would make the current callout system work correctly.. which as
> many have pointed out would be much better if we could just let the
> lock be gotten by the callout system. Hmm maybe with this scheme we
> could even do that...
> 
> R
> 
> 
>> On Jun 20, 2016, at 1:45 PM, Julien Charbon  wrote:
>> 
>> 
>> Hi,
>> 
>> On 6/20/16 11:58 AM, Gleb Smirnoff wrote:
>>> On Mon, Jun 20, 2016 at 11:55:55AM +0200, Julien Charbon wrote:
>>> J> > On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
>>> J> > J> > Comparing stable/10 and head, I see two changes that could
>>> J> > J> > affect that:
>>> J> > J> > 
>>> J> > J> > - callout_async_drain
>>> J> > J> > - switch to READ lock for inp info in tcp timers
>>> J> > J> > 
>>> J> > J> > That's why you are in To, Julien and Hans :)
>>> J> > J> > 
>>> J> > J> > We continue investigating, and I will keep you updated.
>>> J> > J> > However, any help is welcome. I can share cores.
>>> J> > 
>>> J> > Now, spending some time with cores and adding a bunch of
>>> J> > extra CTRs, I have a sequence of events that lead to the
>>> J> > panic. In short, the bug is in the callout system. It seems
>>> J> > to be not relevant to the callout_async_drain, at least for
>>> J> > now. The transition to READ lock unmasked the problem, that's
>>> J> > why NetflixBSD 10 doesn't panic.
>>> J> > 
>>> J> > The panic requires heavy contention on the TCP info lock.
>>> J> > 
>>> J> > [CPU 1] the callout fires, tcp_timer_keep entered
>>> J> > [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
>>> J> > [CPU 2] schedules the callout
>>> J> > [CPU 2] tcp_discardcb called
>>> J> > [CPU 2] callout successfully canceled
>>> J> > [CPU 2] tcpcb freed
>>> J> > [CPU 1] unblocks... panic
>>> J> > 
>>> J> > When the lock was WLOCK, all contenders were resumed in a
>>> J> > sequence they came to the lock. Now, that they are readers,
>>> J> > once the lock is released, readers are resumed in a "random"
>>> J> > order, and this allows tcp_discardcb to go before the old
>>> J> > running callout, and this unmasks the panic.
>>> J> 
>>> J>  Highly interesting.  I should be able to reproduce that (will be useful
>>> J> for testing the corresponding fix).
>>> J> 
>>> J>  Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
>>> J> (success) here) when the callout cancellation is a success _but_ the
>>> J> callout is current running, that should fix it.
>>> J> 
>>> J>  For the history:  It comes back to my old callout question:
>>> J> 
>>> J>  Does _callout_stop_safe() is allowed to return 1 (success) even if the
>>> J> callout is still currently running;  a.k.a. it is not because you
>>> J> successfully cancelled a callout that the callout is not currently 
>>> running.
>>> J> 
>>> J>  We did propose a patch to make _callout_stop_safe() returns 0 (fail)
>>> J> when the callout is currently running:
>>> J> 
>>> J> callout_stop() should return 0 when the callout is currently being
>>> J> serviced and indeed unstoppable
>>> J> 
>>> https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most
>>> J> 
>>> J>  But this change impacted too many old code paths and was interesting
>>> J> only for TCP timers and thus was abandoned.
>>> 
>>> The fix I am working on now is doing exactly that. callout_reset must
>>> return 0 if the callout is currently running.
>>> 
>>> What are the old paths impacted?
>> 
>> Actually all the paths that check the callout_stop() return value AND
>> call both callout_reset() and callout_stop() AND use mpsafe callout().
>> And for each path, we would have to check our patch was ok (or not).

Re: panic with tcp timers

2016-06-25 Thread Randall Stewart

So

All of our timers in TCP do something like
-
INFO-LOCK
INP_WLOCK

if (inp needs to be dropped) {
drop-it
}
do other work

UNLOCK-INP
UNLOCK-INFO
--

And generally the path “inp needs to be dropped” is rarely taken.

So why don’t we change the procedure to something like:

INP_WLOCK
if (inp needs to be dropped) {
 inp_ref()
 INP_WUNLOCK()
 INFO_LOCK()
 INP_WLOCK()
 if (inp_release_ref) {
   /* victory its dead */
   INFO_UNLOCK
   return
  }
  do-the-drop
  }

This way we would only go grab the INFO lock in those rarer cases
when we *do* actually want to kill the tcb and at the same time
it would make the current callout system work correctly.. which as
many have pointed out would be much better if we could just let the
lock be gotten by the callout system. Hmm maybe with this scheme we
could even do that...

R


> On Jun 20, 2016, at 1:45 PM, Julien Charbon  wrote:
> 
> 
> Hi,
> 
> On 6/20/16 11:58 AM, Gleb Smirnoff wrote:
>> On Mon, Jun 20, 2016 at 11:55:55AM +0200, Julien Charbon wrote:
>> J> > On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
>> J> > J> > Comparing stable/10 and head, I see two changes that could
>> J> > J> > affect that:
>> J> > J> > 
>> J> > J> > - callout_async_drain
>> J> > J> > - switch to READ lock for inp info in tcp timers
>> J> > J> > 
>> J> > J> > That's why you are in To, Julien and Hans :)
>> J> > J> > 
>> J> > J> > We continue investigating, and I will keep you updated.
>> J> > J> > However, any help is welcome. I can share cores.
>> J> > 
>> J> > Now, spending some time with cores and adding a bunch of
>> J> > extra CTRs, I have a sequence of events that lead to the
>> J> > panic. In short, the bug is in the callout system. It seems
>> J> > to be not relevant to the callout_async_drain, at least for
>> J> > now. The transition to READ lock unmasked the problem, that's
>> J> > why NetflixBSD 10 doesn't panic.
>> J> > 
>> J> > The panic requires heavy contention on the TCP info lock.
>> J> > 
>> J> > [CPU 1] the callout fires, tcp_timer_keep entered
>> J> > [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
>> J> > [CPU 2] schedules the callout
>> J> > [CPU 2] tcp_discardcb called
>> J> > [CPU 2] callout successfully canceled
>> J> > [CPU 2] tcpcb freed
>> J> > [CPU 1] unblocks... panic
>> J> > 
>> J> > When the lock was WLOCK, all contenders were resumed in a
>> J> > sequence they came to the lock. Now, that they are readers,
>> J> > once the lock is released, readers are resumed in a "random"
>> J> > order, and this allows tcp_discardcb to go before the old
>> J> > running callout, and this unmasks the panic.
>> J> 
>> J>  Highly interesting.  I should be able to reproduce that (will be useful
>> J> for testing the corresponding fix).
>> J> 
>> J>  Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
>> J> (success) here) when the callout cancellation is a success _but_ the
>> J> callout is current running, that should fix it.
>> J> 
>> J>  For the history:  It comes back to my old callout question:
>> J> 
>> J>  Does _callout_stop_safe() is allowed to return 1 (success) even if the
>> J> callout is still currently running;  a.k.a. it is not because you
>> J> successfully cancelled a callout that the callout is not currently 
>> running.
>> J> 
>> J>  We did propose a patch to make _callout_stop_safe() returns 0 (fail)
>> J> when the callout is currently running:
>> J> 
>> J> callout_stop() should return 0 when the callout is currently being
>> J> serviced and indeed unstoppable
>> J> 
>> https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most
>> J> 
>> J>  But this change impacted too many old code paths and was interesting
>> J> only for TCP timers and thus was abandoned.
>> 
>> The fix I am working on now is doing exactly that. callout_reset must
>> return 0 if the callout is currently running.
>> 
>> What are the old paths impacted?
> 
> Actually all the paths that check the callout_stop() return value AND
> call both callout_reset() and callout_stop() AND use mpsafe callout().
> And for each path, we would have to check our patch was ok (or not).
> 
> Thus, if you only do the change in callout_async_drain() context, you
> don't impact the "old" callout_stop() behavior and get the desired
> behavior for the TCP timers.  Might be a good trade-off...
> 
> My 2 cents.
> 
> --
> Julien


Randall Stewart
r...@netflix.com
803-317-4952





___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Julien Charbon


 Hi,

On 6/20/16 11:58 AM, Gleb Smirnoff wrote:
> On Mon, Jun 20, 2016 at 11:55:55AM +0200, Julien Charbon wrote:
> J> > On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
> J> > J> > Comparing stable/10 and head, I see two changes that could
> J> > J> > affect that:
> J> > J> > 
> J> > J> > - callout_async_drain
> J> > J> > - switch to READ lock for inp info in tcp timers
> J> > J> > 
> J> > J> > That's why you are in To, Julien and Hans :)
> J> > J> > 
> J> > J> > We continue investigating, and I will keep you updated.
> J> > J> > However, any help is welcome. I can share cores.
> J> > 
> J> > Now, spending some time with cores and adding a bunch of
> J> > extra CTRs, I have a sequence of events that lead to the
> J> > panic. In short, the bug is in the callout system. It seems
> J> > to be not relevant to the callout_async_drain, at least for
> J> > now. The transition to READ lock unmasked the problem, that's
> J> > why NetflixBSD 10 doesn't panic.
> J> > 
> J> > The panic requires heavy contention on the TCP info lock.
> J> > 
> J> > [CPU 1] the callout fires, tcp_timer_keep entered
> J> > [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
> J> > [CPU 2] schedules the callout
> J> > [CPU 2] tcp_discardcb called
> J> > [CPU 2] callout successfully canceled
> J> > [CPU 2] tcpcb freed
> J> > [CPU 1] unblocks... panic
> J> > 
> J> > When the lock was WLOCK, all contenders were resumed in a
> J> > sequence they came to the lock. Now, that they are readers,
> J> > once the lock is released, readers are resumed in a "random"
> J> > order, and this allows tcp_discardcb to go before the old
> J> > running callout, and this unmasks the panic.
> J> 
> J>  Highly interesting.  I should be able to reproduce that (will be useful
> J> for testing the corresponding fix).
> J> 
> J>  Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
> J> (success) here) when the callout cancellation is a success _but_ the
> J> callout is current running, that should fix it.
> J> 
> J>  For the history:  It comes back to my old callout question:
> J> 
> J>  Does _callout_stop_safe() is allowed to return 1 (success) even if the
> J> callout is still currently running;  a.k.a. it is not because you
> J> successfully cancelled a callout that the callout is not currently running.
> J> 
> J>  We did propose a patch to make _callout_stop_safe() returns 0 (fail)
> J> when the callout is currently running:
> J> 
> J> callout_stop() should return 0 when the callout is currently being
> J> serviced and indeed unstoppable
> J> 
> https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most
> J> 
> J>  But this change impacted too many old code paths and was interesting
> J> only for TCP timers and thus was abandoned.
> 
> The fix I am working on now is doing exactly that. callout_reset must
> return 0 if the callout is currently running.
>
> What are the old paths impacted?

 Actually all the paths that check the callout_stop() return value AND
call both callout_reset() and callout_stop() AND use mpsafe callout().
And for each path, we would have to check our patch was ok (or not).

 Thus, if you only do the change in callout_async_drain() context, you
don't impact the "old" callout_stop() behavior and get the desired
behavior for the TCP timers.  Might be a good trade-off...

 My 2 cents.

--
Julien
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Julien Charbon

 Hi,

On 6/20/16 12:30 PM, Gleb Smirnoff wrote:
> On Mon, Jun 20, 2016 at 12:14:18PM +0200, Hans Petter Selasky wrote:
> H> On 06/20/16 11:58, Gleb Smirnoff wrote:
> H> > The fix I am working on now is doing exactly that. callout_reset must
> H> > return 0 if the callout is currently running.
> H> >
> H> > What are the old paths impacted?
> H> 
> H> I'll dig into the matter aswell and give some comments. Thanks for the 
> H> analysis, Gleb.
> H> 
> H> FYI: This class of problems wouldn't exist if the callout could be 
> H> associated with a mutex!
> 
> Exactly! I am convinced that all callouts should be locked, and non-locked
> one should simply go away, as well as async drain.
> 
> What does prevent us from converting TCP timeouts to locked? To my
> understanding it is the lock order of taking pcbinfo after pcb lock.
> 
> I'm now trying to understand Julien's conversion from pcbinfo lock
> to pcbinfo + pcblist locks. It seems to me that we actually can convert
> TCP timers to locked callouts.
> 
> What for do we need pcbinfo read lock in a tcp timer? The timer works
> only on the pcb and doesn't modify global lists, does it?

 tcp_timer_keep() for example can modify global pcb list, see the call
stack below:

tcp_timer_keep()
tcp_drop()
tcp_close()
sofree()
tcp_usr_detach() (via pr->pr_usrreqs->pru_detach() in sofree())
tcp_detach()
in_pcbfree()
in_pcbremlists()

 Anyway, a bit of history here:

 o In stable/10 the TCP locking order is:

ipi_lock (before) inpcb lock

 and in stable/10 ipi_lock is protecting the global pcb list.  Then,
only in the cases where you were absolutely sure you are _not_ going to
modify the global pcb list you are allowed to take the inpcb lock only.
For all the other cases, you have to take the write lock on ipi_lock
first due to lock order.

 And in context of short-lived TCP connection of the 5 received packets
for a TCP connection, 4 require the write ipi_lock lock, and that's does
not scale well.

 Lesson learned:  For scaling perspective, in lock ordering always put
the most global lock last.

 It was improved in a lean way in 11:

 o In 11 the TCP lock order became:

ipi_lock (before) inpcb lock (before) ipi_list

 And in 11 ipi_list protects global pcb list, and only the code actually
modifying the global list is protected by a global write lock, e.g.:

https://github.com/freebsd/freebsd/blob/master/sys/netinet/in_pcb.c#L1285

 Then why keeping the ipi_lock lock at all now?

 It is solely for one case:  When you need the complete stability of the
full pcb list while traversing it.  These full pcb list traversals are
done in functions like:  in_pcbnotifyall(), in_pcbpurgeif0(),
inp_apply_all(), tcp_ccalgounload(), tcp_drain(), etc.

 Thus in 11 ipi_lock write lock is required only when you do this full
traversal with list stability requirement, and the ipi_lock read lock in
all other cases like tcp_input() that then scales better.

 Sadly in 11, you cannot use the inpcb lock as is for the TCP timers
callout, because like tcp_timer_keep() most TCP timers callout can
modify the global pcb list and then you need the read lock ipi_lock in
top of inpcb lock...

 o Future/12:

 The next natural step is to remove the ipi_lock from the picture to get:

inpcb lock (before) ipi_list

 It /just/ requires a global pcb list that allows addition/deletion
while doing a full traversal concurrently.  A classical way to achieve
that, is to use a RCU-based list.  As soon as RCU (or anything
equivalent like lwref) are supported in kernel, we will implement this
change.

 Just history here, as I already did a presentation on this exact topic
at BSDCan 2015:
https://wiki.freebsd.org/201506DevSummit#line-83

 It was recorded but I never saw the footage/presentation actually
published. :)

--
Julien

signature.asc
Description: OpenPGP digital signature

Re: panic with tcp timers

2016-06-20 Thread Adrian Chadd

There's implications for RSS with how the callout system currently works.

If you don't know the RSS bucket ID of a connection in advance, you'll
create callouts on the wrong CPUs and then they're not migrated.

The initial work here did convert things over, but didn't place the
callouts in the right CPU/RSS bucket and there wasn't a mechanism to
fix this. :(

(But I'm also a firm believer of that too. I'd also finally just like
the callout system to not be tied to per-CPU queues, but also
per-RSS-bucket callout wheels so we could assign a CPU mask to a given
callout wheel and then migrating things around is just "change cpu
mask", not "change callout cpu id.")

-adrian

On 20 June 2016 at 04:00, Hans Petter Selasky  wrote:
> On 06/20/16 12:30, Gleb Smirnoff wrote:
>>
>> What does prevent us from converting TCP timeouts to locked? To my
>> understanding it is the lock order of taking pcbinfo after pcb lock.
>
>
> I started this work:
>
> https://reviews.freebsd.org/D1563
>
> --HPS
> ___
> freebsd-...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Hans Petter Selasky


On 06/20/16 12:30, Gleb Smirnoff wrote:

What does prevent us from converting TCP timeouts to locked? To my
understanding it is the lock order of taking pcbinfo after pcb lock.


I started this work:

https://reviews.freebsd.org/D1563

--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Hans Petter Selasky


On 06/20/16 12:30, Gleb Smirnoff wrote:

Exactly! I am convinced that all callouts should be locked, and non-locked
one should simply go away, as well as async drain.


I agree about that that, except you still need the async drain, because 
it will prevent freeing the lock protecting the callout, which may still 
be in use after callout_stop().


--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Konstantin Belousov

On Mon, Jun 20, 2016 at 11:55:55AM +0200, Julien Charbon wrote:
> 
>  Hi,
> 
> On 6/20/16 9:39 AM, Gleb Smirnoff wrote:
> > On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
> > J> > Comparing stable/10 and head, I see two changes that could
> > J> > affect that:
> > J> > 
> > J> > - callout_async_drain
> > J> > - switch to READ lock for inp info in tcp timers
> > J> > 
> > J> > That's why you are in To, Julien and Hans :)
> > J> > 
> > J> > We continue investigating, and I will keep you updated.
> > J> > However, any help is welcome. I can share cores.
> > 
> > Now, spending some time with cores and adding a bunch of
> > extra CTRs, I have a sequence of events that lead to the
> > panic. In short, the bug is in the callout system. It seems
> > to be not relevant to the callout_async_drain, at least for
> > now. The transition to READ lock unmasked the problem, that's
> > why NetflixBSD 10 doesn't panic.
> > 
> > The panic requires heavy contention on the TCP info lock.
> > 
> > [CPU 1] the callout fires, tcp_timer_keep entered
> > [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
> > [CPU 2] schedules the callout
> > [CPU 2] tcp_discardcb called
> > [CPU 2] callout successfully canceled
> > [CPU 2] tcpcb freed
> > [CPU 1] unblocks... panic
> > 
> > When the lock was WLOCK, all contenders were resumed in a
> > sequence they came to the lock. Now, that they are readers,
> > once the lock is released, readers are resumed in a "random"
> > order, and this allows tcp_discardcb to go before the old
> > running callout, and this unmasks the panic.
> 
>  Highly interesting.  I should be able to reproduce that (will be useful
> for testing the corresponding fix).
> 
>  Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
> (success) here) when the callout cancellation is a success _but_ the
> callout is current running, that should fix it.
> 
>  For the history:  It comes back to my old callout question:
> 
>  Does _callout_stop_safe() is allowed to return 1 (success) even if the
> callout is still currently running;  a.k.a. it is not because you
> successfully cancelled a callout that the callout is not currently running.
> 
>  We did propose a patch to make _callout_stop_safe() returns 0 (fail)
> when the callout is currently running:
> 
> callout_stop() should return 0 when the callout is currently being
> serviced and indeed unstoppable
> https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most
> 
>  But this change impacted too many old code paths and was interesting
> only for TCP timers and thus was abandoned.

Look at callout_stop CS_MIGRBLOCK flag and the fix in sleepq_check_timeout().
Or, at least, do not allow this use of callout_stop() to rot again, after
previous dozen regressions and fixes there.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Hans Petter Selasky


On 06/20/16 11:58, Gleb Smirnoff wrote:

J> callout_stop() should return 0 when the callout is currently being
J> serviced and indeed unstoppable
J> 
https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most




What are the old paths impacted?



Hi Gleb,

Digging through my e-mail archive rephrasing myself and comments about:

https://reviews.freebsd.org/D3078

There are two kinds of callouts. So-called MPSAFE callouts which doesn't 
have a mutex associated with it, and non-MPSAFE which do.


When you are associating a mutex with a callout, callout_stop() will 
_always_ cancel the callback even if it is pending, and this should be 
reflected by the return value. Your proposed changes will break that ???


The changes in D3078 didn't take into account "use_lock" at least, and 
so the return values for the non-MPSAFE case becomes incorrect.



Hi,

I think this patch is incomplete and can break the return value for non-MPSAFE callouts. 
I think the correct statement should check the value of "use_lock" too:

not_running = !(cc_exec_curr(cc, direct) == c && use_lock == 0);

Because if the callback process waits for lock "c_lock" in the callback process then we 
have above "cc_exec_curr(cc, direct) == c" is satisfied too, and non-MPSAFE callouts are 
always cancelable, via cc_exec_cancel(cc, direct) = true;


--HPS


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Gleb Smirnoff

On Mon, Jun 20, 2016 at 12:14:18PM +0200, Hans Petter Selasky wrote:
H> On 06/20/16 11:58, Gleb Smirnoff wrote:
H> > The fix I am working on now is doing exactly that. callout_reset must
H> > return 0 if the callout is currently running.
H> >
H> > What are the old paths impacted?
H> 
H> Hi,
H> 
H> I'll dig into the matter aswell and give some comments. Thanks for the 
H> analysis, Gleb.
H> 
H> FYI: This class of problems wouldn't exist if the callout could be 
H> associated with a mutex!

Exactly! I am convinced that all callouts should be locked, and non-locked
one should simply go away, as well as async drain.

What does prevent us from converting TCP timeouts to locked? To my
understanding it is the lock order of taking pcbinfo after pcb lock.

I'm now trying to understand Julien's conversion from pcbinfo lock
to pcbinfo + pcblist locks. It seems to me that we actually can convert
TCP timers to locked callouts.

What for do we need pcbinfo read lock in a tcp timer? The timer works
only on the pcb and doesn't modify global lists, does it?

-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Hans Petter Selasky


On 06/20/16 11:58, Gleb Smirnoff wrote:

The fix I am working on now is doing exactly that. callout_reset must
return 0 if the callout is currently running.

What are the old paths impacted?


Hi,

I'll dig into the matter aswell and give some comments. Thanks for the 
analysis, Gleb.


FYI: This class of problems wouldn't exist if the callout could be 
associated with a mutex!


--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Gleb Smirnoff

On Mon, Jun 20, 2016 at 11:55:55AM +0200, Julien Charbon wrote:
J> > On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
J> > J> > Comparing stable/10 and head, I see two changes that could
J> > J> > affect that:
J> > J> > 
J> > J> > - callout_async_drain
J> > J> > - switch to READ lock for inp info in tcp timers
J> > J> > 
J> > J> > That's why you are in To, Julien and Hans :)
J> > J> > 
J> > J> > We continue investigating, and I will keep you updated.
J> > J> > However, any help is welcome. I can share cores.
J> > 
J> > Now, spending some time with cores and adding a bunch of
J> > extra CTRs, I have a sequence of events that lead to the
J> > panic. In short, the bug is in the callout system. It seems
J> > to be not relevant to the callout_async_drain, at least for
J> > now. The transition to READ lock unmasked the problem, that's
J> > why NetflixBSD 10 doesn't panic.
J> > 
J> > The panic requires heavy contention on the TCP info lock.
J> > 
J> > [CPU 1] the callout fires, tcp_timer_keep entered
J> > [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
J> > [CPU 2] schedules the callout
J> > [CPU 2] tcp_discardcb called
J> > [CPU 2] callout successfully canceled
J> > [CPU 2] tcpcb freed
J> > [CPU 1] unblocks... panic
J> > 
J> > When the lock was WLOCK, all contenders were resumed in a
J> > sequence they came to the lock. Now, that they are readers,
J> > once the lock is released, readers are resumed in a "random"
J> > order, and this allows tcp_discardcb to go before the old
J> > running callout, and this unmasks the panic.
J> 
J>  Highly interesting.  I should be able to reproduce that (will be useful
J> for testing the corresponding fix).
J> 
J>  Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
J> (success) here) when the callout cancellation is a success _but_ the
J> callout is current running, that should fix it.
J> 
J>  For the history:  It comes back to my old callout question:
J> 
J>  Does _callout_stop_safe() is allowed to return 1 (success) even if the
J> callout is still currently running;  a.k.a. it is not because you
J> successfully cancelled a callout that the callout is not currently running.
J> 
J>  We did propose a patch to make _callout_stop_safe() returns 0 (fail)
J> when the callout is currently running:
J> 
J> callout_stop() should return 0 when the callout is currently being
J> serviced and indeed unstoppable
J> 
https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most
J> 
J>  But this change impacted too many old code paths and was interesting
J> only for TCP timers and thus was abandoned.

The fix I am working on now is doing exactly that. callout_reset must
return 0 if the callout is currently running.

What are the old paths impacted?

-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-20 Thread Julien Charbon


 Hi,

On 6/20/16 9:39 AM, Gleb Smirnoff wrote:
> On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
> J> > Comparing stable/10 and head, I see two changes that could
> J> > affect that:
> J> > 
> J> > - callout_async_drain
> J> > - switch to READ lock for inp info in tcp timers
> J> > 
> J> > That's why you are in To, Julien and Hans :)
> J> > 
> J> > We continue investigating, and I will keep you updated.
> J> > However, any help is welcome. I can share cores.
> 
> Now, spending some time with cores and adding a bunch of
> extra CTRs, I have a sequence of events that lead to the
> panic. In short, the bug is in the callout system. It seems
> to be not relevant to the callout_async_drain, at least for
> now. The transition to READ lock unmasked the problem, that's
> why NetflixBSD 10 doesn't panic.
> 
> The panic requires heavy contention on the TCP info lock.
> 
> [CPU 1] the callout fires, tcp_timer_keep entered
> [CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
> [CPU 2] schedules the callout
> [CPU 2] tcp_discardcb called
> [CPU 2] callout successfully canceled
> [CPU 2] tcpcb freed
> [CPU 1] unblocks... panic
> 
> When the lock was WLOCK, all contenders were resumed in a
> sequence they came to the lock. Now, that they are readers,
> once the lock is released, readers are resumed in a "random"
> order, and this allows tcp_discardcb to go before the old
> running callout, and this unmasks the panic.

 Highly interesting.  I should be able to reproduce that (will be useful
for testing the corresponding fix).

 Fix proposal:  If callout_async_drain() returns 0 (fail) (instead of 1
(success) here) when the callout cancellation is a success _but_ the
callout is current running, that should fix it.

 For the history:  It comes back to my old callout question:

 Does _callout_stop_safe() is allowed to return 1 (success) even if the
callout is still currently running;  a.k.a. it is not because you
successfully cancelled a callout that the callout is not currently running.

 We did propose a patch to make _callout_stop_safe() returns 0 (fail)
when the callout is currently running:

callout_stop() should return 0 when the callout is currently being
serviced and indeed unstoppable
https://reviews.freebsd.org/differential/changeset/?ref=62513=ignore-most

 But this change impacted too many old code paths and was interesting
only for TCP timers and thus was abandoned.

 My 2 cents.

--
Julien



signature.asc
Description: OpenPGP digital signature

Re: panic with tcp timers

2016-06-20 Thread Gleb Smirnoff

  Hi!

On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
J> > Comparing stable/10 and head, I see two changes that could
J> > affect that:
J> > 
J> > - callout_async_drain
J> > - switch to READ lock for inp info in tcp timers
J> > 
J> > That's why you are in To, Julien and Hans :)
J> > 
J> > We continue investigating, and I will keep you updated.
J> > However, any help is welcome. I can share cores.

Now, spending some time with cores and adding a bunch of
extra CTRs, I have a sequence of events that lead to the
panic. In short, the bug is in the callout system. It seems
to be not relevant to the callout_async_drain, at least for
now. The transition to READ lock unmasked the problem, that's
why NetflixBSD 10 doesn't panic.

The panic requires heavy contention on the TCP info lock.

[CPU 1] the callout fires, tcp_timer_keep entered
[CPU 1] blocks on INP_INFO_RLOCK(_tcbinfo);
[CPU 2] schedules the callout
[CPU 2] tcp_discardcb called
[CPU 2] callout successfully canceled
[CPU 2] tcpcb freed
[CPU 1] unblocks... panic

When the lock was WLOCK, all contenders were resumed in a
sequence they came to the lock. Now, that they are readers,
once the lock is released, readers are resumed in a "random"
order, and this allows tcp_discardcb to go before the old
running callout, and this unmasks the panic.

Exact quote from ktrdump:

0xf82089867418 <-- faulty tcpcb
0xf82089867728 <-- its tt_keep

glebius@piston:~/cores:|>grep 0xf82089867418 ktr
19012192  11 scheduled 0xf820898677a8 func 0x80628740 arg 
0xf82089867418 in 20553.1e3ff819
19012190  11 rescheduled 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418 in 20613.04a6193d
19009551  11 rescheduled 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418 in 20613.042306cf
19009549  11 scheduled 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418 in 20563.0423409f
19009545  11 tcp_newtcpcb: 0xf82089867418
18962842  11 tcp_discardcb: free 0xf82089867418
18962830  11 tcp_discardcb: 0xf82089867418 draincnt 0
18962826  11 failed to stop 0xf820898677a8 func 0x80628740 arg 
0xf82089867418
18962822  11 cancelled3 0xf82089867768 func 0x806288e0 arg 
0xf82089867418
18962808  11 cancelled3 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418
18962804  11 failed to stop 0xf820898676e8 func 0x80628fa0 arg 
0xf82089867418
18962792  11 failed to stop 0xf820898676a8 func 0x806291e0 arg 
0xf82089867418
18962755  11 tcp_discardcb: 0xf82089867418
18962742  11 scheduled 0xf82089867768 func 0x806288e0 arg 
0xf82089867418 in 20632.ffc8d308
18962703  11 cancelled3 0xf820898676a8 func 0x806291e0 arg 
0xf82089867418
18962695  11 scheduled 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418 in 20612.ffc8ea28
18923275   6 tcp_timer_keep: 0xf82089867418
18923274   6 callout 0xf82089867728 func 0x80628bb0 arg 
0xf82089867418
17850361   9 scheduled 0xf820898676a8 func 0x806291e0 arg 
0xf82089867418 in 20553.5aec0004


-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-17 Thread Julien Charbon


 Hi Gleb,

On 6/17/16 6:53 AM, Gleb Smirnoff wrote:
>   At Netflix we are observing a race in TCP timers with head.
> The problem is a regression, that doesn't happen on stable/10.
> The panic usually happens after several hours at 55 Gbit/s of
> traffic.
> 
> What happens is that tcp_timer_keep finds t_tcpcb being
> NULL. Some coredumps have tcpcb already initialized,
> with non-NULL t_tcpcb and in TCPS_ESTABLISHED state. Which
> means that other CPU was working on the tcpcb while
> the faulted one was working on the panic. So, this all looks
> like a use after free, which conflicts with new allocation.
> 
> Comparing stable/10 and head, I see two changes that could
> affect that:
> 
> - callout_async_drain
> - switch to READ lock for inp info in tcp timers
> 
> That's why you are in To, Julien and Hans :)
> 
> We continue investigating, and I will keep you updated.
> However, any help is welcome. I can share cores.

 Thanks for sharing.  Let me run our TCP tests on a recent version of
HEAD to see if by chance I can reproduce it.  If I am not able to
reproduce it I will ask for debug kernel and cores and see if I can help.

 Few notes here:

 -  Around 2 months ago I did test HEAD with callout_async_drain() in
TCP timers with our TCP QA testsuite but no kernel panic.  That said I
did not let our test run during several hours.

 - At Verisign we run 10 with READ lock for inp info in tcp timers
change.  Again, it does not mean this change has no impact here.

 My 2 cents.

--
Julien



signature.asc
Description: OpenPGP digital signature

Re: panic with tcp timers

2016-06-17 Thread Bjoern A. Zeeb


On 17 Jun 2016, at 4:53, Gleb Smirnoff wrote:


  Hi!

  At Netflix we are observing a race in TCP timers with head.
The problem is a regression, that doesn't happen on stable/10.
The panic usually happens after several hours at 55 Gbit/s of
traffic.

What happens is that tcp_timer_keep finds t_tcpcb being
NULL. Some coredumps have tcpcb already initialized,
with non-NULL t_tcpcb and in TCPS_ESTABLISHED state. Which
means that other CPU was working on the tcpcb while
the faulted one was working on the panic. So, this all looks
like a use after free, which conflicts with new allocation.

Comparing stable/10 and head, I see two changes that could
affect that:

- callout_async_drain
- switch to READ lock for inp info in tcp timers

That's why you are in To, Julien and Hans :)

We continue investigating, and I will keep you updated.
However, any help is welcome. I can share cores.


There’s also the change to no longer mark the zones NO_FREE.
In theory I was convinced at the time that it should not be an issue 
anymore.


If I had overlooked something or follow-up timer changes invalidated 
assumptions then that could also be trouble.


That said, I was not able to get any related panics or log entries 
anymore lately (but I am currently slightly behind head with my branch).


We should get the problem fixed however and not try to “paint over” 
again.


/bz
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

2016-06-17 Thread Hans Petter Selasky


On 06/17/16 06:53, Gleb Smirnoff wrote:

  Hi!

  At Netflix we are observing a race in TCP timers with head.
The problem is a regression, that doesn't happen on stable/10.
The panic usually happens after several hours at 55 Gbit/s of
traffic.

What happens is that tcp_timer_keep finds t_tcpcb being
NULL. Some coredumps have tcpcb already initialized,
with non-NULL t_tcpcb and in TCPS_ESTABLISHED state. Which
means that other CPU was working on the tcpcb while
the faulted one was working on the panic. So, this all looks
like a use after free, which conflicts with new allocation.

Comparing stable/10 and head, I see two changes that could
affect that:

- callout_async_drain
- switch to READ lock for inp info in tcp timers

That's why you are in To, Julien and Hans :)

We continue investigating, and I will keep you updated.
However, any help is welcome. I can share cores.



Hi,

I do have projects/hps_head around, which is not that much behind 
11-current, which has a completely different callout implementation. If 
you can reproduce the issue separately might we worth a try to rule out 
the callout stack.


--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

panic with tcp timers

2016-06-16 Thread Gleb Smirnoff

  Hi!

  At Netflix we are observing a race in TCP timers with head.
The problem is a regression, that doesn't happen on stable/10.
The panic usually happens after several hours at 55 Gbit/s of
traffic.

What happens is that tcp_timer_keep finds t_tcpcb being
NULL. Some coredumps have tcpcb already initialized,
with non-NULL t_tcpcb and in TCPS_ESTABLISHED state. Which
means that other CPU was working on the tcpcb while
the faulted one was working on the panic. So, this all looks
like a use after free, which conflicts with new allocation.

Comparing stable/10 and head, I see two changes that could
affect that:

- callout_async_drain
- switch to READ lock for inp info in tcp timers

That's why you are in To, Julien and Hans :)

We continue investigating, and I will keep you updated.
However, any help is welcome. I can share cores.

-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: EBR fix for life cycle races was Re: panic with tcp timers

EBR fix for life cycle races was Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

Re: panic with tcp timers

panic with tcp timers

27 matches

Site Navigation

Mail list logo

Footer information