Re: Kqueue races causing crashes

Matthew Macy Wed, 15 Jun 2016 12:35:40 -0700

 ---- On Wed, 15 Jun 2016 10:45:24 -0700 Konstantin Belousov 
<kostik...@gmail.com> wrote ---- 
 > On Wed, Jun 15, 2016 at 10:39:42AM -0700, Matthew Macy wrote: 
 > > You can use dwarf4 if you use GDB from ports 
 > How would it help ? 

To a native speaker, the following statement would imply that GDB is the 
problem: "There is not much gdb info here; I'll try to rebuild kgdb."

If %rip has in fact been smashed, that's a bit like saying "the light doesn't 
show anything on the table, I'll replace the light bulb" when in fact there 
isn't anything on the table.

 > The problem for kgdb is that %rip is zero, due to a function pointer being set 
 > to NULL in a destroyed knlist.  Neither version of kgdb would find 
 > either code or unwind annotations for a zero address. 
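
For anyone following along, here is a minimal userland sketch of the failure
described just above. It assumes only what the quoted text says: destroying
the list leaves its lock callback NULL, and a later call through that pointer
jumps to address 0, so the trap frame shows %rip = 0 and no debugger can map
it to code. The struct and function names (knlist_sketch, demo_lock,
knlist_sketch_try_lock) are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a knlist: only the lock callbacks matter here. */
struct knlist_sketch {
	void (*kl_lock)(void *);
	void (*kl_unlock)(void *);
	void *kl_lockarg;
};

static int lock_calls;	/* counts how often the lock callback really ran */

static void demo_lock(void *arg) { (void)arg; lock_calls++; }
static void demo_unlock(void *arg) { (void)arg; }

static void
knlist_sketch_init(struct knlist_sketch *kl)
{
	assert(kl != NULL);
	kl->kl_lock = demo_lock;
	kl->kl_unlock = demo_unlock;
	kl->kl_lockarg = NULL;
}

/* Assumption: destruction clears the callbacks, as the quoted text says. */
static void
knlist_sketch_destroy(struct knlist_sketch *kl)
{
	kl->kl_lock = NULL;
	kl->kl_unlock = NULL;
	kl->kl_lockarg = NULL;
}

/*
 * Returns 0 if the callback was invoked, -1 if the list was already
 * destroyed.  An unguarded kl->kl_lock(kl->kl_lockarg) on a destroyed
 * list would be a call to address 0, which is the crash under discussion.
 */
static int
knlist_sketch_try_lock(struct knlist_sketch *kl)
{
	if (kl->kl_lock == NULL)
		return (-1);
	kl->kl_lock(kl->kl_lockarg);
	return (0);
}
```

The NULL check is only there so the sketch can run without faulting; it is
not a proposed fix, since another thread could still destroy the list between
the check and the call.
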
 >  
 > But the issue is understood and 

Yes. Since the initial e-mail.


 > we are working on a version of the fix. 

I'm glad you're on it.

-M



 > ---- On Wed, 15 Jun 2016 04:50:00 -0700 Peter Holm <pe...@holm.cc> wrote ----
 > On Wed, Jun 15, 2016 at 11:11:43AM +0300, Konstantin Belousov wrote:
 > > On Tue, Jun 14, 2016 at 10:26:14PM -0500, Eric Badger wrote:
 > > > I believe they all have more or less the same cause. The crashes occur
 > > > because we acquire a knlist lock via the KN_LIST_LOCK macro, but when we
 > > > call KN_LIST_UNLOCK, the knote's knlist reference (kn->kn_knlist) has
 > > > been cleared by another thread. Thus we are unable to unlock the
 > > > previously acquired lock and hold it until something causes us to crash
 > > > (such as the witness code noticing that we're returning to userland with
 > > > the lock still held).
 > > ...
 > > > I believe there's also a small window where the KN_LIST_LOCK macro
 > > > checks kn->kn_knlist and finds it to be non-NULL, but by the time it
 > > > actually dereferences it, it has become NULL. This would produce the
 > > > "page fault while in kernel mode" crash.
 > > >
 > > > If someone familiar with this code sees an obvious fix, I'll be happy to
 > > > test it. Otherwise, I'd appreciate any advice on fixing this. My first
 > > > thought is that a "struct knote" ought to have its own mutex for
 > > > controlling access to the flag fields and ideally the "kn_knlist" field.
 > > > I.e., you would first acquire a knote's lock and then the knlist lock,
 > > > thus ensuring that no one could clear the kn_knlist variable while you
 > > > hold the knlist lock. The knlist lock, however, usually comes from
 > > > whichever event producing entity the knote tracks, so getting lock
 > > > ordering right between the per-knote mutex and this other lock seems
 > > > potentially hard. (Sometimes we call into functions in kern_event.c with
 > > > the knlist lock already held, having been acquired in code outside of
 > > > kern_event.c. Consider, for example, calling KNOTE_LOCKED from
 > > > kern_exit.c; the PROC_LOCK macro has already been used to acquire the
 > > > process lock, also serving
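
The per-knote mutex idea in the quoted message can be sketched in userland
with pthreads. This is only an illustration under stated assumptions: the
types and functions below (knote_sketch, knlist_sketch, kn_list_lock,
kn_list_unlock, knote_sketch_detach) are hypothetical and are not the
kern_event.c code or the fix being worked on. The sketch takes the knote
mutex first, reads kn_knlist exactly once while holding it, and hands the
locked list back to the caller so the unlock path never re-reads a pointer
that may have been cleared.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list object; its mutex plays the role of the knlist lock. */
struct knlist_sketch {
	pthread_mutex_t kl_mtx;
};

/* Hypothetical knote with its own mutex guarding kn_knlist. */
struct knote_sketch {
	pthread_mutex_t kn_mtx;
	struct knlist_sketch *kn_knlist;	/* protected by kn_mtx */
};

static void
knlist_sketch_init(struct knlist_sketch *kl)
{
	pthread_mutex_init(&kl->kl_mtx, NULL);
}

static void
knote_sketch_init(struct knote_sketch *kn, struct knlist_sketch *kl)
{
	pthread_mutex_init(&kn->kn_mtx, NULL);
	kn->kn_knlist = kl;
}

/*
 * Lock order: knote mutex, then list mutex.  Returns the list that was
 * locked (both mutexes held), or NULL if the knote is detached (nothing
 * held).
 */
static struct knlist_sketch *
kn_list_lock(struct knote_sketch *kn)
{
	struct knlist_sketch *kl;

	pthread_mutex_lock(&kn->kn_mtx);
	kl = kn->kn_knlist;		/* single read, under kn_mtx */
	if (kl == NULL) {
		pthread_mutex_unlock(&kn->kn_mtx);
		return (NULL);
	}
	pthread_mutex_lock(&kl->kl_mtx);
	return (kl);
}

/* Unlock using the pointer returned by kn_list_lock(), not kn->kn_knlist. */
static void
kn_list_unlock(struct knote_sketch *kn, struct knlist_sketch *kl)
{
	assert(kl != NULL);
	pthread_mutex_unlock(&kl->kl_mtx);
	pthread_mutex_unlock(&kn->kn_mtx);
}

/* Clearing the reference must also take the knote mutex. */
static void
knote_sketch_detach(struct knote_sketch *kn)
{
	pthread_mutex_lock(&kn->kn_mtx);
	kn->kn_knlist = NULL;
	pthread_mutex_unlock(&kn->kn_mtx);
}
```

Note that this does nothing for the ordering problem raised in the quoted
text: when the caller already holds the list lock (the KNOTE_LOCKED from
kern_exit.c case), taking the knote mutex afterwards would invert the order
used here.
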

_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"