You can use dwarf4 if you use GDB from portsĀ ---- On Wed, 15 Jun 
2016 04:50:00 -0700  Peter Holm<> wrote ----On Wed, Jun 15, 2016 
at 11:11:43AM +0300, Konstantin Belousov wrote: > On Tue, Jun 14, 2016 at 
10:26:14PM -0500, Eric Badger wrote: > > I believe they all have more or less 
the same cause. The crashes occur  > > because we acquire a knlist lock via the 
KN_LIST_LOCK macro, but when we  > > call KN_LIST_UNLOCK, the knote???s knlist 
reference (kn->kn_knlist) has  > > been cleared by another thread. Thus we are 
unable to unlock the  > > previously acquired lock and hold it until something 
causes us to crash  > > (such as the witness code noticing that we???re 
returning to userland with  > > the lock still held). > ... > > I believe 
there???s also a small window where the KN_LIST_LOCK macro  > > checks 
kn->kn_knlist and finds it to be non-NULL, but by the time it  > > actually 
dereferences it, it has become NULL. This would produce the  > > ???page fault 
while in kernel mode??? crash. > >  > > If someone familiar with this code sees 
an obvious fix, I???ll be happy to  > > test it. Otherwise, I???d appreciate 
any advice on fixing this. My first  > > thought is that a ???struct knote??? 
ought to have its own mutex for  > > controlling access to the flag fields and 
ideally the ???kn_knlist??? field.  > > I.e., you would first acquire a 
knote???s lock and then the knlist lock,  > > thus ensuring that no one could 
clear the kn_knlist variable while you  > > hold the knlist lock. The knlist 
lock, however, usually comes from  > > whichever event producing entity the 
knote tracks, so getting lock  > > ordering right between the per-knote mutex 
and this other lock seems  > > potentially hard. (Sometimes we call into 
functions in kern_event.c with  > > the knlist lock already held, having been 
acquired in code outside of  > > kern_event.c. Consider, for example, calling 
KNOTE_LOCKED from  > > kern_exit.c; the PROC_LOCK macro has already been used 
to acquire the  > > process lock, also serving as the knlist lock). > This 
sounds as a good and correct analysis. I tried your test program > for around a 
hour on 8-threads machine, but was not able to trigger the > issue. Might be 
Peter have better luck reproducing them. Still, I think > that the problem is 
there. >  > IMO we should simply avoid clearing kn_knlist in knlist_remove().  
The > member is only used to get the locking function pointers, otherwise > 
code relies on KN_DETACHED flag to detect on-knlist condition.  See > the patch 
below. >  > >  > > Apropos of the knlist lock and its provenance: why is a lock 
from the  > > event producing entity used to control access to the knlist and 
knote?  > > Is it generally desirable to, for example, hold the process lock 
while  > > operating on a knlist attached to that process? It???s not obvious 
to me  > > that this is required or even desirable. This might suggest that a  
> > knlist should have its own lock rather than using a lock from the event  > 
> producing entity, which might make addressing this problem more  > > 
straightforward. >  > Consider the purpose of knlist. It serves as a container 
for all knotes > registered on the given subsystem object, like all knotes of 
the socket, > process etc which must be fired on event. See the knote() code. 
The > consequence is that the subsystem which fires knote() typically already > 
holds a lock protecting its own state. As result, it is natural to > protect 
the list of the knotes to activate on subsystem event, by the > subsystem lock. 
>  > diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c > index 
0614903..3f45dca 100644 > --- a/sys/kern/kern_event.c  There is not much gdb 
info here; I'll try to rebuild kgdb.  The number of CPUs 
seems important to this test. Four works for me.  - Peter 
mailing list To 
unsubscribe, send any mail to "" 


_______________________________________________ mailing list
To unsubscribe, send any mail to ""

Reply via email to