qpc is just the caller of the last successfully *acquired* qlock.
what we know is that the exportfs proc spins in the q->use taslock
called by qlock() right? this already seems weird... q->use is held
just long enough to test q->locked and manipulate the queue. also
sched() will avoid switching to another proc while we are holding tas
locks.
i would like to know which qlock the kernel is trying to acquire
on behalf of exportfs that is also reachable from the etherread4
code.
one could move:
	up->qpc = getcallerpc(&q);
from qlock() before the lock(&q->use); so we can see from where that
qlock gets called that hangs the exportfs call, or add another magic
debug pointer (qpctry) to the proc structure and print it in dumpaproc().
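
a rough user-space sketch of the qpctry idea, not kernel code: the
QLock and Proc structs are cut down to the fields that matter here,
qlockmodel() and qpctrydemo() are made-up names, the caller pc is
passed in by hand instead of coming from getcallerpc(), and the
q->use taslock is just an int, so nothing actually spins. it only
shows that qpctry gets updated even when the lock is not acquired,
while qpc stays at the last successful caller.

```c
#include <assert.h>
#include <stdint.h>

typedef struct QLock QLock;
struct QLock {
	int use;	/* stand-in for the q->use taslock */
	int locked;
};

typedef struct Proc Proc;
struct Proc {
	uintptr_t qpc;		/* caller of the last acquired qlock */
	uintptr_t qpctry;	/* hypothetical: caller of the last attempted qlock */
};

static Proc proc0;
static Proc *up = &proc0;

/* returns 1 if acquired, 0 where the real qlock() would queue and sleep */
static int
qlockmodel(QLock *q, uintptr_t pc)
{
	up->qpctry = pc;	/* recorded before touching q->use */
	q->use = 1;		/* models lock(&q->use) */
	if(q->locked){
		q->use = 0;
		return 0;
	}
	q->locked = 1;
	up->qpc = pc;		/* recorded only on success */
	q->use = 0;		/* models unlock(&q->use) */
	return 1;
}

/* the pc values below are arbitrary numbers chosen for the demo */
static void
qpctrydemo(void)
{
	QLock q = {0, 0};

	assert(qlockmodel(&q, 0x1000) == 1);
	assert(up->qpc == 0x1000 && up->qpctry == 0x1000);
	assert(qlockmodel(&q, 0x2000) == 0);
	assert(up->qpc == 0x1000 && up->qpctry == 0x2000);
}
```

in the kernel the interesting case is a proc stuck before q->locked is
even tested; there qpc would still point at some older, unrelated
qlock() call, while a qpctry set this early would name the call that
hangs.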
--
cinap
--- Begin Message ---
> > acid: src(0xf0148c8a)
> > /sys/src/9/ip/tcp.c:2096
> > 2091 if(waserror()){
> > 2092 qunlock(s);
> > 2093 nexterror();
> > 2094 }
> > 2095 qlock(s);
> >>2096 qunlock(tcp);
> > 2097
> > 2098 /* fix up window */
> > 2099 seg.wnd <<= tcb->rcv.scale;
> > 2100
> > 2101 /* every input packet in puts off the keep alive time out */
>
> The source actually says (to be pedantic):
>
> /* The rest of the input state machine is run with the control block
> * locked and implements the state machine directly out of the RFC.
> * Out-of-band data is ignored - it was always a bad idea.
> */
> tcb = (Tcpctl*)s->ptcl;
> if(waserror()){
> qunlock(s);
> nexterror();
> }
> qlock(s);
> qunlock(tcp);
>
> Now, the qunlock(s) should not precede the qlock(s), this is the first
> case in this procedure:
it doesn't. waserror() can't be executed before the code
following it. perhaps it could be more carefully written
as
> > 2095 qlock(s);
> > 2091 if(waserror()){
> > 2092 qunlock(s);
> > 2093 nexterror();
> > 2094 }
> >>2096 qunlock(tcp);
but it really wouldn't make any difference.
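
to illustrate the ordering, here is a small stand-alone model, assuming
waserror() behaves much like setjmp() and error() much like longjmp().
inputmodel(), qlocks() and qunlocks() are hypothetical stand-ins, not
the tcp.c code. the assert in qunlocks() would fire if the handler's
unlock ever ran before the lock was taken.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf errlab;
static int slocked;	/* 1 while the model's qlock(s) is held */
static int handlerran;	/* times the error handler body executed */

static void
qlocks(void)
{
	slocked = 1;
}

static void
qunlocks(void)
{
	assert(slocked);	/* an unlock must never precede the lock */
	slocked = 0;
}

/* fail != 0 simulates error() being raised after qlock(s) */
static int
inputmodel(int fail)
{
	if(setjmp(errlab)){	/* stands in for waserror() */
		handlerran++;
		qunlocks();	/* reached only through the longjmp below */
		return -1;	/* stands in for nexterror() */
	}
	qlocks();
	if(fail)
		longjmp(errlab, 1);	/* stands in for error() */
	qunlocks();	/* normal path: poperror() and qunlock(s) */
	return 0;
}
```

calling inputmodel(0) never enters the handler; inputmodel(1) enters it
exactly once, and only after qlocks() has run.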
i'm not completely convinced that tcp's to blame.
and if it is, i think the problem is probably tcp
timers.
- erik
--- End Message ---