On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote: > On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote: > > Marc Fournier wrote: > > > On 2013-02-13, at 3:54 PM, Rick Macklem <rmack...@uoguelph.ca> wrote: > > > > > > >> > > > > The pid that is in "T" state for the "ps auxlH". > > > > > > Different server, last kernel update on Jan 22nd, https process this > > > time instead of du last time. > > > > > > I've attached: > > > > > > ps auxlH > > > ps auxlH of just the processes that are in TJ state (6 httpd servers) > > > procstat output for each of the 6 process > > > > > > > > > > > > > > > They are included as attachments ??? if these don't make it through, let > > > me know, just figured I'd try and keep it compact ... > > Well, I've looked at this call path a little closer: > > 16693 104135 httpd - mi_switch+0x186 > thread_suspend_check+0x19f sleepq_catch_signals+0x1c5 > > sleepq_timedwait_sig+0x19 _sleep+0x2ca clnt_vc_call+0x763 > clnt_reconnect_call+0xfb newnfs_request+0xadb > > nfscl_request+0x72 nfsrpc_accessrpc+0x1df nfs34_access_otw+0x56 > nfs_access+0x306 vn_open_cred+0x5a8 > > kern_openat+0x20a amd64_syscall+0x540 Xfast_syscall+0xf7 > > > > I am probably way off, since I am not familiar with this stuff, but it > > seems to me that thread_suspend_check() should just return 0 for the > > case where stop_allowed == SIG_STOP_NOT_ALLOWED (TDF_SBDRY flag set) > > instead of sitting in the loop and doing a mi_switch(). I'm not even > > sure if it should call thread_suspend_check() for this case, but there > > are cases in thread_suspend_check() that I don't understand. > > > > Although I don't really understand thread_suspend_check(), I've attached > > a simple patch that might be a starting point for fixing this? > > > > I wouldn't recommend trying the patch until kib and/or jhb weigh in > > on whether it makes any sense. > > I think this is the right idea, but in HEAD with the sigdeferstop() changes > it > should just check for TDF_SBDRY instead of adding a new parameter. I think > checking for TDF_SBDRY will work even in 9 (and will make the patch smaller). > > Also, I think this is only needed for stop signals. Other suspend requests > will eventually resume the thread, it is only stop signals that can cause the > thread to get stuck indefinitely (since it depends on the user sending > SIGCONT). > > Marc, are you using SIGSTOP? > > Index: kern_thread.c > =================================================================== > --- kern_thread.c (revision 246122) > +++ kern_thread.c (working copy) > @@ -795,6 +795,17 @@ thread_suspend_check(int return_instead) > return (ERESTART); > > /* > + * Ignore suspend requests for stop signals if they > + * are deferred. > + */ > + if (P_SHOULDSTOP(p) == P_STOPPED_SIG && > + td->td_flags & TDF_SBDRY) { > + KASSERT(return_instead, > + ("TDF_SBDRY set for unsafe thread_suspend_check")); > + return (0); > + } > + > + /* > * If the process is waiting for us to exit, > * this thread should just suicide. > * Assumes that P_SINGLE_EXIT implies P_STOPPED_SINGLE.
This looks correct.
pgpwJnJsA6DUs.pgp
Description: PGP signature