On Fri, 15 Apr 2005, Jeff Moyer wrote:

==> Regarding Re: [patch, rfc] fix hung automounter issues; Ian Kent <[EMAIL 
PROTECTED]> adds:

raven> On Thu, 14 Apr 2005, Jeff Moyer wrote: I'm impressed, this looks
raven> like a fine piece of work guys.  It's going to take a while to work
raven> through this.

raven> Also, this is bound to make merging Denis Vlasenko's logging patches
raven> really hard. So there's potentially a fair bit of work in this.

Well, you can go ahead and apply his patches, and I'll merge this stuff in
afterwards.  Just let me know when you have that all set in CVS.

Hi, Ian,

Dan Berrange did a spectacular job of diagnosing some autofs hangs he
was experiencing.  The automount daemon would appear hung, and an
strace of the process would show sys_futex on the call stack.  Through
his debugging, he figured out that the problem was autofs issuing syslog
calls while in a signal handler.

raven> One, probably stupid question (humour me).

raven> Shouldn't we be able use syslog in a signal handler?

Nope:

http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04

I have to plead guilty to the screw-up here (at least I'm not the only guilty party) and point out that there are also a few ioctl calls in the signal handling execution paths.


I should be able to move the ioctl in ST_SHUTDOWN_PENDING to the st_shutdown_pending function, but moving them in send_ready and send_fail will be somewhat harder.

What else have we missed?


So, there are a few ways to fix this that come to mind.

o We could not log in signal handlers.  I don't consider this an
  acceptable solution, as we really need the debug messages generated
  there.
o We could defer logging to non-signal handler context.  This is
  in fact what the attached patch does.  It queues the syslog messages,
  and flushes them upon the next syslog that occurs outside of signal
  handler context.
o We could open /dev/log directly.  This is likely a
  bad idea as there is no standard for the interface there.
o We could have a separate logging process, which we write to via a
  pipe.  I'm not keen on this as it adds yet another process, and makes
  shutdown that much more complicated.


raven> I actually like the last option but a multi threaded autofs is not
raven> likely to happen for a long time.

It does not imply threading.  You can simply fork().
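
As a rough sketch of what a forked logger could look like: the parent
keeps the write end of a pipe and only ever calls write(2) on it, which
is async-signal-safe, while the child does the actual syslog calls.
The function names logger_start, logger_write and logger_stop are
hypothetical, and message framing, a full pipe, and errno
save/restore in the handler are all ignored here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

static int log_fd = -1;     /* write end of the pipe, kept by the daemon */
static pid_t log_pid = -1;  /* pid of the logging child */

/* Fork the logging child.  Returns 0 in the parent on success, -1 on
 * error.  The child never returns from this function. */
int logger_start(void)
{
    int fds[2];
    char buf[256];
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;
    log_pid = fork();
    if (log_pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (log_pid == 0) {
        /* Child: read until the parent closes its end, then exit.
         * One read() may contain several messages; a real logger
         * would need some framing. */
        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof(buf) - 1)) > 0) {
            buf[n] = '\0';
            syslog(LOG_DEBUG, "%s", buf);
        }
        _exit(0);
    }
    close(fds[0]);
    log_fd = fds[1];
    return 0;
}

/* Usable from a signal handler: the only call made is write(2). */
void logger_write(const char *msg, size_t len)
{
    ssize_t r;

    if (log_fd < 0)
        return;
    r = write(log_fd, msg, len);
    (void)r;  /* nothing sensible to do on failure inside a handler */
}

/* Shutdown: close the pipe so the child sees EOF, then reap it.
 * Returns the child's exit status, or -1 on error. */
int logger_stop(void)
{
    int status;

    close(log_fd);
    log_fd = -1;
    if (waitpid(log_pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The extra close-and-wait step in logger_stop() is the shutdown
complication mentioned in the list of options.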

raven> Would another option be to make autofs signal safe by doing all the
raven> work in subprocesses?

There are a number of ways to address this by changing the control
structure of the daemon.  However, that seemed way too invasive at this
stage of the game.  The ideal solution is to do less in the sig_child
handler, deferring the work just like we do for all of the other caught
signals.

Note that in all of the above cases, we still need to implement a
signal-safe vsprintf.  That is what the bulk of this patch is.
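
To give a feel for what a signal-safe formatter involves, here is a
deliberately tiny sketch.  It is not the implementation in the patch:
safe_vsnprintf and safe_snprintf are hypothetical names, and only %s,
%d and %% are handled.  The point is that it touches nothing but its
arguments and the stack: no locale, no malloc, no stdio.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store c if there is room (leaving space for the NUL); always count it. */
static void put(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[*pos] = c;
    (*pos)++;
}

/* Minimal formatter: supports %s, %d and %% only.  Like vsnprintf, it
 * returns the length the full output would have had and always
 * NUL-terminates when size > 0. */
int safe_vsnprintf(char *buf, size_t size, const char *fmt, va_list ap)
{
    size_t pos = 0;

    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            put(buf, size, &pos, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0') {
            break;                      /* stray '%' at end of format */
        } else if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            if (!s)
                s = "(null)";
            while (*s)
                put(buf, size, &pos, *s++);
        } else if (*fmt == 'd') {
            int v = va_arg(ap, int);
            /* negate in unsigned arithmetic so INT_MIN is handled */
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[12];
            int n = 0;

            if (v < 0)
                put(buf, size, &pos, '-');
            do {
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u);
            while (n)
                put(buf, size, &pos, digits[--n]);
        } else if (*fmt == '%') {
            put(buf, size, &pos, '%');
        } else {
            /* unsupported conversion: emit it verbatim */
            put(buf, size, &pos, '%');
            put(buf, size, &pos, *fmt);
        }
    }
    if (size)
        buf[pos < size ? pos : size - 1] = '\0';
    return (int)pos;
}

int safe_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = safe_vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

A handler could format into a stack buffer with safe_snprintf() and
then hand the result to whichever logging path is chosen.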

raven> What about a signal safe syslog? Is this something we push to the
raven> libc guys, both a signal safe vsprintf and syslog?

Well, we can certainly try, but this won't gain you anything in the short
term.

I'll think more about having a separate logging process.  If it seems easy
enough to implement, I'll post a first whack at it.  In the meantime, I
think we should continue to discuss our alternatives.

I must admit, the more I delve into this, the more the deferred logging idea grows on me.


Let me look at the patches a while longer.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs