On Tue, Nov 13, 2018 at 10:04:33AM +0000, Daniel P. Berrangé wrote: > On Fri, Nov 09, 2018 at 12:19:30PM +0000, Richard W.M. Jones wrote: > > Peter Dimitrov and myself were debugging a very peculiar bug when > > libguestfs is run as a plugin from collectd: > > > > https://www.redhat.com/archives/libguestfs/2018-November/thread.html#00023 > > > > The long story short is that collectd leaks SIGCHLD == SIG_IGN setting > > into plugins: > > > > https://www.redhat.com/archives/libguestfs/2018-November/msg00095.html > > > > This means that any plugin that does the usual pattern of: > > > > pid = fork (); > > ... > > if (waitpid (pid, NULL, 0) == -1) { > > perror ("waitpid"); > > exit (EXIT_FAILURE); > > } > > > > will fail, because the forked subprocess is automatically reaped > > before waitpid is called, resulting in the wait failing with errno == > > ECHILD. > > > > It is possible to work around this by adding: > > > > signal (SIGCHLD, SIG_DFL); > > > > to the plugin. However I believe this is a bug in collectd, and it > > should sanitize signals (and maybe other things) before running > > plugins. > > Yes, I'm inclined to agree. If an app or library is spawning external > processes it should take care to restore signal setup to a "normal" state. > This means not only setting SIG_DFL for all signals, but also just as > importantly, the signal mask. It should be set to all ones before calling > fork, and set back to all zeroes immedaitely before execve, as illustrated > in libvirt to avoid races: > > https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircommand.c;h=de937f6f9aa91abb518eac98bfac9dcf37e1f5df;hb=HEAD#l304 > > https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircommand.c;h=de937f6f9aa91abb518eac98bfac9dcf37e1f5df;hb=HEAD#l360
This is absolutely right. What's interesting and new here is that collectd is actually using dlopen to call external plugins (so not forking itself). See collectd.git/src/daemon/plugin.c However it's still not sanitizing the environment sufficiently around these calls, so it's the same sort of problem, but in a different context that I'd not seen/understood before. nbdkit has the same problem :-( We already dealt with umask a few years ago, now I've got to check our signal handlers too ... Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org _______________________________________________ collectd mailing list [email protected] https://mailman.verplant.org/listinfo/collectd
