> For a while we've been bothered by a problem on our SLES7 systems where
> daemons don't start at boot time. We thought it was related to the products
> themselves, but recently I noticed that it almost always seemed to be the
> LAST daemon in the list that didn't start. I've looked into it, and I think
> I've identified a race condition in /etc/init.d/rc that accounts for this.
>
> When the script runs, it logs its messages to /var/log/boot.log via a daemon
> called blogd. After the last script runs, rc sends QUIT to blogd to shut it
> down. On a very fast machine (such as a z900), it's possible that the last
> daemon hasn't finished logging its messages when this happens. The QUIT
> signal seems to propagate back to the starting daemon as a SIGHUP, causing
> it to fail.
>
> We had noticed that the problem seemed to occur most frequently on our
> production systems, and not often on test. This is consistent with a race:
> our test system is somewhat slower, so it would be less likely to
> experience the problem.
>
> The attached patch inserts a 2-second sleep before sending the kill signal.
> I've tested it multiple times, and it has consistently prevented the last
> daemon from failing.
>
> --- rc~ Thu Mar 20 09:04:24 2003
> +++ rc Thu Mar 20 09:04:00 2003
> @@ -187,8 +187,9 @@
>  fi
>  
>  #
> -# Stop blogd if running
> +# Stop blogd if running (wait 2 seconds for last guy to finish logging)
>  #
> +sleep 2
>  killproc -QUIT /sbin/blogd
>  
>  #
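
A fixed sleep only makes the race window smaller, so for what it's worth, here is a rough sketch of a bounded wait that could take its place if there were some way to tell that the last daemon has finished logging. The helper wait_until and the check last_daemon_done are names I made up for illustration; neither exists in the SLES rc script, and I don't know of a reliable test for "blogd has drained its input", so treat the condition as an assumption.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of /etc/init.d/rc.
# Runs COMMAND about once per second until it succeeds or until roughly
# TIMEOUT_SECONDS of sleeping have passed.
# Returns 0 if COMMAND succeeded, 1 if the timeout expired first.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while ! "$@"; do
        # Give up once the time budget is used; the caller proceeds anyway.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in place of the plain "sleep 2" (last_daemon_done is a
# placeholder for whatever check turns out to be appropriate):
#
#   wait_until 2 last_daemon_done
#   killproc -QUIT /sbin/blogd
```

If the check never succeeds, this behaves like the patch above and sends the signal after about 2 seconds; if it succeeds early, boot is not held up for the full interval.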
