Well, since we've been seeing variations of this pretty consistently when they IPL at 
midnight on Sunday morning, even if it fixes the problem PART of the time, we're in 
better shape than we were.
The timing window for this seems to be very tight, given that the small additional 
delay we see with our test system is enough to prevent it.

As far as leaving blogd up, you'd have to take that up with SuSE.  blogd is only used 
before syslog gets going, so once the system is up, it's pretty much useless.  Safe 
enough to kill it once you're sure it's no longer needed.
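
For what it's worth, here is a rough sketch of how the end of rc could wait on a condition with an upper bound instead of sleeping for a fixed time.  The wait_until helper is made up for illustration, and the pidof test in the example is only a placeholder; deciding which condition really means "blogd is no longer needed" is the open question.  Only the killproc line comes from the rc script itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SuSE rc script): run a command once
# a second until it succeeds or the attempt limit is reached.
# Returns 0 if the command succeeded, 1 if we gave up.
wait_until() {
    max_tries=$1
    shift
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries + 1))
        sleep 1
    done
    return 1
}

# Example use at the end of rc. The condition is an assumption: it treats
# "syslogd is running" as the sign that blogd can go, and assumes pidof
# is installed.
#   wait_until 10 pidof syslogd
#   killproc -QUIT /sbin/blogd
```

The attempt limit keeps the boot from hanging if the condition never becomes true; in that case you would still have to decide whether to kill blogd anyway or leave it running.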

I just had a look at rc on SLES8, and the script is ALMOST identical to the one on 
SLES7, so I suspect this will be a problem there too.
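
If you want to check a particular system, something along these lines will show whether its rc script signals blogd with no delay in front of it.  This is only a sketch: the function name is made up, and it assumes the killproc line looks like the one in the SLES7 script and that any delay would sit on the line directly before it.

```shell
#!/bin/sh
# Hypothetical check (not a SuSE tool): find the killproc line for blogd in
# an rc script and report whether the line just before it is a sleep.
check_blogd_kill() {
    rc_file=$1
    awk '
        /^[ \t]*killproc .*blogd/ {
            found = 1
            if (prev ~ /^[ \t]*sleep /)
                print "line " NR ": killproc preceded by sleep"
            else
                print "line " NR ": killproc with no delay"
        }
        { prev = $0 }   # remember this line for the next comparison
        END { if (!found) print "no killproc for blogd found" }
    ' "$rc_file"
}

# Example (path as given for SLES7):
#   check_blogd_kill /etc/init.d/rc
```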

> -----Original Message-----
> From: Dennis Wicks [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 20, 2003 10:53 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [LINUX-390] Problem with SYSVINIT on SuSE SLES7
>
>
> Greetings;
>
> This sort of begs the question, why does blogd need to be killed off?
>
> As regards "consistently prevented," I predict, based upon my (mumble)
> years of experience, that it will work "consistently" until such time as
> one or more of the following conditions are met:
>
>     - It is 2:00 AM Sunday morning
>     - You are on vacation
>     - You are on the other side of the country/continent/world
>     - You are completely unreachable
>
> If blogd really *needs* to be killed off then it would be much better to
> figure out a method of monitoring the process of the startup dynamically
> and killing blogd when it is really not needed any more.  Otherwise just
> letting it run is probably the safest course.
>
> Good Luck!
> Dennis
>
>
>
>
> From:     "Hall, Ken (IDS ECCS)" <[EMAIL PROTECTED]>
> Sent by:  Linux on 390 Port <[EMAIL PROTECTED]>
> Date:     03/20/2003 08:23 AM
> To:       [EMAIL PROTECTED]
> cc:
> Subject:  Problem with SYSVINIT on SuSE SLES7
>
> Please respond to Linux on 390 Port
>
>
>
>
>
>
> For a while we've been bothered by a problem on our SLES7 systems where
> daemons don't start at boot time.  We thought it was related to the
> products themselves, but recently I noticed that it almost always seemed
> to be the LAST daemon in the list that didn't start.  I've looked into it,
> and I think I've identified a race condition in /etc/init.d/rc that
> accounts for this.
>
> When the script runs, it logs its messages to /var/log/boot.log via a
> daemon called blogd.  After the last script runs, rc sends QUIT to blogd to
> shut it down.  On a very fast machine (such as a z900), it's possible that
> the last daemon hasn't completed logging messages when this happens.  The
> QUIT signal seems to propagate back to the starting daemon as a SIGHUP,
> causing it to fail.
>
> We had noticed that the problem seemed to occur most frequently on our
> production systems, and not often on test.  This is consistent, since our
> test system is somewhat slower and so would be less likely to experience
> the problem.
>
> The attached patch inserts a 2-second sleep before sending the kill signal.
> I've tested it multiple times, and it has consistently prevented the last
> daemon from failing.
>
> --- rc~ Thu Mar 20 09:04:24 2003
> +++ rc  Thu Mar 20 09:04:00 2003
> @@ -187,8 +187,9 @@
>  fi
>
>  #
> -# Stop blogd if running
> +# Stop blogd if running (wait 2 seconds for last guy to finish logging)
>  #
> +sleep 2
>  killproc -QUIT /sbin/blogd
>
>  #
>
