On Tue, May 22, 2007 at 05:26:04PM -0400, Brian Reichert wrote:
> I've been testing auto_failback in our 2.0.7-based cluster, and
> have found that failback sometimes doesn't occur.
> 
> We're managing a virtual IP via a haresources file on a Red Hat 4
> box.
> 
> What I tracked down was that if the box powered down too quickly
> for heartbeat to clean up, a PID file was left in place:
> 
>   # ls -ld /usr/local/var/run/heartbeat.pid
>   -rw-r-----  1 root root 11 May 22 16:44 /usr/local/var/run/heartbeat.pid
>   # cat /usr/local/var/run/heartbeat.pid
>       3215
> 
> But, when heartbeat tries to start after a reboot:
> 
>   May 22 16:46:41 sqe-50 heartbeat: [3214]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>   May 22 16:46:41 sqe-50 heartbeat: [3214]: info: **************************
>   May 22 16:46:41 sqe-50 heartbeat: [3214]: info: Configuration validated.  Starting heartbeat 2.0.7
>   May 22 16:46:41 sqe-50 heartbeat: [3214]: info: heartbeat: already running [pid 3215].
> 
> What I see in make_daemon() is a check for this file and its contents:
> 
>         /* See if heartbeat is already running... */
> 
>         if ((pid=cl_read_pidfile(PIDFILE)) > 0 && pid != getpid()) {
>                 cl_log(LOG_INFO, "%s: already running [pid %ld]."
>                 ,       cmdname, pid);
>                 exit(LSB_EXIT_OK);
>         }
> 
> But there's no check to ensure the recorded PID is not stale.
> 
> Have others seen this?  This code seems to be in 2.0.8 as well...

This should be fixed. I think I've noticed that in more than one
place. Do you feel like making a patch?
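
For reference, a minimal sketch of what such a staleness check might
look like, assuming a POSIX system. The names pid_is_live() and
other_instance_running() are hypothetical and are not functions in the
heartbeat tree. The idea is to probe the recorded PID with
kill(pid, 0), which delivers no signal but reports whether the process
exists:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a process with this PID exists, 0 if not. */
int pid_is_live(long pid)
{
    if (pid <= 0)
        return 0;               /* 0 and negatives address process groups, not one process */
    if (kill((pid_t)pid, 0) == 0)
        return 1;               /* process exists and we are allowed to signal it */
    return errno == EPERM;      /* exists but belongs to another user; ESRCH means gone */
}

/* Hypothetical helper: returns 1 only if the recorded PID is a different, live process. */
int other_instance_running(long recorded_pid)
{
    return recorded_pid != (long)getpid() && pid_is_live(recorded_pid);
}
```

The test in make_daemon() could then consult something like
other_instance_running(pid) before deciding to exit. Note that this
does not catch PID reuse: after a reboot an unrelated process may hold
the recorded PID (in the log above the stale file names 3215 while the
new heartbeat is 3214), so a real patch would probably also want to
confirm that the process is actually heartbeat.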

> 
> -- 
> Brian Reichert                                <[EMAIL PROTECTED]>
> 55 Crystal Ave. #286                  Daytime number: (603) 434-6842
> Derry NH 03038-1725 USA                       BSD admin/developer at large    
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

-- 
Dejan