Hi,
I got an unexpected ERROR message while testing how Heartbeat handles the failure of one of its own processes.
ha.cf:
-----
crm on
use_logd on
keepalive 1
deadtime 10
initdead 40
warntime 5
udpport 694
bcast eth0
node node01
node node02
watchdog /dev/watchdog
-----
heartbeat version: 2.1.4
OS version: RHEL 5.1
Test procedure:
1. Start heartbeat.
# /etc/init.d/heartbeat start
2. Kill one of the heartbeat child processes.
# kill -9 <PID of the "heartbeat: write" or "heartbeat: read" process>
The killed process is restarted.
3. Stop heartbeat.
# /etc/init.d/heartbeat stop
The following ERROR messages are logged during this stop.
---- ha-log -----
heartbeat[4632]: 2008/09/09_14:43:41 ERROR: Watchdog write magic character failure: closing /dev/watchdog!: Bad file descriptor
heartbeat[4632]: 2008/09/09_14:43:41 ERROR: Watchdog close(2) failed.: Bad file descriptor
-----------------
I think this has the same cause as Bugzilla bug #1702 (link below), and I have made a patch for it.
http://developerbugs.linux-foundation.org/show_bug.cgi?id=1702
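To illustrate the cause I suspect, here is a minimal standalone sketch. It is not heartbeat code. The names wd_fd, wd_open, wd_close_guarded and stale_close_errno are hypothetical, and /dev/null stands in for /dev/watchdog so that it can run on any machine. It assumes that hb_close_watchdog() checks the descriptor and resets it to -1 after closing, while a bare close(watchdogfd) leaves the old number in the variable, so a later close attempt in the same process fails with "Bad file descriptor".

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for heartbeat's watchdog descriptor variable. */
static int wd_fd = -1;

/* Open /dev/null in place of /dev/watchdog (assumption: it is writable). */
static int wd_open(void)
{
    wd_fd = open("/dev/null", O_WRONLY);
    return wd_fd;
}

/*
 * Guarded close, modelled on what hb_close_watchdog() is assumed to do:
 * skip if already closed, write the magic character, close, then reset
 * the variable so that a second call is a harmless no-op.
 */
static int wd_close_guarded(void)
{
    int rc = 0;

    if (wd_fd < 0) {
        return 0;               /* already closed, nothing to do */
    }
    if (write(wd_fd, "V", 1) != 1) {
        rc = -1;
    }
    if (close(wd_fd) < 0) {
        rc = -1;
    }
    wd_fd = -1;                 /* forget the descriptor number */
    return rc;
}

/*
 * Reproduce the failure: a bare close() leaves the old number in wd_fd,
 * so the later magic-character write hits a closed descriptor.
 * Returns the errno of that write, 0 if it unexpectedly succeeded,
 * or -1 if the initial open failed.
 */
static int stale_close_errno(void)
{
    int err = 0;

    if (wd_open() < 0) {
        return -1;
    }
    close(wd_fd);               /* bare close: wd_fd is now stale */
    if (write(wd_fd, "V", 1) < 0) {
        err = errno;
    }
    wd_fd = -1;
    return err;
}
```

Calling stale_close_errno() from a small main() should return EBADF, which matches the "Bad file descriptor" text in the log above, while calling wd_close_guarded() twice in a row returns 0 both times.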
Please review the attached patch.
Best Regards,
---
OKADA Satoshi
NTT Open Source Software Center
--- heartbeat/heartbeat.c.orig 2008-09-09 15:08:30.000000000 +0900
+++ heartbeat/heartbeat.c 2008-09-09 15:10:37.000000000 +0900
@@ -679,7 +679,7 @@
break;
case 0: /* Child */
- close(watchdogfd);
+ hb_close_watchdog();
curproc = &procinfo->info[fifoproc];
cl_malloc_setstats(&curproc->memstats);
cl_msg_setstats(&curproc->msgstats);
@@ -798,7 +798,7 @@
break;
case 0: /* Child */
- close(watchdogfd);
+ hb_close_watchdog();
curproc = &procinfo->info[ourproc];
cl_malloc_setstats(&curproc->memstats);
cl_msg_setstats(&curproc->msgstats);
@@ -832,7 +832,7 @@
break;
case 0: /* Child */
- close(watchdogfd);
+ hb_close_watchdog();
curproc = &procinfo->info[ourproc];
cl_malloc_setstats(&curproc->memstats);
cl_msg_setstats(&curproc->msgstats);
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/