On 29/11/2007, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> Yes, very much so. For some reason the MCP (master control
> process) doesn't start the rest of the programs which are doing
> the real work. I really can't say why. Can you please attach the
> logs from this node?

Another thing that I just did was to run the BasicSanityCheck, which
came up with interesting results (including a core dump!):

# ./BasicSanityCheck
Using interface: eth0
Starting base64 and md5 algorithm tests
base64 and md5 algorithm tests succeeded.
Starting Resource Agent tests
Testing RA: Dummy
Testing RA: IPaddr
Testing RA: IPaddr2
Testing RA: Filesystem
RA tests PASSED
Starting IPC tests
Starting heartbeat
Starting High-Availability services:
2007/11/30_08:07:38 INFO:  Resource is stopped
                                                           [  OK  ]
Reloading heartbeat
Reloading heartbeat
Stopping heartbeat
Stopping High-Availability services:
                                                           [  OK  ]
Checking STONITH basic sanity.
./BasicSanityCheck: line 647:  7913 Segmentation fault      (core dumped) $STONITH -h >/dev/null
/usr/sbin/stonith -h failed
/usr/sbin/stonith -h help message is too short (0 lines)
Performing apphbd success case tests
Performing apphbd failure case tests
Starting LRM tests
Starting heartbeat
Starting High-Availability services:
2007/11/30_08:08:47 INFO:  Resource is stopped
                                                           [  OK  ]
starting STONITH Daemon tests
[EMAIL PROTECTED] log]#

I repeated the test after moving haresources aside (because I saw
somewhere that the crash might happen while this file is being parsed)
but got similar results.

The stack trace from these core files (found under /etc/ha.d) is:

#0  0x0000003acdc02280 in ?? ()
#1  0x0000003ac9042b02 in g_slice_free1 () from /lib64/libglib-2.0.so.0
#2  0x0000003ac9020ee4 in g_hash_table_remove () from /lib64/libglib-2.0.so.0
#3  0x0000003148003177 in DelPILPluginUniv () from /usr/lib64/libpils.so.1
#4  0x0000003148002ebb in ?? () from /usr/lib64/libpils.so.1
#5  0x0000003148003ccf in PILGetDebugLevel () from /usr/lib64/libpils.so.1
#6  0x0000003148004894 in PILLoadPlugin () from /usr/lib64/libpils.so.1
#7  0x0000003148003917 in PILGetDebugLevel () from /usr/lib64/libpils.so.1
#8  0x0000003148003a30 in PILGetDebugLevel () from /usr/lib64/libpils.so.1
#9  0x0000003148003c5f in PILGetDebugLevel () from /usr/lib64/libpils.so.1
#10 0x0000003148003d49 in PILIncrIFRefCount () from /usr/lib64/libpils.so.1
#11 0x0000003148c030b0 in stonith_delete () from /usr/lib64/libstonith.so.1
#12 0x0000000000401527 in confhelp ()
#13 0x0000000000401222 in usage ()
#14 0x0000000000401c46 in main ()

This matches what is described in bug #1678
(http://developerbugs.linux-foundation.org//show_bug.cgi?id=1678),
which does not seem to have been handled by the developers.

Thanks,

--Amos
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
