Hi,

On Thu, Nov 29, 2007 at 10:25:47AM +0000, Amos Shapira wrote:
> On 29/11/2007, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> > Yes, very much so. For some reason the MCP (master control
> > process) doesn't start the rest of the programs which are doing
> > the real work. I really can't say why. Can you please attach the
> > logs from this node?
>
> A pstree(1) on the better node visualizes the responsibility of
> starting the programs pretty vividly:
>
>   |-heartbeat,18449
>   |   |-attrd,18477
>   |   |-ccm,18473
>   |   |-cib,18474
>   |   |-crmd,18478
>   |   |   |-pengine,18505
>   |   |   `-tengine,18504
>   |   |-heartbeat,18452
>   |   |-heartbeat,18453
>   |   |-heartbeat,18454
>   |   |-heartbeat,18455
>   |   |-heartbeat,18456
>   |   |-lrmd,18475 -r
>   |   |-mgmtd,18479 -v
>   |   `-stonithd,18476
>
> Here they are again (from tonight):
>
> 1 heartbeat[17481]: 2007/11/29_07:12:40 WARN: heartbeat: udp
> port 695 reserved for service "ieee-mms-ssl".
> 2 heartbeat[17481]: 2007/11/29_07:12:40 info: Version 2 support: yes
> 3 heartbeat[17481]: 2007/11/29_07:12:40 WARN: File
> /etc/ha.d/haresources exists.
> 4 heartbeat[17481]: 2007/11/29_07:12:40 WARN: This file is not
> used because crm is enabled
> 5 heartbeat[17481]: 2007/11/29_07:12:40 WARN: Logging daemon is
> disabled --enabling logging daemon is recommended
> 6 heartbeat[17481]: 2007/11/29_07:12:40 info: **************************
> 7 heartbeat[17481]: 2007/11/29_07:12:40 info: Configuration
> validated. Starting heartbeat 2.1.2
> 8 heartbeat[17482]: 2007/11/29_07:12:40 info: heartbeat: version 2.1.2
> 9 heartbeat[17482]: 2007/11/29_07:12:40 info: Heartbeat
> generation: 1196102397
> 10 heartbeat[17482]: 2007/11/29_07:12:40 info:
> G_main_add_TriggerHandler: Added signal manual handler
> 11 heartbeat[17482]: 2007/11/29_07:12:40 info:
> G_main_add_TriggerHandler: Added signal manual handler
> 12 heartbeat[17482]: 2007/11/29_07:12:40 info: Removing
> /var/run/heartbeat/rsctmp failed, recreating.
> 13 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write
> socket priority set to IPTOS_LOWDELAY on eth0
> 14 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound
> send socket to device: eth0
> 15 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound
> receive socket to device: eth0
> 16 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast:
> started on port 695 interface eth0 to 192.168.0.248
> 17 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write
> socket priority set to IPTOS_LOWDELAY on eth0
> 18 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound
> send socket to device: eth0
> 19 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound
> receive socket to device: eth0
> 20 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast:
> started on port 695 interface eth0 to 192.168.0.249
> 21 heartbeat[17482]: 2007/11/29_07:12:40 info:
> G_main_add_SignalHandler: Added signal handler for signal 17
> 22 heartbeat[17482]: 2007/11/29_07:12:40 info: Local status now
> set to: 'up'
> 23 heartbeat[17482]: 2007/11/29_07:12:41 info: Link
> drbd01.test.spammatters.local:eth0 up.
> 24 heartbeat[17482]: 2007/11/29_07:12:41 info: Status update for
> node drbd01.test.spammatters.local: status up
> 25 heartbeat[17482]: 2007/11/29_07:13:45 info: all clients are now paused
> 26 heartbeat[17482]: 2007/11/29_07:13:45 debug: hist->ackseq =0
> 27 heartbeat[17482]: 2007/11/29_07:13:45 debug: hist->lowseq =0,
> hist->hiseq=101
> 28 heartbeat[17482]: 2007/11/29_07:13:45 debug: expecting from
> drbd01.test.spammatters.local
> 29 heartbeat[17482]: 2007/11/29_07:13:45 debug: it's ackseq=0

heartbeat is getting no packet acknowledgements from drbd01, so
this must be a communication problem. It looks like drbd02 doesn't
see the packets coming from drbd01, assuming that drbd01 is sending
them, which it should be if no errors are reported in the logs on
drbd01.

Thanks,

Dejan

> 30 heartbeat[17482]: 2007/11/29_07:13:45 debug:
>
> (The line numbers might come handy in discussing them).
>
> The last five "debug:" lines repeat ad-infinitum.
>
> Thanks very much.
>
> --Amos
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
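
The diagnosis in this message (the same block of debug lines repeating, with no acknowledgement arriving from drbd01) can be checked mechanically against a saved log. What follows is only a minimal sketch in Python: the regular expressions are derived from the log lines quoted in this thread, not from the heartbeat source, and the function names (nodes_awaiting_ack, ackseq_values, ack_is_stuck) and the SAMPLE list are hypothetical helpers made up for illustration. It assumes each log entry sits on one line, i.e. without the mail wrapping seen in the quoted excerpt.

```python
import re
from collections import Counter

# Patterns inferred from the log lines quoted in this thread (assumption:
# other heartbeat versions may word these messages differently).
EXPECTING = re.compile(r"debug:\s*expecting from\s+(\S+)")
ACKSEQ = re.compile(r"hist->ackseq\s*=\s*(\d+)")


def nodes_awaiting_ack(lines):
    """Count how often each node is named in an 'expecting from' line."""
    counts = Counter()
    for line in lines:
        match = EXPECTING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def ackseq_values(lines):
    """Return every hist->ackseq value in the order it was logged."""
    values = []
    for line in lines:
        match = ACKSEQ.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def ack_is_stuck(lines):
    """True if ackseq was logged at least twice and never changed."""
    values = ackseq_values(lines)
    return len(values) >= 2 and len(set(values)) == 1


# Unwrapped copies of log lines 25-29 from the quoted excerpt.
SAMPLE = [
    "heartbeat[17482]: 2007/11/29_07:13:45 info: all clients are now paused",
    "heartbeat[17482]: 2007/11/29_07:13:45 debug: hist->ackseq =0",
    "heartbeat[17482]: 2007/11/29_07:13:45 debug: hist->lowseq =0, hist->hiseq=101",
    "heartbeat[17482]: 2007/11/29_07:13:45 debug: expecting from drbd01.test.spammatters.local",
    "heartbeat[17482]: 2007/11/29_07:13:45 debug: it's ackseq=0",
]

if __name__ == "__main__":
    repeated = SAMPLE * 3  # stands in for the block repeating in the real log
    print(nodes_awaiting_ack(repeated))
    print("ack stuck:", ack_is_stuck(repeated))
```

A node that shows up many times in nodes_awaiting_ack while ack_is_stuck returns True matches the symptom described here. It does not say which direction of the link is failing; that still needs a look at the logs on drbd01 and at the network between the two nodes.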
