On Friday 04 May 2007 16:14, Andrew Beekhof wrote:
> On 5/4/07, Max Hofer <[EMAIL PROTECTED]> wrote:
> > I tried some power-off tests, and after running them on both cluster
> > nodes at the same time the nodes sometimes go haywire.
> >
> > I ran into 2 problems:
> > 1.) on one cluster node heartbeat shut down with
> > ERROR: Cannot write to media pipe 0: Resource temporarily unavailable
> >
> > 2.) a return code from a resource agent got dropped with error message
> > crmd[1299]: 2007/05/04_10:58:22 WARN: msg_to_op(1173): failed to get the
> > value of field lrm_opstatus from a ha_msg
>
> I think opening bugs for these is the best idea.
> 1) should probably be against the "other" component
#1567
> 2) should probably be against the "lrmd" component (since it's an lrm
> library message)
#1568

#1569: created another bug entry for the action enumeration problem during DC failover

> > Pre-Condition:
> > * cluster nodes are interconnected via:
> >   - RS232
> >   - bond0 via ucast (normal LAN)
> >   - bond1 (bcast - intra LAN between cluster nodes where the DRBD device
> >     is synchronized)
> > * it seems the communication using ttyS01 does not work (I have to check
> >   the cabling)
> >
> > attached are two log files and a short summary of what happened:
> > * 10:57:02
> >   - both servers are powered up and start heartbeat at the same time
> >   - management2 becomes DC
> >   - management1 went into "primary state", i.e. it starts the cluster
> >     resource defining a node as primary
> > * 10:58:04
> >   - heartbeat on management2 crashes
> >     ERROR: Cannot write to media pipe 0: Resource temporarily unavailable
> > * 10:58:21
> >   - management1 was elected as new DC
> >     (strange timeouts: the action numbers which time out on management2 do
> >     match the numbers after the DC switch --- is that normal?)
> >     ---> fail-count change
> > * 10:58:22
> >   - failed to get lrm_opstatus from ha_msg ---> rc for db_mgmt resource start
> >     is lost
> >
> > crmd[1299]: 2007/05/04_10:58:22 WARN: msg_to_op(1173): failed to get the value of field lrm_opstatus from a ha_msg
> > crmd[1299]: 2007/05/04_10:58:22 info: msg_to_op: Message follows:
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG: Dumping message with 13 fields
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[0] : [lrm_t=op]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[1] : [lrm_rid=db_mgmt]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[2] : [lrm_op=monitor]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[3] : [lrm_timeout=120000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[4] : [lrm_interval=120000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[5] : [lrm_delay=60000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[6] : [lrm_targetrc=-2]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[7] : [lrm_app=crmd]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[8] : [lrm_userdata=81:4:3e4ad4f1-ae5f-4d79-8f5b-db752a9d1121]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[9] : [(2)lrm_param=0x82180e0(199 245)]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG: Dumping message with 8 fields
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[0] : [CRM_meta_interval=120000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[1] : [CRM_meta_start_delay=60000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[2] : [startup_timeout=60]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[3] : [CRM_meta_id=db-mgmt-monitor]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[4] : [CRM_meta_timeout=120000]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[5] : [crm_feature_set=1.0.7]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[6] : [pgdb=dmc]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[7] : [CRM_meta_name=monitor]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[10] : [lrm_callid=60]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[11] : [lrm_app=crmd]
> > crmd[1299]: 2007/05/04_10:58:22 info: MSG[12] : [lrm_callid=60]
> >
> > * 10:58:23 fail-count for db_mgmt is increased ---> which shut down the
> >   resource group
> >   (that's how I configured it)
> >
> > kind regards Max
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems

-- 
Max Hofer
APUS Software G.m.b.H.
A-8074 Raaba, Bahnhofstraße 1/1
T| +43 316 401629 11
F| +43 316 401629 9
W| www.apus.co.at
E| [EMAIL PROTECTED]
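
For anyone triaging a dump like the one quoted above, here is a minimal sketch that lists which field names a crmd message dump contains, which expected ones are absent, and which appear more than once. It is not part of Heartbeat: the names `FIELD_RE`, `field_names` and `report` are made up for illustration, the regular expression is inferred only from the `MSG[n] : [name=value]` lines shown in this mail, and the sample is a subset of those lines.

```python
import re
from collections import Counter

# Matches one dumped field, e.g. "MSG[1] : [lrm_rid=db_mgmt]".
# The optional "(2)" prefix is what the dump shows in front of lrm_param.
# Assumption: the pattern is derived only from the log lines in this mail.
FIELD_RE = re.compile(r"MSG\[(\d+)\]\s*:\s*\[(?:\(\d+\))?([^=\]]+)=")


def field_names(lines):
    """Return the field names found in 'MSG[n] : [name=value]' lines."""
    names = []
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            names.append(match.group(2))
    return names


def report(lines, required=("lrm_opstatus",)):
    """Return (missing required names, names that occur more than once)."""
    counts = Counter(field_names(lines))
    missing = [name for name in required if name not in counts]
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    return missing, duplicated


# A subset of the lines from the dump in this mail.
SAMPLE = [
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG: Dumping message with 13 fields",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[0] : [lrm_t=op]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[1] : [lrm_rid=db_mgmt]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[7] : [lrm_app=crmd]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[9] : [(2)lrm_param=0x82180e0(199 245)]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[10] : [lrm_callid=60]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[11] : [lrm_app=crmd]",
    "crmd[1299]: 2007/05/04_10:58:22 info: MSG[12] : [lrm_callid=60]",
]

if __name__ == "__main__":
    missing, duplicated = report(SAMPLE)
    print("missing:", missing)
    print("duplicated:", duplicated)
```

The sketch treats the nested 8-field dump as part of the same flat list, so it only answers "does this name appear anywhere"; it does not try to rebuild the message structure.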
