The problem was:

#0  0xb7eaa4fd in group_color () from /usr/lib/libpengine.so.3
#1  0xb7e9ce57 in stage5 () from /usr/lib/libpengine.so.3
#2  0xb7e9c284 in do_calculations () from /usr/lib/libpengine.so.3
#3  0xb7e9c733 in process_pe_message () from /usr/lib/libpengine.so.3
#4  0xb7ed87c1 in subsystem_msg_dispatch () from /usr/lib/libcrmcommon.so.1
#5  0xb7f153b6 in G_CH_dispatch_int () from /usr/lib/libplumb.so.1
#6  0xb7e30df2 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0

I caused this myself by removing the resources from a group while a
constraint on that group was still left in the CIB. This made pengine
die a horrid death.

I consider it a bug that one is allowed to do this if pengine cannot
survive the changes I make with cibadmin. Either that, or it is a bug
that cibadmin allows me to make unsupported changes.

BR
Robert Lindgren

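For anyone who runs into the same crash: the situation described above (a group left with no children while a constraint still points at it) can be detected before pengine ever sees it. What follows is only a minimal sketch, not part of heartbeat. It assumes the CIB has been saved to a file and that constraints refer to resources through the attributes `rsc`, `from` and `to`, which is how I understand the heartbeat 2.x constraint syntax; adjust `REF_ATTRS` if your CIB differs. The function name `find_problems` is made up for this example.

```python
import sys
import xml.etree.ElementTree as ET

# ASSUMPTION: constraint attributes that carry a resource id in a
# heartbeat 2.x CIB (rsc_location uses "rsc", rsc_order and
# rsc_colocation use "from"/"to"). Adjust for your CIB version.
REF_ATTRS = ("rsc", "from", "to")
# ASSUMPTION: element names that define a resource.
RESOURCE_TAGS = ("primitive", "group", "clone", "master_slave")


def find_problems(cib_xml):
    """Return (constraint id, resource id, reason) for every constraint
    that references a missing resource or a group without primitives."""
    root = ET.fromstring(cib_xml)
    known_ids = set()
    empty_groups = set()
    for section in root.iter("resources"):
        for elem in section.iter():
            if elem.tag in RESOURCE_TAGS:
                known_ids.add(elem.get("id"))
            # A group with no direct <primitive> child is what triggers
            # the "did not have any children" warning.
            if elem.tag == "group" and elem.find("primitive") is None:
                empty_groups.add(elem.get("id"))

    problems = []
    for section in root.iter("constraints"):
        for cons in section:
            for attr in REF_ATTRS:
                target = cons.get(attr)
                if target is None:
                    continue
                if target not in known_ids:
                    problems.append((cons.get("id"), target, "missing"))
                elif target in empty_groups:
                    problems.append((cons.get("id"), target, "empty group"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        for cons_id, rsc_id, reason in find_problems(fh.read()):
            print("constraint %s -> %s (%s)" % (cons_id, rsc_id, reason))
```

One way to use it would be to save the CIB with `cibadmin -Q > cib.xml`, run the script on `cib.xml`, and delete any constraint it reports before (or together with) the resources themselves.
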
On 9/18/07, Robert Lindgren <[EMAIL PROTECTED]> wrote:
>
> Hi All,
>
> I have a two node cluster with two groups, one running mysql and one
> running samba. I stopped the samba group and tried to remove it with
> cibadmin
>
> cibadmin -D -o resources -X '<primitive id="R_fs_samba" class="ocf" type="Filesystem" provider="heartbeat">'
> cibadmin -D -o resources -X '<primitive id="R_samba" class="heartbeat" type="samba" provider="heartbeat">'
> cibadmin -D -o resources -X '<primitive id="R_drbd_samba" class="heartbeat" type="drbddisk" provider="heartbeat">'
> cibadmin -D -o resources -X '<primitive class="ocf" type="IPaddr2" provider="heartbeat" id="R_192.168.12.196">'
>
> and now this happens all the time in the log:
>
> pengine[24820]: 2007/09/18_09:06:41 WARN: group_unpack: Group G_samba did not have any children
> pengine[24820]: 2007/09/18_09:06:41 info: determine_online_status: Node noemic1 is online
> pengine[24820]: 2007/09/18_09:06:41 info: group_print: Resource Group: G_mysql
> pengine[24820]: 2007/09/18_09:06:41 info: native_print: R_192.168.12.197 (heartbeat::ocf:IPaddr2): Stopped
> pengine[24820]: 2007/09/18_09:06:41 info: native_print: R_drbd_mysql (heartbeat:drbddisk): Stopped
> pengine[24820]: 2007/09/18_09:06:41 info: native_print: R_fs (heartbeat::ocf:Filesystem): Stopped
> pengine[24820]: 2007/09/18_09:06:41 info: native_print: R_mysql (lsb:mysql): Stopped
> pengine[24820]: 2007/09/18_09:06:41 info: group_print: Resource Group: G_samba
> pengine[24820]: 2007/09/18_09:06:41 notice: StartRsc: noemic1 Start R_192.168.12.197
> pengine[24820]: 2007/09/18_09:06:41 notice: StartRsc: noemic1 Start R_drbd_mysql
> pengine[24820]: 2007/09/18_09:06:41 notice: StartRsc: noemic1 Start R_fs
> pengine[24820]: 2007/09/18_09:06:41 notice: StartRsc: noemic1 Start R_mysql
> crmd[24449]: 2007/09/18_09:06:41 WARN: Exiting pengine process 24820 killed by signal 11 [SIGSEGV - Segmentation violation].
> crmd[24449]: 2007/09/18_09:06:41 ERROR: Exiting pengine process 24820 dumped core
> crmd[24449]: 2007/09/18_09:06:41 info: crmdManagedChildDied: Process pengine:[24820] exited (signal=11, exitcode=0)
> crmd[24449]: 2007/09/18_09:06:41 ERROR: crmdManagedChildDied: The pengine subsystem terminated unexpectedly
> crmd[24449]: 2007/09/18_09:06:41 ERROR: do_log: [[FSA]] Input I_ERROR from crmdManagedChildDied() received in state (S_POLICY_ENGINE)
>
> Is there any way to find out what the problem with pengine is? I am
> running heartbeat_2.1.2-2_i386.deb for Ubuntu from the openSUSE Build
> Service.
>
> Cheers
>
> Robert Lindgren
>
>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
