Simple two-node active/passive configuration; the Oracle DB resource only starts on one node. The second node keeps failing to start it, although Oracle will start fine when brought up manually. Any help is appreciated.
ha.cf:

use_logd on
auto_failback off
autojoin other
traditional_compression false
# debug 3
node lnxhat1
node lnxhat2
deadtime 10
keepalive 2
udpport 694
ucast hsi0 172.27.100.204
ucast hsi0 172.27.100.205
crm on
ping 172.22.4.1 172.27.220.10 172.27.220.12
respawn root /usr/lib64/heartbeat/pingd -m 2000 -d 5s -a my_ping_set

crm_verify[9452]: 2009/11/03_14:47:26 ERROR: unpack_rsc_op: Remapping resource_oracle_start_0 (rc=1) on lnxhat1 to an ERROR

# ptest -LVVVVVV
ptest[9519]: 2009/11/03_14:48:00 info: main: =#=#=#=#= Getting XML =#=#=#=#=
ptest[9519]: 2009/11/03_14:48:00 info: main: Reading XML from: live cluster
ptest[9519]: 2009/11/03_14:48:00 notice: main: Required feature set: 2.0
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'stop' for cluster option 'no-quorum-policy'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'symmetric-cluster'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'reboot' for cluster option 'stonith-action'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '0' for cluster option 'default-resource-stickiness'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '0' for cluster option 'default-resource-failure-stickiness'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'is-managed-default'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '60s' for cluster option 'cluster-delay'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '30' for cluster option 'batch-limit'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '20s' for cluster option 'default-action-timeout'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'stop-orphan-resources'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'stop-orphan-actions'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'false' for cluster option 'remove-after-stop'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '-1' for cluster option 'pe-error-series-max'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '-1' for cluster option 'pe-warn-series-max'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value '-1' for cluster option 'pe-input-series-max'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'startup-fencing'
ptest[9519]: 2009/11/03_14:48:00 debug: cluster_option: Using default value 'true' for cluster option 'start-failure-is-fatal'
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: Default action timeout: 20s
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: Default stickiness: 0
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: Default failure stickiness: 0
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: STONITH of failed nodes is enabled
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: Cluster is symmetric - resources can run anywhere by default
ptest[9519]: 2009/11/03_14:48:00 debug: unpack_config: On loss of CCM Quorum: Stop ALL resources
ptest[9519]: 2009/11/03_14:48:00 info: determine_online_status: Node lnxhat1 is online
ptest[9519]: 2009/11/03_14:48:00 debug: common_apply_stickiness: fail-count-resource_oracle: INFINITY
ptest[9519]: 2009/11/03_14:48:00 WARN: unpack_rsc_op: Processing failed op resource_listener_monitor_0 on lnxhat1: Timed Out
ptest[9519]: 2009/11/03_14:48:00 ERROR: unpack_rsc_op: Remapping resource_oracle_start_0 (rc=1) on lnxhat1 to an ERROR
ptest[9519]: 2009/11/03_14:48:00 WARN: unpack_rsc_op: Processing failed op resource_oracle_start_0 on lnxhat1: Error
ptest[9519]: 2009/11/03_14:48:00 WARN: unpack_rsc_op: Compatability handling for failed op resource_oracle_start_0 on lnxhat1
ptest[9519]: 2009/11/03_14:48:00 info: determine_online_status: Node lnxhat2 is online
ptest[9519]: 2009/11/03_14:48:00 WARN: unpack_rsc_op: Processing failed op resource_listener_monitor_0 on lnxhat2: Timed Out
ptest[9519]: 2009/11/03_14:48:00 notice: group_print: Resource Group: rg_A
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: resource_IP (ocf::heartbeat:IPaddr): Started lnxhat2
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: resource_oracle (ocf::heartbeat:oracle): Started lnxhat2
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: resource_listener (ocf::heartbeat:oralsnr): Started lnxhat2
ptest[9519]: 2009/11/03_14:48:00 notice: clone_print: Clone Set: 1
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: resource_stonith:0 (stonith:ssh): Stopped
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: resource_stonith:1 (stonith:ssh): Stopped
ptest[9519]: 2009/11/03_14:48:00 notice: clone_print: Clone Set: ocfs2cloneset
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: ocfs2clone:0 (ocf::heartbeat:Filesystem): Started lnxhat1
ptest[9519]: 2009/11/03_14:48:00 notice: native_print: ocfs2clone:1 (ocf::heartbeat:Filesystem): Started lnxhat2
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: Assigning lnxhat2 to resource_IP
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: Assigning lnxhat2 to resource_oracle
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: Assigning lnxhat2 to resource_listener
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: All nodes for resource resource_stonith:0 are unavailable, unclean or shutting down
ptest[9519]: 2009/11/03_14:48:00 WARN: native_color: Resource resource_stonith:0 cannot run anywhere
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: All nodes for resource resource_stonith:1 are unavailable, unclean or shutting down
ptest[9519]: 2009/11/03_14:48:00 WARN: native_color: Resource resource_stonith:1 cannot run anywhere
ptest[9519]: 2009/11/03_14:48:00 debug: clone_color: Allocated 0 1 instances of a possible 2
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: Assigning lnxhat1 to ocfs2clone:0
ptest[9519]: 2009/11/03_14:48:00 debug: native_assign_node: Assigning lnxhat2 to ocfs2clone:1
ptest[9519]: 2009/11/03_14:48:00 debug: clone_color: Allocated 2 ocfs2cloneset instances of a possible 2
ptest[9519]: 2009/11/03_14:48:00 notice: NoRoleChange: Leave resource resource_IP (lnxhat2)
ptest[9519]: 2009/11/03_14:48:00 notice: NoRoleChange: Leave resource resource_oracle (lnxhat2)
ptest[9519]: 2009/11/03_14:48:00 notice: NoRoleChange: Leave resource resource_listener (lnxhat2)
ptest[9519]: 2009/11/03_14:48:00 debug: child_starting_constraints: 1 has no active children
ptest[9519]: 2009/11/03_14:48:00 debug: child_stopping_constraints: 1 has no active children
ptest[9519]: 2009/11/03_14:48:00 notice: NoRoleChange: Leave resource ocfs2clone:0 (lnxhat1)
ptest[9519]: 2009/11/03_14:48:00 notice: NoRoleChange: Leave resource ocfs2clone:1 (lnxhat2)
***556*** [r...@lnxhat1:/var/lib/heartbeat/crm] *** uid=0 *** #
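For context on what the logs show: `common_apply_stickiness: fail-count-resource_oracle: INFINITY` together with the default `start-failure-is-fatal=true` means the policy engine has banned resource_oracle from lnxhat1 until its fail count is cleared, so the cluster will not retry the start there no matter how many times Oracle starts cleanly by hand. A sketch of the cleanup I would try first (Pacemaker-era CLI; the resource and node names are taken from the logs above, adjust to taste):

```shell
# Show the current fail count for the Oracle resource on lnxhat1
crm_failcount -G -U lnxhat1 -r resource_oracle

# Delete the fail count so the policy engine will attempt a start again
crm_failcount -D -U lnxhat1 -r resource_oracle

# Alternatively, clean up the failed-operation history for the resource
crm_resource -C -r resource_oracle -H lnxhat1
```

After clearing, rerunning `ptest -LVVVVVV` (or `crm_verify -LV`) should show whether the ban is gone; the `resource_listener_monitor_0: Timed Out` entries on both nodes also suggest the probe/monitor timeout may be too short for the listener agent, which is worth checking separately.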
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
