After setting up Sun Cluster on OpenSolaris, when I reboot the second node of the cluster, my first node panics. Can you please let me know if there is anyone I can contact to find out whether this is a setup issue or a cluster bug?
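To make the reboot sequence concrete, this is roughly what I did on each node (an outline from memory rather than an exact transcript):

# On xCid (all resource groups start out online here):
clresourcegroup status    # confirm xCloud-rg and nfs-rg are online on xCid
reboot                    # resource groups fail over to xCloud as expected

# Wait for xCid to rejoin the cluster, then wait roughly 5 more minutes.

# On xCloud:
reboot                    # at this point xCid panics with the CMM quorum message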
Below is the setup that I had:

- 2x1 topology (two OpenSolaris 2009.06 x86 hosts named xCid and xCloud, connected to one FC array)
- Created 32 volumes and mapped them to the host group; under that host group are the two cluster nodes
- Formatted the volumes
- Set up the cluster with a quorum server named Auron (both nodes joined the cluster, and all of the resource groups and resources came online on the 1st node, xCid); the quorum device registration is sketched below
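The quorum server itself was set up more or less like this. I am reconstructing the commands from memory, so the port number and address are placeholders, and the exact clquorumserver/clquorum syntax should be double-checked against the man pages:

# On the quorum server host Auron (port 9000 is only an example value;
# the real value comes from the instance entry in /etc/scqsd/scqsd.conf):
clquorumserver start +

# On one cluster node, register Auron as a quorum_server-type quorum device
# (<Auron-IP> is a placeholder for the quorum server's address):
clquorum add -t quorum_server -p qshost=<Auron-IP> -p port=9000 Auron

# Check that the device shows up with one vote:
clquorum status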
Below is the status of the cluster before rebooting the nodes.

root@xCid:~# scstat -p
------------------------------------------------------------------
-- Cluster Nodes --
                    Node name           Status
                    ---------           ------
  Cluster node:     xCid                Online
  Cluster node:     xCloud              Online

------------------------------------------------------------------
-- Cluster Transport Paths --
                    Endpoint             Endpoint             Status
                    --------             --------             ------
  Transport path:   xCid:e1000g3         xCloud:e1000g3       Path online
  Transport path:   xCid:e1000g2         xCloud:e1000g2       Path online

------------------------------------------------------------------
-- Quorum Summary from latest node reconfiguration --

  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       3

-- Quorum Votes by Node (current status) --
                    Node Name           Present Possible Status
                    ---------           ------- -------- ------
  Node votes:       xCid                1        1        Online
  Node votes:       xCloud              1        1        Online

-- Quorum Votes by Device (current status) --
                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     Auron               1        1        Online

------------------------------------------------------------------
-- Device Group Servers --
                         Device Group        Primary             Secondary
                         ------------        -------             ---------

-- Device Group Status --
                              Device Group        Status
                              ------------        ------

-- Multi-owner Device Groups --
                              Device Group        Online Status
                              ------------        -------------

------------------------------------------------------------------
-- Resource Groups and Resources --
            Group Name     Resources
            ----------     ---------
 Resources: xCloud-rg      xCloud-nfsres r-nfs
 Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs

-- Resource Groups --
            Group Name     Node Name      State          Suspended
            ----------     ---------      -----          ---------
     Group: xCloud-rg      xCid           Online         No
     Group: xCloud-rg      xCloud         Offline        No
     Group: nfs-rg         xCid           Online         No
     Group: nfs-rg         xCloud         Offline        No

-- Resources --
            Resource Name  Node Name      State          Status Message
            -------------  ---------      -----          --------------
  Resource: xCloud-nfsres  xCid           Online         Online - LogicalHostname online.
  Resource: xCloud-nfsres  xCloud         Offline        Offline
  Resource: r-nfs          xCid           Online         Online - Service is online.
  Resource: r-nfs          xCloud         Offline        Offline
  Resource: nfs-lh-rs      xCid           Online         Online - LogicalHostname online.
  Resource: nfs-lh-rs      xCloud         Offline        Offline
  Resource: nfs-hastp-rs   xCid           Online         Online
  Resource: nfs-hastp-rs   xCloud         Offline        Offline
  Resource: nfs-rs         xCid           Online         Online - Service is online.
  Resource: nfs-rs         xCloud         Offline        Offline

------------------------------------------------------------------
-- IPMP Groups --
              Node Name           Group      Status    Adapter   Status
              ---------           -----      ------    -------   ------
  IPMP Group: xCid                sc_ipmp0   Online    e1000g1   Online
  IPMP Group: xCloud              sc_ipmp0   Online    e1000g0   Online

-- IPMP Groups in Zones --
              Zone Name           Group      Status    Adapter   Status
              ---------           -----      ------    -------   ------
------------------------------------------------------------------
root@xCid:~#

root@xCid:~# clnode show

=== Cluster Nodes ===

Node Name:                                      xCid
  Node ID:                                      1
  Enabled:                                      yes
  privatehostname:                              clusternode1-priv
  reboot_on_path_failure:                       disabled
  globalzoneshares:                             1
  defaultpsetmin:                               1
  quorum_vote:                                  1
  quorum_defaultvote:                           1
  quorum_resv_key:                              0x4A9B35C600000001
  Transport Adapter List:                       e1000g2, e1000g3

Node Name:                                      xCloud
  Node ID:                                      2
  Enabled:                                      yes
  privatehostname:                              clusternode2-priv
  reboot_on_path_failure:                       disabled
  globalzoneshares:                             1
  defaultpsetmin:                               1
  quorum_vote:                                  1
  quorum_defaultvote:                           1
  quorum_resv_key:                              0x4A9B35C600000002
  Transport Adapter List:                       e1000g2, e1000g3

root@xCid:~#

****** Reboot the 1st node xCid; all of the resources fail over to the 2nd node xCloud and come online on xCloud ******
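After the failover, a quicker way to confirm that quorum and the resource groups were still healthy would have been something along these lines; the full scstat -p output I actually captured on xCloud follows:

clquorum status          # expect 3 of 3 quorum votes present (xCid, xCloud, Auron)
clresourcegroup status   # expect xCloud-rg and nfs-rg online on xCloud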
root@xCloud:~# scstat -p
------------------------------------------------------------------
-- Cluster Nodes --
                    Node name           Status
                    ---------           ------
  Cluster node:     xCid                Online
  Cluster node:     xCloud              Online

------------------------------------------------------------------
-- Cluster Transport Paths --
                    Endpoint             Endpoint             Status
                    --------             --------             ------
  Transport path:   xCid:e1000g3         xCloud:e1000g3       Path online
  Transport path:   xCid:e1000g2         xCloud:e1000g2       Path online

------------------------------------------------------------------
-- Quorum Summary from latest node reconfiguration --

  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       3

-- Quorum Votes by Node (current status) --
                    Node Name           Present Possible Status
                    ---------           ------- -------- ------
  Node votes:       xCid                1        1        Online
  Node votes:       xCloud              1        1        Online

-- Quorum Votes by Device (current status) --
                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     Auron               1        1        Online

------------------------------------------------------------------
-- Device Group Servers --
                         Device Group        Primary             Secondary
                         ------------        -------             ---------

-- Device Group Status --
                              Device Group        Status
                              ------------        ------

-- Multi-owner Device Groups --
                              Device Group        Online Status
                              ------------        -------------

------------------------------------------------------------------
-- Resource Groups and Resources --
            Group Name     Resources
            ----------     ---------
 Resources: xCloud-rg      xCloud-nfsres r-nfs
 Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs

-- Resource Groups --
            Group Name     Node Name      State          Suspended
            ----------     ---------      -----          ---------
     Group: xCloud-rg      xCid           Offline        No
     Group: xCloud-rg      xCloud         Online         No
     Group: nfs-rg         xCid           Offline        No
     Group: nfs-rg         xCloud         Online         No

-- Resources --
            Resource Name  Node Name      State          Status Message
            -------------  ---------      -----          --------------
  Resource: xCloud-nfsres  xCid           Offline        Offline
  Resource: xCloud-nfsres  xCloud         Online         Online - LogicalHostname online.
  Resource: r-nfs          xCid           Offline        Offline
  Resource: r-nfs          xCloud         Online         Online - Service is online.
  Resource: nfs-lh-rs      xCid           Offline        Offline
  Resource: nfs-lh-rs      xCloud         Online         Online - LogicalHostname online.
  Resource: nfs-hastp-rs   xCid           Offline        Offline
  Resource: nfs-hastp-rs   xCloud         Online         Online
  Resource: nfs-rs         xCid           Offline        Offline
  Resource: nfs-rs         xCloud         Online         Online - Service is online.

------------------------------------------------------------------
-- IPMP Groups --
              Node Name           Group      Status    Adapter   Status
              ---------           -----      ------    -------   ------
  IPMP Group: xCid                sc_ipmp0   Online    e1000g1   Online
  IPMP Group: xCloud              sc_ipmp0   Online    e1000g0   Online

-- IPMP Groups in Zones --
              Zone Name           Group      Status    Adapter   Status
              ---------           -----      ------    -------   ------
------------------------------------------------------------------
root@xCloud:~#

*********** Wait about 5 minutes, then reboot the 2nd node xCloud; node xCid panics with the error below ***********

root@xCid:~#
Notifying cluster that this node is panicking
WARNING: CMM: Reading reservation keys from quorum device Auron failed with error 2.
panic[cpu0]/thread=ffffff02d0a623c0: CMM: Cluster lost operational quorum; aborting.

ffffff0011976b50 genunix:vcmn_err+2c ()
ffffff0011976b60 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+1f ()
ffffff0011976c40 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+8c ()
ffffff0011976e30 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+57f ()
ffffff0011976e70 cl_haci:__1cIcmm_implStransitions_thread6M_v_+b7 ()
ffffff0011976e80 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+9 ()
ffffff0011976ed0 cl_orb:cllwpwrapper+d7 ()
ffffff0011976ee0 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
 51% done

The host log from xCid is attached. I have gone through the Sun Cluster documentation on how to set up Sun Cluster for OpenSolaris multiple times, but I don't see any steps that I have missed. Can you please help me determine whether this is a setup issue or a bug?

Thanks,
Janey

--
This message posted from opensolaris.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xCid_message
Type: application/octet-stream
Size: 169719 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/ha-clusters-discuss/attachments/20090904/cac9235f/attachment-0001.obj>