Hi Ceri,
From the first node xCid, I could ping the quorum server Auron:
root at xCid:/var/adm# ping Auron
Auron is alive
root at xCid:/var/adm#
And from Auron (the quorum server), I could ping the host xCid:
root at Auron:~# ping xCid
xCid is alive
root at Auron:~#
The quorum server (Auron) is on the same subnet (203) as the cluster nodes
(xCid and xCloud).
root at Auron:~# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 172.22.203.52 netmask ffffff00 broadcast 172.22.203.255
ether 0:14:4f:2c:b7:28
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252
index 1
inet6 ::1/128
root at Auron:~#
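Since ping only exercises ICMP, a TCP check against the quorum server's listening port can rule out a filter between the nodes and Auron. A minimal sketch, assuming the default quorum server port 9000 (an assumption; the actual port is in /etc/scqsd/scqsd.conf on Auron):

```shell
# Check TCP reachability of the quorum server port (ICMP ping alone is
# not enough). Port 9000 is the usual quorum server default -- an
# assumption here; confirm it in /etc/scqsd/scqsd.conf on Auron.
port_open() {
    # usage: port_open <host> <port>
    # bash's /dev/tcp pseudo-device attempts a TCP connect
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if port_open Auron 9000; then
    echo "quorum server port reachable"
else
    echo "quorum server port NOT reachable"
fi
```

Running this from each cluster node (xCid and xCloud) shows whether both nodes can actually reach the quorum service, not just the host.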
I am at LSI, doing testing on OpenSolaris, so I need to try out both a
quorum server and a quorum disk. The quorum server doesn't work for me yet, so
I haven't tried the quorum disk yet.
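For reference, this is roughly how the quorum server would be registered as a quorum device from a cluster node. The type name, property names, and port here follow the Sun Cluster 3.2-era CLI and this thread's hostnames, so treat it as a hedged sketch and confirm the syntax against clquorum(1CL) on your nodes:

```shell
# Hedged sketch: register the quorum server Auron (172.22.203.52) as a
# quorum device and verify votes. Exact syntax may vary by release --
# see clquorum(1CL). Guarded so it is a no-op off the cluster nodes.
if command -v clquorum >/dev/null 2>&1; then
    # Register Auron; qshost/port must match scqsd's config on Auron.
    clquorum add -t quorum_server -p qshost=172.22.203.52,port=9000 Auron
    # Expect 3 possible votes: one per node plus one for Auron.
    clquorum status
else
    echo "clquorum not found; run this on a cluster node"
fi
```

If the device was registered while the quorum server was unreachable, the votes can look fine until a reconfiguration forces the nodes to re-read the reservation keys, which matches the panic below.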
Thanks a lot for your help.
Janey
-----Original Message-----
From: Ceri Davies [mailto:[email protected]]
Sent: Friday, September 04, 2009 3:39 PM
To: Le, Janey
Cc: ha-clusters-discuss at opensolaris.org
Subject: Re: [ha-clusters-discuss] Host panic - OpenSolaris SunCluster
This looks just like a quorum problem.
Can you describe your network path to the quorum server?
Is it only reachable via node xCloud?
Ceri
On Fri, Sep 04, 2009 at 01:00:31PM -0700, Janey Le wrote:
> After setting up SunCluster on OpenSolaris, when I reboot the second node
> of the cluster, my first node panics. Can you please let me know whom I can
> contact to find out whether this is a setup issue or a cluster bug?
>
> Below is the setup that I had:
>
> - 2x1 ( 2 OpenSolaris 2009.06 x86 hosts named xCid and xCloud connected
> to one FC array)
> - Created 32 volumes and mapped them to the host group; under the host group
> are the 2 cluster nodes
> - Formatted the volumes
> - Set up the cluster with a quorum server named Auron (both nodes joined the
> cluster; all resource groups and resources are online on the 1st node, xCid)
>
> Below is the status of the cluster before rebooting the nodes.
> root at xCid:~# scstat -p
> ------------------------------------------------------------------
>
> -- Cluster Nodes --
>
> Node name Status
> --------- ------
> Cluster node: xCid Online
> Cluster node: xCloud Online
>
> ------------------------------------------------------------------
>
> -- Cluster Transport Paths --
>
> Endpoint Endpoint Status
> -------- -------- ------
> Transport path: xCid:e1000g3 xCloud:e1000g3 Path online
> Transport path: xCid:e1000g2 xCloud:e1000g2 Path online
>
> ------------------------------------------------------------------
>
> -- Quorum Summary from latest node reconfiguration --
>
> Quorum votes possible: 3
> Quorum votes needed: 2
> Quorum votes present: 3
>
>
> -- Quorum Votes by Node (current status) --
>
> Node Name Present Possible Status
> --------- ------- -------- ------
> Node votes: xCid 1 1 Online
> Node votes: xCloud 1 1 Online
>
>
> -- Quorum Votes by Device (current status) --
>
> Device Name Present Possible Status
> ----------- ------- -------- ------
> Device votes: Auron 1 1 Online
>
> ------------------------------------------------------------------
>
> -- Device Group Servers --
>
> Device Group Primary Secondary
> ------------ ------- ---------
>
>
> -- Device Group Status --
>
> Device Group Status
> ------------ ------
>
>
> -- Multi-owner Device Groups --
>
> Device Group Online Status
> ------------ -------------
>
> ------------------------------------------------------------------
>
> -- Resource Groups and Resources --
>
> Group Name Resources
> ---------- ---------
> Resources: xCloud-rg xCloud-nfsres r-nfs
> Resources: nfs-rg nfs-lh-rs nfs-hastp-rs nfs-rs
>
>
> -- Resource Groups --
>
> Group Name Node Name State Suspended
> ---------- --------- ----- ---------
> Group: xCloud-rg xCid Online No
> Group: xCloud-rg xCloud Offline No
>
> Group: nfs-rg xCid Online No
> Group: nfs-rg xCloud Offline No
>
>
> -- Resources --
>
> Resource Name Node Name State Status
> Message
> ------------- --------- -----
> --------------
> Resource: xCloud-nfsres xCid Online Online -
> LogicalHostname online.
> Resource: xCloud-nfsres xCloud Offline Offline
>
> Resource: r-nfs xCid Online Online -
> Service is online.
> Resource: r-nfs xCloud Offline Offline
>
> Resource: nfs-lh-rs xCid Online Online -
> LogicalHostname online.
> Resource: nfs-lh-rs xCloud Offline Offline
>
> Resource: nfs-hastp-rs xCid Online Online
> Resource: nfs-hastp-rs xCloud Offline Offline
>
> Resource: nfs-rs xCid Online Online -
> Service is online.
> Resource: nfs-rs xCloud Offline Offline
>
> ------------------------------------------------------------------
>
> -- IPMP Groups --
>
> Node Name Group Status Adapter Status
> --------- ----- ------ ------- ------
> IPMP Group: xCid sc_ipmp0 Online e1000g1 Online
>
> IPMP Group: xCloud sc_ipmp0 Online e1000g0 Online
>
>
> -- IPMP Groups in Zones --
>
> Zone Name Group Status Adapter Status
> --------- ----- ------ ------- ------
> ------------------------------------------------------------------
> root at xCid:~#
>
>
> root at xCid:~# clnode show
>
> === Cluster Nodes ===
>
> Node Name: xCid
> Node ID: 1
> Enabled: yes
> privatehostname: clusternode1-priv
> reboot_on_path_failure: disabled
> globalzoneshares: 1
> defaultpsetmin: 1
> quorum_vote: 1
> quorum_defaultvote: 1
> quorum_resv_key: 0x4A9B35C600000001
> Transport Adapter List: e1000g2, e1000g3
>
> Node Name: xCloud
> Node ID: 2
> Enabled: yes
> privatehostname: clusternode2-priv
> reboot_on_path_failure: disabled
> globalzoneshares: 1
> defaultpsetmin: 1
> quorum_vote: 1
> quorum_defaultvote: 1
> quorum_resv_key: 0x4A9B35C600000002
> Transport Adapter List: e1000g2, e1000g3
>
> root at xCid:~#
>
>
> ****** Rebooted 1st node xCid; all of the resources failed over to 2nd node
> xCloud and came online on node xCloud ************
>
> root at xCloud:~# scstat -p
> ------------------------------------------------------------------
>
> -- Cluster Nodes --
>
> Node name Status
> --------- ------
> Cluster node: xCid Online
> Cluster node: xCloud Online
>
> ------------------------------------------------------------------
>
> -- Cluster Transport Paths --
>
> Endpoint Endpoint Status
> -------- -------- ------
> Transport path: xCid:e1000g3 xCloud:e1000g3 Path online
> Transport path: xCid:e1000g2 xCloud:e1000g2 Path online
>
> ------------------------------------------------------------------
>
> -- Quorum Summary from latest node reconfiguration --
>
> Quorum votes possible: 3
> Quorum votes needed: 2
> Quorum votes present: 3
>
>
> -- Quorum Votes by Node (current status) --
>
> Node Name Present Possible Status
> --------- ------- -------- ------
> Node votes: xCid 1 1 Online
> Node votes: xCloud 1 1 Online
>
>
> -- Quorum Votes by Device (current status) --
>
> Device Name Present Possible Status
> ----------- ------- -------- ------
> Device votes: Auron 1 1 Online
>
> ------------------------------------------------------------------
>
> -- Device Group Servers --
>
> Device Group Primary Secondary
> ------------ ------- ---------
>
>
> -- Device Group Status --
>
> Device Group Status
> ------------ ------
>
>
> -- Multi-owner Device Groups --
>
> Device Group Online Status
> ------------ -------------
>
> ------------------------------------------------------------------
>
> -- Resource Groups and Resources --
>
> Group Name Resources
> ---------- ---------
> Resources: xCloud-rg xCloud-nfsres r-nfs
> Resources: nfs-rg nfs-lh-rs nfs-hastp-rs nfs-rs
>
>
> -- Resource Groups --
>
> Group Name Node Name State Suspended
> ---------- --------- ----- ---------
> Group: xCloud-rg xCid Offline No
> Group: xCloud-rg xCloud Online No
>
> Group: nfs-rg xCid Offline No
> Group: nfs-rg xCloud Online No
>
>
> -- Resources --
>
> Resource Name Node Name State Status
> Message
> ------------- --------- -----
> --------------
> Resource: xCloud-nfsres xCid Offline Offline
> Resource: xCloud-nfsres xCloud Online Online -
> LogicalHostname online.
>
> Resource: r-nfs xCid Offline Offline
> Resource: r-nfs xCloud Online Online -
> Service is online.
>
> Resource: nfs-lh-rs xCid Offline Offline
> Resource: nfs-lh-rs xCloud Online Online -
> LogicalHostname online.
>
> Resource: nfs-hastp-rs xCid Offline Offline
> Resource: nfs-hastp-rs xCloud Online Online
>
> Resource: nfs-rs xCid Offline Offline
> Resource: nfs-rs xCloud Online Online -
> Service is online.
>
> ------------------------------------------------------------------
>
> -- IPMP Groups --
>
> Node Name Group Status Adapter Status
> --------- ----- ------ ------- ------
> IPMP Group: xCid sc_ipmp0 Online e1000g1 Online
>
> IPMP Group: xCloud sc_ipmp0 Online e1000g0 Online
>
>
> -- IPMP Groups in Zones --
>
> Zone Name Group Status Adapter Status
> --------- ----- ------ ------- ------
> ------------------------------------------------------------------
> root at xCloud:~#
>
>
> *********** Waited about 5 minutes, then rebooted 2nd node xCloud; node
> xCid panicked with the error below *********************
>
> root at xCid:~# Notifying cluster that this node is panicking
> WARNING: CMM: Reading reservation keys from quorum device Auron failed with
> error 2.
>
> panic[cpu0]/thread=ffffff02d0a623c0: CMM: Cluster lost operational quorum;
> aborting.
>
> ffffff0011976b50 genunix:vcmn_err+2c ()
> ffffff0011976b60
> cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+1f
> ()
> ffffff0011976c40
> cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+8c
> ()
> ffffff0011976e30
> cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+57f
> ()
> ffffff0011976e70 cl_haci:__1cIcmm_implStransitions_thread6M_v_+b7 ()
> ffffff0011976e80 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+9 ()
> ffffff0011976ed0 cl_orb:cllwpwrapper+d7 ()
> ffffff0011976ee0 unix:thread_start+8 ()
>
> syncing file systems... done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> 51% done
>
> The host log is attached.
>
> I have gone through the SunCluster doc on how to set up SunCluster for
> OpenSolaris multiple times, but I don't see any steps that I missed. Can you
> please help determine whether this is a setup issue or a bug?
>
> Thanks,
>
> Janey
> --
> This message posted from opensolaris.org
> _______________________________________________
> ha-clusters-discuss mailing list
> ha-clusters-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss
--
That must be wonderful! I don't understand it at all.
-- Moliere