Hi Ceri,

From the first node, xCid, I can ping the quorum server, Auron:

root@xCid:/var/adm# ping Auron
Auron is alive
root@xCid:/var/adm#

And from Auron (the quorum server), I can ping host xCid:
root@Auron:~# ping xCid
xCid is alive
root@Auron:~#

The quorum server (Auron) is on the same subnet as the cluster nodes 
(xCid and xCloud): subnet 203.

root@Auron:~# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 172.22.203.52 netmask ffffff00 broadcast 172.22.203.255
        ether 0:14:4f:2c:b7:28
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
root@Auron:~#
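
A successful ping only shows that ICMP gets through; it does not show that the quorum server's TCP port accepts connections from the nodes. The following is a minimal sketch of such a check, not a definitive test. It assumes the quorum server on Auron listens on port 9000 (an assumption; use whatever port is actually configured), and the function name tcp_reachable is made up for illustration.

```python
import socket

# Assumption: 9000 is the quorum server's listening port. Replace it with the
# port actually configured for the quorum server on Auron.
QUORUM_SERVER_PORT = 9000


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running tcp_reachable("Auron", QUORUM_SERVER_PORT) on both xCid and xCloud would show whether each node can open a connection on its own, rather than only through the other node.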

I am at LSI and am testing on OpenSolaris, so I need to try both a 
quorum server and a quorum disk.  The quorum server doesn't work for me yet, so I 
have not tried the quorum disk yet.

Thanks a lot for your help.

Janey


-----Original Message-----
From: Ceri Davies [mailto:c...@submonkey.net]
Sent: Friday, September 04, 2009 3:39 PM
To: Le, Janey
Cc: ha-clusters-discuss at opensolaris.org
Subject: Re: [ha-clusters-discuss] Host panic - OpenSolaris SunCluster

This looks just like a quorum problem.
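
To spell out the arithmetic behind that: the scstat output quoted below reports 3 possible votes (one per node plus one for Auron) and 2 needed. The sketch below is a rough illustration only, assuming a simple-majority rule, which matches those figures; the function names are made up for the example.

```python
def votes_needed(possible: int) -> int:
    """Simple majority of the possible votes (assumed rule)."""
    return possible // 2 + 1


def has_quorum(present: int, possible: int) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return present >= votes_needed(possible)


POSSIBLE = 3  # xCid + xCloud + quorum server Auron, per the scstat output

# Both nodes up and Auron readable: 3 of 3 votes.
all_up = has_quorum(3, POSSIBLE)

# xCloud rebooting, Auron's vote still counted: 2 of 3 votes.
one_node_plus_auron = has_quorum(2, POSSIBLE)

# xCloud rebooting and Auron's vote not obtained: only xCid's own vote.
one_node_alone = has_quorum(1, POSSIBLE)
```

Under this rule a single node survives its partner's reboot only if it can also obtain the quorum server's vote, which is why a failed read from Auron would leave xCid with 1 vote out of the 2 it needs.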

Can you describe what your network to the quorum server is?
Is it only reachable via node xCloud?

Ceri

On Fri, Sep 04, 2009 at 01:00:31PM -0700, Janey Le wrote:
> After setting up SunCluster on OpenSolaris, when I reboot the second node 
> of the cluster, my first node panics.  Can you please let me know whether there is 
> anyone I can contact to find out if this is a setup issue or a cluster bug?
>
> Below is the setup that I had:
>
> -     2x1 (2 OpenSolaris 2009.06 x86 hosts, named xCid and xCloud, connected 
> to one FC array)
> -     Created 32 volumes and mapped them to the host group; the 2 cluster 
> nodes are under the host group
> -     Formatted the volumes
> -     Set up the cluster with a quorum server named Auron (both nodes joined the 
> cluster; all of the resource groups and resources are online on the 1st node, xCid)
>
> Below is the status of the cluster before rebooting the nodes.
> root@xCid:~# scstat -p
> ------------------------------------------------------------------
>
> -- Cluster Nodes --
>
>                     Node name           Status
>                     ---------           ------
>   Cluster node:     xCid                Online
>   Cluster node:     xCloud              Online
>
> ------------------------------------------------------------------
>
> -- Cluster Transport Paths --
>
>                     Endpoint               Endpoint               Status
>                     --------               --------               ------
>   Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
>   Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online
>
> ------------------------------------------------------------------
>
> -- Quorum Summary from latest node reconfiguration --
>
>   Quorum votes possible:      3
>   Quorum votes needed:        2
>   Quorum votes present:       3
>
>
> -- Quorum Votes by Node (current status) --
>
>                     Node Name           Present Possible Status
>                     ---------           ------- -------- ------
>   Node votes:       xCid                1        1       Online
>   Node votes:       xCloud              1        1       Online
>
>
> -- Quorum Votes by Device (current status) --
>
>                     Device Name         Present Possible Status
>                     -----------         ------- -------- ------
>   Device votes:     Auron               1        1       Online
>
> ------------------------------------------------------------------
>
> -- Device Group Servers --
>
>                          Device Group        Primary             Secondary
>                          ------------        -------             ---------
>
>
> -- Device Group Status --
>
>                               Device Group        Status
>                               ------------        ------
>
>
> -- Multi-owner Device Groups --
>
>                               Device Group        Online Status
>                               ------------        -------------
>
> ------------------------------------------------------------------
>
> -- Resource Groups and Resources --
>
>             Group Name     Resources
>             ----------     ---------
>  Resources: xCloud-rg      xCloud-nfsres r-nfs
>  Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs
>
>
> -- Resource Groups --
>
>             Group Name     Node Name                State          Suspended
>             ----------     ---------                -----          ---------
>      Group: xCloud-rg      xCid                     Online         No
>      Group: xCloud-rg      xCloud                   Offline        No
>
>      Group: nfs-rg         xCid                     Online         No
>      Group: nfs-rg         xCloud                   Offline        No
>
>
> -- Resources --
>
>             Resource Name  Node Name                State          Status Message
>             -------------  ---------                -----          --------------
>   Resource: xCloud-nfsres  xCid                     Online         Online - LogicalHostname online.
>   Resource: xCloud-nfsres  xCloud                   Offline        Offline
>
>   Resource: r-nfs          xCid                     Online         Online - Service is online.
>   Resource: r-nfs          xCloud                   Offline        Offline
>
>   Resource: nfs-lh-rs      xCid                     Online         Online - LogicalHostname online.
>   Resource: nfs-lh-rs      xCloud                   Offline        Offline
>
>   Resource: nfs-hastp-rs   xCid                     Online         Online
>   Resource: nfs-hastp-rs   xCloud                   Offline        Offline
>
>   Resource: nfs-rs         xCid                     Online         Online - Service is online.
>   Resource: nfs-rs         xCloud                   Offline        Offline
>
> ------------------------------------------------------------------
>
> -- IPMP Groups --
>
>               Node Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
>   IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online
>
>   IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online
>
>
> -- IPMP Groups in Zones --
>
>               Zone Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
> ------------------------------------------------------------------
> root@xCid:~#
>
>
> root@xCid:~# clnode show
>
> === Cluster Nodes ===
>
> Node Name:                                      xCid
>   Node ID:                                         1
>   Enabled:                                         yes
>   privatehostname:                                 clusternode1-priv
>   reboot_on_path_failure:                          disabled
>   globalzoneshares:                                1
>   defaultpsetmin:                                  1
>   quorum_vote:                                     1
>   quorum_defaultvote:                              1
>   quorum_resv_key:                                 0x4A9B35C600000001
>   Transport Adapter List:                          e1000g2, e1000g3
>
> Node Name:                                      xCloud
>   Node ID:                                         2
>   Enabled:                                         yes
>   privatehostname:                                 clusternode2-priv
>   reboot_on_path_failure:                          disabled
>   globalzoneshares:                                1
>   defaultpsetmin:                                  1
>   quorum_vote:                                     1
>   quorum_defaultvote:                              1
>   quorum_resv_key:                                 0x4A9B35C600000002
>   Transport Adapter List:                          e1000g2, e1000g3
>
> root@xCid:~#
>
>
> ******  Rebooted 1st node xCid; all of the resources transferred to the 2nd 
> node, xCloud, and came online there  ************
>
> root@xCloud:~# scstat -p
> ------------------------------------------------------------------
>
> -- Cluster Nodes --
>
>                     Node name           Status
>                     ---------           ------
>   Cluster node:     xCid                Online
>   Cluster node:     xCloud              Online
>
> ------------------------------------------------------------------
>
> -- Cluster Transport Paths --
>
>                     Endpoint               Endpoint               Status
>                     --------               --------               ------
>   Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
>   Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online
>
> ------------------------------------------------------------------
>
> -- Quorum Summary from latest node reconfiguration --
>
>   Quorum votes possible:      3
>   Quorum votes needed:        2
>   Quorum votes present:       3
>
>
> -- Quorum Votes by Node (current status) --
>
>                     Node Name           Present Possible Status
>                     ---------           ------- -------- ------
>   Node votes:       xCid                1        1       Online
>   Node votes:       xCloud              1        1       Online
>
>
> -- Quorum Votes by Device (current status) --
>
>                     Device Name         Present Possible Status
>                     -----------         ------- -------- ------
>   Device votes:     Auron               1        1       Online
>
> ------------------------------------------------------------------
>
> -- Device Group Servers --
>
>                          Device Group        Primary             Secondary
>                          ------------        -------             ---------
>
>
> -- Device Group Status --
>
>                               Device Group        Status
>                               ------------        ------
>
>
> -- Multi-owner Device Groups --
>
>                               Device Group        Online Status
>                               ------------        -------------
>
> ------------------------------------------------------------------
>
> -- Resource Groups and Resources --
>
>             Group Name     Resources
>             ----------     ---------
>  Resources: xCloud-rg      xCloud-nfsres r-nfs
>  Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs
>
>
> -- Resource Groups --
>
>             Group Name     Node Name                State          Suspended
>             ----------     ---------                -----          ---------
>      Group: xCloud-rg      xCid                     Offline        No
>      Group: xCloud-rg      xCloud                   Online         No
>
>      Group: nfs-rg         xCid                     Offline        No
>      Group: nfs-rg         xCloud                   Online         No
>
>
> -- Resources --
>
>             Resource Name  Node Name                State          Status Message
>             -------------  ---------                -----          --------------
>   Resource: xCloud-nfsres  xCid                     Offline        Offline
>   Resource: xCloud-nfsres  xCloud                   Online         Online - LogicalHostname online.
>
>   Resource: r-nfs          xCid                     Offline        Offline
>   Resource: r-nfs          xCloud                   Online         Online - Service is online.
>
>   Resource: nfs-lh-rs      xCid                     Offline        Offline
>   Resource: nfs-lh-rs      xCloud                   Online         Online - LogicalHostname online.
>
>   Resource: nfs-hastp-rs   xCid                     Offline        Offline
>   Resource: nfs-hastp-rs   xCloud                   Online         Online
>
>   Resource: nfs-rs         xCid                     Offline        Offline
>   Resource: nfs-rs         xCloud                   Online         Online - Service is online.
>
> ------------------------------------------------------------------
>
> -- IPMP Groups --
>
>               Node Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
>   IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online
>
>   IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online
>
>
> -- IPMP Groups in Zones --
>
>               Zone Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
> ------------------------------------------------------------------
> root@xCloud:~#
>
>
> ***********  Waited about 5 minutes, then rebooted 2nd node xCloud, and node 
> xCid panicked with the error below  *********************
>
> root@xCid:~# Notifying cluster that this node is panicking
> WARNING: CMM: Reading reservation keys from quorum device Auron failed with error 2.
>
> panic[cpu0]/thread=ffffff02d0a623c0: CMM: Cluster lost operational quorum; aborting.
>
> ffffff0011976b50 genunix:vcmn_err+2c ()
> ffffff0011976b60 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+1f ()
> ffffff0011976c40 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+8c ()
> ffffff0011976e30 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+57f ()
> ffffff0011976e70 cl_haci:__1cIcmm_implStransitions_thread6M_v_+b7 ()
> ffffff0011976e80 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+9 ()
> ffffff0011976ed0 cl_orb:cllwpwrapper+d7 ()
> ffffff0011976ee0 unix:thread_start+8 ()
>
> syncing file systems... done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
>  51% done
>
> The host log is attached.
>
> I have gone through the SunCluster documentation on how to set up SunCluster for 
> OpenSolaris multiple times, but I don't see any steps that I missed.  Can you 
> please help determine whether this is a setup issue or a bug?
>
> Thanks,
>
> Janey
> --
> This message posted from opensolaris.org


> _______________________________________________
> ha-clusters-discuss mailing list
> ha-clusters-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss


--
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
