This looks just like a quorum problem.

Can you describe the network path from the cluster nodes to the quorum server?
Is it only reachable via node xCloud?
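
To make the vote arithmetic concrete, here is a minimal sketch of majority
quorum counting. It is only an illustration, not Sun Cluster code: it assumes
a simple majority of the possible votes is needed (which matches the 2-of-3 in
your scstat output), and the function names are hypothetical.

```python
# Votes as reported by scstat: one per node, one for the quorum server.
VOTES = {"xCid": 1, "xCloud": 1, "Auron": 1}


def votes_needed(votes_possible: int) -> int:
    """Simple majority: more than half of the possible votes (assumption)."""
    return votes_possible // 2 + 1


def has_quorum(reachable: list[str]) -> bool:
    """True if the voters this node can still count form a majority."""
    present = sum(VOTES[name] for name in reachable)
    return present >= votes_needed(sum(VOTES.values()))


# xCloud is rebooting, quorum server reachable from xCid: 2 of 3 votes.
print(has_quorum(["xCid", "Auron"]))

# xCloud is rebooting and xCid cannot read the quorum server: 1 of 3 votes.
print(has_quorum(["xCid"]))
```

If xCid could only count its own vote while xCloud was down, it would fall
below the 2 votes needed, which would be consistent with the "lost operational
quorum" panic.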

Ceri

On Fri, Sep 04, 2009 at 01:00:31PM -0700, Janey Le wrote:
> After setting up SunCluster on OpenSolaris, when I reboot the second node
> of the cluster, my first node panics.  Can you please let me know if there is
> anyone I can contact to find out whether this is a setup issue or a cluster
> bug?
> 
> Below is the setup that I had:
> 
> - 2x1 (2 OpenSolaris 2009.06 x86 hosts named xCid and xCloud connected
> to one FC array)
> - Created 32 volumes and mapped them to the host group; the 2 cluster nodes
> are under the host group
> - Formatted the volumes
> - Set up the cluster with a quorum server named Auron (both nodes joined the
> cluster; all of the resource groups and resources are online on 1st node xCid)
> 
> Below is the status of the cluster before rebooting the nodes.
> root@xCid:~# scstat -p
> ------------------------------------------------------------------
> 
> -- Cluster Nodes --
> 
>                     Node name           Status
>                     ---------           ------
>   Cluster node:     xCid                Online
>   Cluster node:     xCloud              Online
> 
> ------------------------------------------------------------------
> 
> -- Cluster Transport Paths --
> 
>                     Endpoint               Endpoint               Status
>                     --------               --------               ------
>   Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
>   Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online
> 
> ------------------------------------------------------------------
> 
> -- Quorum Summary from latest node reconfiguration --
> 
>   Quorum votes possible:      3
>   Quorum votes needed:        2
>   Quorum votes present:       3
> 
> 
> -- Quorum Votes by Node (current status) --
> 
>                     Node Name           Present Possible Status
>                     ---------           ------- -------- ------
>   Node votes:       xCid                1        1       Online
>   Node votes:       xCloud              1        1       Online
> 
> 
> -- Quorum Votes by Device (current status) --
> 
>                     Device Name         Present Possible Status
>                     -----------         ------- -------- ------
>   Device votes:     Auron               1        1       Online
> 
> ------------------------------------------------------------------
> 
> -- Device Group Servers --
> 
>                          Device Group        Primary             Secondary
>                          ------------        -------             ---------
> 
> 
> -- Device Group Status --
> 
>                               Device Group        Status              
>                               ------------        ------              
> 
> 
> -- Multi-owner Device Groups --
> 
>                               Device Group        Online Status
>                               ------------        -------------
> 
> ------------------------------------------------------------------
> 
> -- Resource Groups and Resources --
> 
>             Group Name     Resources
>             ----------     ---------
>  Resources: xCloud-rg      xCloud-nfsres r-nfs
>  Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs
> 
> 
> -- Resource Groups --
> 
>             Group Name     Node Name                State          Suspended
>             ----------     ---------                -----          ---------
>      Group: xCloud-rg      xCid                     Online         No
>      Group: xCloud-rg      xCloud                   Offline        No
> 
>      Group: nfs-rg         xCid                     Online         No
>      Group: nfs-rg         xCloud                   Offline        No
> 
> 
> -- Resources --
> 
>             Resource Name  Node Name                State          Status Message
>             -------------  ---------                -----          --------------
>   Resource: xCloud-nfsres  xCid                     Online         Online - LogicalHostname online.
>   Resource: xCloud-nfsres  xCloud                   Offline        Offline
> 
>   Resource: r-nfs          xCid                     Online         Online - Service is online.
>   Resource: r-nfs          xCloud                   Offline        Offline
> 
>   Resource: nfs-lh-rs      xCid                     Online         Online - LogicalHostname online.
>   Resource: nfs-lh-rs      xCloud                   Offline        Offline
> 
>   Resource: nfs-hastp-rs   xCid                     Online         Online
>   Resource: nfs-hastp-rs   xCloud                   Offline        Offline
> 
>   Resource: nfs-rs         xCid                     Online         Online - Service is online.
>   Resource: nfs-rs         xCloud                   Offline        Offline
> 
> ------------------------------------------------------------------
> 
> -- IPMP Groups --
> 
>               Node Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
>   IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online
> 
>   IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online
> 
> 
> -- IPMP Groups in Zones --
> 
>               Zone Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
> ------------------------------------------------------------------
> root@xCid:~#
> 
> 
> root@xCid:~# clnode show
> 
> === Cluster Nodes ===                          
> 
> Node Name:                                      xCid
>   Node ID:                                         1
>   Enabled:                                         yes
>   privatehostname:                                 clusternode1-priv
>   reboot_on_path_failure:                          disabled
>   globalzoneshares:                                1
>   defaultpsetmin:                                  1
>   quorum_vote:                                     1
>   quorum_defaultvote:                              1
>   quorum_resv_key:                                 0x4A9B35C600000001
>   Transport Adapter List:                          e1000g2, e1000g3
> 
> Node Name:                                      xCloud
>   Node ID:                                         2
>   Enabled:                                         yes
>   privatehostname:                                 clusternode2-priv
>   reboot_on_path_failure:                          disabled
>   globalzoneshares:                                1
>   defaultpsetmin:                                  1
>   quorum_vote:                                     1
>   quorum_defaultvote:                              1
>   quorum_resv_key:                                 0x4A9B35C600000002
>   Transport Adapter List:                          e1000g2, e1000g3
> 
> root@xCid:~#
> 
> 
> ******  Rebooted 1st node xCid; all of the resources transferred to 2nd node
> xCloud and came online on node xCloud  ******
> 
> root@xCloud:~# scstat -p
> ------------------------------------------------------------------
> 
> -- Cluster Nodes --
> 
>                     Node name           Status
>                     ---------           ------
>   Cluster node:     xCid                Online
>   Cluster node:     xCloud              Online
> 
> ------------------------------------------------------------------
> 
> -- Cluster Transport Paths --
> 
>                     Endpoint               Endpoint               Status
>                     --------               --------               ------
>   Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
>   Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online
> 
> ------------------------------------------------------------------
> 
> -- Quorum Summary from latest node reconfiguration --
> 
>   Quorum votes possible:      3
>   Quorum votes needed:        2
>   Quorum votes present:       3
> 
> 
> -- Quorum Votes by Node (current status) --
> 
>                     Node Name           Present Possible Status
>                     ---------           ------- -------- ------
>   Node votes:       xCid                1        1       Online
>   Node votes:       xCloud              1        1       Online
> 
> 
> -- Quorum Votes by Device (current status) --
> 
>                     Device Name         Present Possible Status
>                     -----------         ------- -------- ------
>   Device votes:     Auron               1        1       Online
> 
> ------------------------------------------------------------------
> 
> -- Device Group Servers --
> 
>                          Device Group        Primary             Secondary
>                          ------------        -------             ---------
> 
> 
> -- Device Group Status --
> 
>                               Device Group        Status              
>                               ------------        ------              
> 
> 
> -- Multi-owner Device Groups --
> 
>                               Device Group        Online Status
>                               ------------        -------------
> 
> ------------------------------------------------------------------
> 
> -- Resource Groups and Resources --
> 
>             Group Name     Resources
>             ----------     ---------
>  Resources: xCloud-rg      xCloud-nfsres r-nfs
>  Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs
> 
> 
> -- Resource Groups --
> 
>             Group Name     Node Name                State          Suspended
>             ----------     ---------                -----          ---------
>      Group: xCloud-rg      xCid                     Offline        No
>      Group: xCloud-rg      xCloud                   Online         No
> 
>      Group: nfs-rg         xCid                     Offline        No
>      Group: nfs-rg         xCloud                   Online         No
> 
> 
> -- Resources --
> 
>             Resource Name  Node Name                State          Status Message
>             -------------  ---------                -----          --------------
>   Resource: xCloud-nfsres  xCid                     Offline        Offline
>   Resource: xCloud-nfsres  xCloud                   Online         Online - LogicalHostname online.
> 
>   Resource: r-nfs          xCid                     Offline        Offline
>   Resource: r-nfs          xCloud                   Online         Online - Service is online.
> 
>   Resource: nfs-lh-rs      xCid                     Offline        Offline
>   Resource: nfs-lh-rs      xCloud                   Online         Online - LogicalHostname online.
> 
>   Resource: nfs-hastp-rs   xCid                     Offline        Offline
>   Resource: nfs-hastp-rs   xCloud                   Online         Online
> 
>   Resource: nfs-rs         xCid                     Offline        Offline
>   Resource: nfs-rs         xCloud                   Online         Online - Service is online.
> 
> ------------------------------------------------------------------
> 
> -- IPMP Groups --
> 
>               Node Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
>   IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online
> 
>   IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online
> 
> 
> -- IPMP Groups in Zones --
> 
>               Zone Name           Group   Status         Adapter   Status
>               ---------           -----   ------         -------   ------
> ------------------------------------------------------------------
> root@xCloud:~#
> 
>  
> ******  Waited about 5 minutes, then rebooted 2nd node xCloud, and node
> xCid panicked with the error below  ******
> 
> root@xCid:~# Notifying cluster that this node is panicking
> WARNING: CMM: Reading reservation keys from quorum device Auron failed with error 2.
> 
> panic[cpu0]/thread=ffffff02d0a623c0: CMM: Cluster lost operational quorum; aborting.
> 
> ffffff0011976b50 genunix:vcmn_err+2c ()
> ffffff0011976b60 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+1f ()
> ffffff0011976c40 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+8c ()
> ffffff0011976e30 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+57f ()
> ffffff0011976e70 cl_haci:__1cIcmm_implStransitions_thread6M_v_+b7 ()
> ffffff0011976e80 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+9 ()
> ffffff0011976ed0 cl_orb:cllwpwrapper+d7 ()
> ffffff0011976ee0 unix:thread_start+8 ()
> 
> syncing file systems... done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
>  51% done
> 
> The host log is attached.
> 
> I have gone through the SunCluster doc on how to set up SunCluster for
> OpenSolaris multiple times, but I don't see any steps that I missed.  Can you
> please help determine whether this is a setup issue or a bug?
> 
> Thanks,
> 
> Janey
> -- 
> This message posted from opensolaris.org


> _______________________________________________
> ha-clusters-discuss mailing list
> ha-clusters-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss


-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
