After setting up SunCluster on OpenSolaris, when I reboot the second node of 
the cluster, my first node panics.  Can you please let me know if there is 
anyone I can contact to find out whether this is a setup issue or a cluster bug? 

Below is the setup that I had:

-       2x1 (2 OpenSolaris 2009.06 x86 hosts, named xCid and xCloud, connected 
to one FC array)
-       Created 32 volumes and mapped them to the host group; the host group 
contains the 2 cluster nodes
-       Formatted the volumes
-       Set up the cluster with a quorum server named Auron (both nodes joined 
the cluster; all of the resource groups and resources are online on the 1st 
node, xCid)

Below is the status of the cluster before rebooting the nodes.
root@xCid:~# scstat -p
------------------------------------------------------------------

-- Cluster Nodes --

                    Node name           Status
                    ---------           ------
  Cluster node:     xCid                Online
  Cluster node:     xCloud              Online

------------------------------------------------------------------

-- Cluster Transport Paths --

                    Endpoint               Endpoint               Status
                    --------               --------               ------
  Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
  Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online

------------------------------------------------------------------

-- Quorum Summary from latest node reconfiguration --

  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       3


-- Quorum Votes by Node (current status) --

                    Node Name           Present Possible Status
                    ---------           ------- -------- ------
  Node votes:       xCid                1        1       Online
  Node votes:       xCloud              1        1       Online


-- Quorum Votes by Device (current status) --

                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     Auron               1        1       Online

------------------------------------------------------------------

-- Device Group Servers --

                         Device Group        Primary             Secondary
                         ------------        -------             ---------


-- Device Group Status --

                              Device Group        Status              
                              ------------        ------              


-- Multi-owner Device Groups --

                              Device Group        Online Status
                              ------------        -------------

------------------------------------------------------------------

-- Resource Groups and Resources --

            Group Name     Resources
            ----------     ---------
 Resources: xCloud-rg      xCloud-nfsres r-nfs
 Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs


-- Resource Groups --

            Group Name     Node Name                State          Suspended
            ----------     ---------                -----          ---------
     Group: xCloud-rg      xCid                     Online         No
     Group: xCloud-rg      xCloud                   Offline        No

     Group: nfs-rg         xCid                     Online         No
     Group: nfs-rg         xCloud                   Offline        No


-- Resources --

            Resource Name  Node Name                State          Status Message
            -------------  ---------                -----          --------------
  Resource: xCloud-nfsres  xCid                     Online         Online - LogicalHostname online.
  Resource: xCloud-nfsres  xCloud                   Offline        Offline

  Resource: r-nfs          xCid                     Online         Online - Service is online.
  Resource: r-nfs          xCloud                   Offline        Offline

  Resource: nfs-lh-rs      xCid                     Online         Online - LogicalHostname online.
  Resource: nfs-lh-rs      xCloud                   Offline        Offline

  Resource: nfs-hastp-rs   xCid                     Online         Online
  Resource: nfs-hastp-rs   xCloud                   Offline        Offline

  Resource: nfs-rs         xCid                     Online         Online - Service is online.
  Resource: nfs-rs         xCloud                   Offline        Offline

------------------------------------------------------------------

-- IPMP Groups --

              Node Name           Group   Status         Adapter   Status
              ---------           -----   ------         -------   ------
  IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online

  IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online


-- IPMP Groups in Zones --

              Zone Name           Group   Status         Adapter   Status
              ---------           -----   ------         -------   ------
------------------------------------------------------------------
root@xCid:~# 


root@xCid:~# clnode show

=== Cluster Nodes ===                          

Node Name:                                      xCid
  Node ID:                                         1
  Enabled:                                         yes
  privatehostname:                                 clusternode1-priv
  reboot_on_path_failure:                          disabled
  globalzoneshares:                                1
  defaultpsetmin:                                  1
  quorum_vote:                                     1
  quorum_defaultvote:                              1
  quorum_resv_key:                                 0x4A9B35C600000001
  Transport Adapter List:                          e1000g2, e1000g3

Node Name:                                      xCloud
  Node ID:                                         2
  Enabled:                                         yes
  privatehostname:                                 clusternode2-priv
  reboot_on_path_failure:                          disabled
  globalzoneshares:                                1
  defaultpsetmin:                                  1
  quorum_vote:                                     1
  quorum_defaultvote:                              1
  quorum_resv_key:                                 0x4A9B35C600000002
  Transport Adapter List:                          e1000g2, e1000g3

root@xCid:~#


******  Rebooted 1st node xCid; all of the resources transferred to 2nd node 
xCloud and came online there  ******

root@xCloud:~# scstat -p
------------------------------------------------------------------

-- Cluster Nodes --

                    Node name           Status
                    ---------           ------
  Cluster node:     xCid                Online
  Cluster node:     xCloud              Online

------------------------------------------------------------------

-- Cluster Transport Paths --

                    Endpoint               Endpoint               Status
                    --------               --------               ------
  Transport path:   xCid:e1000g3           xCloud:e1000g3         Path online
  Transport path:   xCid:e1000g2           xCloud:e1000g2         Path online

------------------------------------------------------------------

-- Quorum Summary from latest node reconfiguration --

  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       3


-- Quorum Votes by Node (current status) --

                    Node Name           Present Possible Status
                    ---------           ------- -------- ------
  Node votes:       xCid                1        1       Online
  Node votes:       xCloud              1        1       Online


-- Quorum Votes by Device (current status) --

                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     Auron               1        1       Online

------------------------------------------------------------------

-- Device Group Servers --

                         Device Group        Primary             Secondary
                         ------------        -------             ---------


-- Device Group Status --

                              Device Group        Status              
                              ------------        ------              


-- Multi-owner Device Groups --

                              Device Group        Online Status
                              ------------        -------------

------------------------------------------------------------------

-- Resource Groups and Resources --

            Group Name     Resources
            ----------     ---------
 Resources: xCloud-rg      xCloud-nfsres r-nfs
 Resources: nfs-rg         nfs-lh-rs nfs-hastp-rs nfs-rs


-- Resource Groups --

            Group Name     Node Name                State          Suspended
            ----------     ---------                -----          ---------
     Group: xCloud-rg      xCid                     Offline        No
     Group: xCloud-rg      xCloud                   Online         No

     Group: nfs-rg         xCid                     Offline        No
     Group: nfs-rg         xCloud                   Online         No


-- Resources --

            Resource Name  Node Name                State          Status Message
            -------------  ---------                -----          --------------
  Resource: xCloud-nfsres  xCid                     Offline        Offline
  Resource: xCloud-nfsres  xCloud                   Online         Online - LogicalHostname online.

  Resource: r-nfs          xCid                     Offline        Offline
  Resource: r-nfs          xCloud                   Online         Online - Service is online.

  Resource: nfs-lh-rs      xCid                     Offline        Offline
  Resource: nfs-lh-rs      xCloud                   Online         Online - LogicalHostname online.

  Resource: nfs-hastp-rs   xCid                     Offline        Offline
  Resource: nfs-hastp-rs   xCloud                   Online         Online

  Resource: nfs-rs         xCid                     Offline        Offline
  Resource: nfs-rs         xCloud                   Online         Online - Service is online.

------------------------------------------------------------------

-- IPMP Groups --

              Node Name           Group   Status         Adapter   Status
              ---------           -----   ------         -------   ------
  IPMP Group: xCid                sc_ipmp0 Online         e1000g1   Online

  IPMP Group: xCloud              sc_ipmp0 Online         e1000g0   Online


-- IPMP Groups in Zones --

              Zone Name           Group   Status         Adapter   Status
              ---------           -----   ------         -------   ------
------------------------------------------------------------------
root@xCloud:~# 

 
******  Waited about 5 minutes, then rebooted 2nd node xCloud; node xCid 
panicked with the error below  ******

root@xCid:~# Notifying cluster that this node is panicking
WARNING: CMM: Reading reservation keys from quorum device Auron failed with error 2.

panic[cpu0]/thread=ffffff02d0a623c0: CMM: Cluster lost operational quorum; aborting.

ffffff0011976b50 genunix:vcmn_err+2c ()
ffffff0011976b60 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+1f ()
ffffff0011976c40 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+8c ()
ffffff0011976e30 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+57f ()
ffffff0011976e70 cl_haci:__1cIcmm_implStransitions_thread6M_v_+b7 ()
ffffff0011976e80 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+9 ()
ffffff0011976ed0 cl_orb:cllwpwrapper+d7 ()
ffffff0011976ee0 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
 51% done
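
The panic message above says the node lost operational quorum. As a rough 
illustration only (not taken from the cluster source), the vote arithmetic 
that scstat reports can be sketched in Python. The majority rule in 
votes_needed() is an assumption that happens to match the "possible 3 / 
needed 2" figures above, and treating the failed read from Auron as a 
missing vote is likewise an assumption. All function and variable names are 
hypothetical.

```python
def votes_needed(votes_possible: int) -> int:
    """Assumed majority rule: strictly more than half of all configured votes."""
    return votes_possible // 2 + 1


def has_quorum(votes_present: int, votes_possible: int) -> bool:
    """True if the votes currently counted reach the assumed majority."""
    return votes_present >= votes_needed(votes_possible)


# Vote counts copied from the scstat output: one vote per node,
# plus one vote for the quorum server Auron.
VOTES = {"xCid": 1, "xCloud": 1, "Auron": 1}
possible = sum(VOTES.values())

# Both nodes up and Auron reachable (the state shown by scstat).
healthy = has_quorum(sum(VOTES.values()), possible)

# xCloud rebooting and Auron reachable: xCid plus Auron.
with_quorum_server = has_quorum(VOTES["xCid"] + VOTES["Auron"], possible)

# xCloud rebooting and Auron's vote not counted (assumed effect of the
# "Reading reservation keys ... failed" warning): xCid alone.
xcid_alone = has_quorum(VOTES["xCid"], possible)

print(votes_needed(possible), healthy, with_quorum_server, xcid_alone)
```

Under these assumptions, xCid holds only 1 of 3 votes while xCloud is 
rebooting, so it depends on Auron's vote to stay up.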

The host log is attached.

I have gone through the SunCluster documentation on how to set up SunCluster 
for OpenSolaris multiple times, but I don't see any steps that I missed.  Can 
you please help determine whether this is a setup issue or a bug?

Thanks,

Janey
-- 
This message posted from opensolaris.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xCid_message
Type: application/octet-stream
Size: 169719 bytes
Desc: not available
URL: 
<http://mail.opensolaris.org/pipermail/ha-clusters-discuss/attachments/20090904/cac9235f/attachment-0001.obj>
