Hi Digimer,

No, we're not supporting multicast. I was trying to use broadcast, but Red Hat
support said it's better to use transport=udpu, which I did set, and it still
complains about a timeout.
I did try to set broadcast, but somehow it didn't work either.
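
For reference, this is roughly where those settings go in cluster.conf on
RHEL 6 (a sketch only, not my exact file):

    <!-- UDP unicast instead of multicast, as Red Hat support suggested -->
    <cman transport="udpu"/>

    <!-- or, to try broadcast instead -->
    <cman broadcast="yes"/>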

Let me give broadcast a try again.

Thanks,
Vinh

-----Original Message-----
From: linux-cluster-boun...@redhat.com 
[mailto:linux-cluster-boun...@redhat.com] On Behalf Of Digimer
Sent: Wednesday, January 07, 2015 5:51 PM
To: linux clustering
Subject: Re: [Linux-cluster] needs helps GFS2 on 5 nodes cluster

It looks like a network problem... Does your (virtual) switch support multicast 
properly and have you opened up the proper ports in the firewall?
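
For reference, a rough sketch of the usual RHEL 6 cluster ports (adjust to your
subnets and tooling; these are the commonly documented ones):

    iptables -I INPUT -p udp --dport 5404:5405 -j ACCEPT   # corosync/cman totem
    iptables -I INPUT -p tcp --dport 21064 -j ACCEPT       # dlm
    iptables -I INPUT -p tcp --dport 11111 -j ACCEPT       # ricci
    service iptables save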

On 07/01/15 05:32 PM, Cao, Vinh wrote:
> Hi Digimer,
>
> Yes, I just did. Looks like they are failing, and I'm not sure why.
> Please see the attachment for the logs from all servers.
>
> By the way, I do appreciate all the help I can get.
>
> Vinh
>
> -----Original Message-----
> From: linux-cluster-boun...@redhat.com 
> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Digimer
> Sent: Wednesday, January 07, 2015 4:33 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] needs helps GFS2 on 5 nodes cluster
>
> Quorum is enabled by default. I need to see the entire logs from all five 
> nodes, as I mentioned in the first email. Please disable cman from starting 
> on boot, configure fencing properly and then reboot all nodes cleanly. Start 
> the 'tail -f -n 0 /var/log/messages' on all five nodes, then in another 
> window, start cman on all five nodes. When things settle down, copy/paste all 
> the log output please.
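>
> Roughly, on each node (assuming the stock RHEL 6 init scripts):
>
>     chkconfig cman off                # keep cman from starting at boot
>     tail -f -n 0 /var/log/messages    # in one window, on every node
>     service cman start                # then, in another window, on every node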
>
> On 07/01/15 04:29 PM, Cao, Vinh wrote:
>> Hi Digimer,
>>
>> Here is from the logs:
>> [root@ustlvcmsp1954 ~]# tail -f /var/log/messages
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> loaded: corosync profile loading service
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [QUORUM] Using quorum 
>> provider quorum_cman
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> loaded: corosync cluster quorum service v0.1
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [MAIN  ] Compatibility mode 
>> set to whitetank.  Using V1 and V2 of the synchronization engine.
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [TOTEM ] A processor joined 
>> or left the membership and a new membership was formed.
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [QUORUM] Members[1]: 1
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [QUORUM] Members[1]: 1
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [CPG   ] chosen downlist: 
>> sender r(0) ip(10.30.197.108) ; members(old:0 left:0)
>> Jan  7 16:14:01 ustlvcmsp1954 corosync[8182]:   [MAIN  ] Completed service 
>> synchronization, ready to provide service.
>> Jan  7 16:14:01 ustlvcmsp1954 rgmanager[8099]: Waiting for quorum to form
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Unloading all 
>> Corosync service engines.
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync extended virtual synchrony service
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync configuration service
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync cluster closed process group service v1.01
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync cluster config database access v1.01
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync profile loading service
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: openais checkpoint service B.01.01
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync CMAN membership service 2.90
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [SERV  ] Service engine 
>> unloaded: corosync cluster quorum service v0.1
>> Jan  7 16:15:06 ustlvcmsp1954 corosync[8182]:   [MAIN  ] Corosync Cluster 
>> Engine exiting with status 0 at main.c:2055.
>> Jan  7 16:15:06 ustlvcmsp1954 rgmanager[8099]: Quorum formed
>>
>> Then it dies at:
>>    Starting cman...                                        [  OK  ]
>>      Waiting for quorum... Timed-out waiting for cluster
>>                                                              [FAILED]
>>
>> Yes, I did make the change with <fence_daemon post_join_delay="30"/>, but the
>> problem is still there. One thing I don't understand is why the cluster is
>> looking for quorum.
>> I don't have any quorum disk set up in the cluster.conf file.
>>
>> Any help I can get is appreciated.
>>
>> Vinh
>>
>> -----Original Message-----
>> From: linux-cluster-boun...@redhat.com 
>> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Digimer
>> Sent: Wednesday, January 07, 2015 3:59 PM
>> To: linux clustering
>> Subject: Re: [Linux-cluster] needs helps GFS2 on 5 nodes cluster
>>
>> On 07/01/15 03:39 PM, Cao, Vinh wrote:
>>> Hello Digimer,
>>>
>>> Yes, I agree with you that RHEL 6.4 is old. We patch monthly, but I'm not
>>> sure why these servers are still at 6.4. Most of our systems are at 6.6.
>>>
>>> Here is my cluster config. All I want is to use the cluster to get GFS2
>>> mounted via /etc/fstab.
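>>>
>>> For what it's worth, the fstab line I have in mind is something like this
>>> (device and mount point are placeholders):
>>>
>>>     /dev/vg_shared/lv_gfs2  /mnt/gfs2  gfs2  defaults,noatime  0 0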
>>> [root@ustlvcmsp1955 ~]# cat /etc/cluster/cluster.conf
>>> <?xml version="1.0"?>
>>> <cluster config_version="15" name="p1954_to_p1958">
>>>            <clusternodes>
>>>                    <clusternode name="ustlvcmsp1954" nodeid="1"/>
>>>                    <clusternode name="ustlvcmsp1955" nodeid="2"/>
>>>                    <clusternode name="ustlvcmsp1956" nodeid="3"/>
>>>                    <clusternode name="ustlvcmsp1957" nodeid="4"/>
>>>                    <clusternode name="ustlvcmsp1958" nodeid="5"/>
>>>            </clusternodes>
>>
>> You haven't configured fencing for the nodes... If anything triggers a fence,
>> the cluster will lock up (by design).
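>>
>> Per node, something along these lines (a sketch; the exact fence_vmware_soap
>> options, e.g. port/uuid/ssl, depend on your VMware setup):
>>
>>            <clusternode name="ustlvcmsp1954" nodeid="1">
>>                    <fence>
>>                            <method name="1">
>>                                    <device name="p1954" port="ustlvcmsp1954" ssl="on"/>
>>                            </method>
>>                    </fence>
>>            </clusternode>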
>>
>>>            <fencedevices>
>>>                    <fencedevice agent="fence_vmware_soap" 
>>> ipaddr="10.30.197.108" login="rhfence" name="p1954" passwd="xxxxxxxx"/>
>>>                    <fencedevice agent="fence_vmware_soap" 
>>> ipaddr="10.30.197.109" login="rhfence" name="p1955" passwd=" xxxxxxxx "/>
>>>                    <fencedevice agent="fence_vmware_soap" 
>>> ipaddr="10.30.197.110" login="rhfence" name="p1956" passwd=" xxxxxxxx "/>
>>>                    <fencedevice agent="fence_vmware_soap" 
>>> ipaddr="10.30.197.111" login="rhfence" name="p1957" passwd=" xxxxxxxx "/>
>>>                    <fencedevice agent="fence_vmware_soap" 
>>> ipaddr="10.30.197.112" login="rhfence" name="p1958" passwd=" xxxxxxxx "/>
>>>            </fencedevices>
>>> </cluster>
>>>
>>> clustat shows:
>>>
>>> Cluster Status for p1954_to_p1958 @ Wed Jan  7 15:38:00 2015
>>> Member Status: Quorate
>>>
>>>  Member Name                                  ID   Status
>>>  ------ ----                                  ---- ------
>>>  ustlvcmsp1954                                   1 Offline
>>>  ustlvcmsp1955                                   2 Online, Local
>>>  ustlvcmsp1956                                   3 Online
>>>  ustlvcmsp1957                                   4 Offline
>>>  ustlvcmsp1958                                   5 Online
>>>
>>> I need to get them all online so I can use fencing and mount the shared
>>> disk.
>>>
>>> Thanks,
>>> Vinh
>>
>> What about the log entries from the start-up? Did you try the 
>> post_join_delay config?
>>
>>
>>> -----Original Message-----
>>> From: linux-cluster-boun...@redhat.com 
>>> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Digimer
>>> Sent: Wednesday, January 07, 2015 3:16 PM
>>> To: linux clustering
>>> Subject: Re: [Linux-cluster] needs helps GFS2 on 5 nodes cluster
>>>
>>> My first thought would be to set <fence_daemon post_join_delay="30" /> in
>>> cluster.conf.
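>>>
>>> For example (placement sketch only; remember to bump config_version and push
>>> the change, e.g. with 'cman_tool version -r', or restart cman):
>>>
>>>     <cluster config_version="..." name="...">
>>>             <fence_daemon post_join_delay="30"/>
>>>             ...
>>>     </cluster>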
>>>
>>> If that doesn't work, please share your configuration file. Then, with all 
>>> nodes offline, open a terminal to each node and run 'tail -f -n 0 
>>> /var/log/messages'. With that running, start all the nodes and wait for 
>>> things to settle down, then paste the five nodes' output as well.
>>>
>>> Also, 6.4 is pretty old, why not upgrade to 6.6?
>>>
>>> digimer
>>>
>>> On 07/01/15 03:10 PM, Cao, Vinh wrote:
>>>> Hello Cluster guru,
>>>>
>>>> I'm trying to set up a Red Hat 6.4 OS cluster with 5 nodes. With two
>>>> nodes I don't have any issue.
>>>>
>>>> But with 5 nodes, when I run clustat I get 3 nodes online and the
>>>> other two offline.
>>>>
>>>> When I start one of the offline nodes with 'service cman start', I get:
>>>>
>>>> [root@ustlvcmspxxx ~]# service cman status
>>>>
>>>> corosync is stopped
>>>>
>>>> [root@ustlvcmsp1954 ~]# service cman start
>>>>
>>>> Starting cluster:
>>>>
>>>>        Checking if cluster has been disabled at boot...        [  OK  ]
>>>>
>>>>        Checking Network Manager...                             [  OK  ]
>>>>
>>>>        Global setup...                                         [  OK  ]
>>>>
>>>>        Loading kernel modules...                               [  OK  ]
>>>>
>>>>        Mounting configfs...                                    [  OK  ]
>>>>
>>>>        Starting cman...                                        [  OK  ]
>>>>
>>>> Waiting for quorum... Timed-out waiting for cluster
>>>>
>>>>
>>>> [FAILED]
>>>>
>>>> Stopping cluster:
>>>>
>>>>        Leaving fence domain...                                 [  OK  ]
>>>>
>>>>        Stopping gfs_controld...                                [  OK  ]
>>>>
>>>>        Stopping dlm_controld...                                [  OK  ]
>>>>
>>>>        Stopping fenced...                                      [  OK  ]
>>>>
>>>>        Stopping cman...                                        [  OK  ]
>>>>
>>>>        Waiting for corosync to shutdown:                       [  OK  ]
>>>>
>>>>        Unloading kernel modules...                             [  OK  ]
>>>>
>>>>        Unmounting configfs...                                  [  OK  ]
>>>>
>>>> Can you help?
>>>>
>>>> Thank you,
>>>>
>>>> Vinh
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is 
>>> trapped in the mind of a person without access to education?
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster@redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>
>>
>
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is 
> trapped in the mind of a person without access to education?
>
> --
> Linux-cluster mailing list
> Linux-cluster@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>


--
Digimer
Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is 
trapped in the mind of a person without access to education?

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

