Re: [Linux-cluster] daemon cpg_join error retrying

Andrew Beekhof Wed, 29 Oct 2014 15:24:02 -0700

> On 30 Oct 2014, at 9:06 am, Lax Kota (lkota) <[email protected]> wrote:
> 
>> I wonder if there is a mismatch between the cluster name in cluster.conf and 
>> the cluster name the GFS filesystem was created with.
> How do I check the cluster name of a GFS file system? I have a similar 
> configuration running fine in several other setups with no such issue.


I don't really recall. Hopefully someone more familiar with GFS2 can chime in.
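
A minimal sketch of the comparison itself, assuming the GFS2 lock table has the usual `clustername:fsname` form given to `mkfs.gfs2 -t`, and that the cluster name is the `name` attribute of the top-level `<cluster>` element in cluster.conf. Reading the lock table off the device is left out here (something like `gfs2_tool sb <device> table` should print it, but verify that on your version). The function names and sample values below are hypothetical.

```python
import xml.etree.ElementTree as ET


def cluster_name_from_conf(conf_xml: str) -> str:
    """Return the name attribute of the top-level <cluster> element."""
    root = ET.fromstring(conf_xml)
    if root.tag != "cluster":
        raise ValueError("top-level element is not <cluster>")
    return root.get("name", "")


def lock_table_matches(conf_xml: str, lock_table: str) -> bool:
    """Compare the cluster.conf name with the prefix of a GFS2 lock table.

    Assumes the lock table is of the form 'clustername:fsname'.
    """
    fs_cluster, sep, _fs_name = lock_table.partition(":")
    if not sep:
        raise ValueError("lock table is not of the form clustername:fsname")
    return fs_cluster == cluster_name_from_conf(conf_xml)


# Hypothetical values, for illustration only.
SAMPLE_CONF = '<cluster name="mycluster" config_version="1"><clusternodes/></cluster>'

if __name__ == "__main__":
    print(lock_table_matches(SAMPLE_CONF, "mycluster:gfs01"))
    print(lock_table_matches(SAMPLE_CONF, "othercluster:gfs01"))
```

In practice you would pass in the contents of /etc/cluster/cluster.conf and the lock table string read from the filesystem. A mismatch would at least be consistent with the cpg_join retries, though it is not the only possible cause.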

> 
> Also, one more issue I am seeing in another setup is a repeated flood of 'A 
> processor joined or left the membership and a new membership was formed' 
> messages every 4 secs. I am running with default TOTEM settings, with the 
> token timeout at 10 secs. Even after I increase the token and consensus 
> values, the same message keeps flooding at the newly defined consensus 
> interval (e.g. if I increase it to 10 secs, then I see new membership formed 
> messages every 10 secs).
> 
> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]:   [TOTEM ] A processor joined 
> or left the membership and a new membership was formed.
> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]:   [CPG   ] chosen downlist: 
> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]:   [MAIN  ] Completed service 
> synchronization, ready to provide service.
> 
> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]:   [TOTEM ] A processor joined 
> or left the membership and a new membership was formed.
> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]:   [CPG   ] chosen downlist: 
> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]:   [MAIN  ] Completed service 
> synchronization, ready to provide service.

It does not sound like your network is particularly healthy.
Are you using multicast or udpu? If multicast, it might be worth trying udpu.
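
A rough way to check which one is configured, assuming a cman-based stack where the transport is set via the `transport` attribute on the `<cman>` element of cluster.conf (e.g. `<cman transport="udpu"/>`), and that multicast is in use when the attribute is absent. Both of those are assumptions to confirm against your version's documentation. The function name and sample configs are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def cman_transport(conf_xml: str) -> Optional[str]:
    """Return the transport attribute of <cman>, or None if it is not set.

    Assumption: None means the stack falls back to its default (multicast).
    """
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    if cman is None:
        return None
    return cman.get("transport")


# Hypothetical configs, for illustration only.
CONF_UDPU = '<cluster name="mycluster" config_version="2"><cman transport="udpu"/></cluster>'
CONF_DEFAULT = '<cluster name="mycluster" config_version="1"><cman two_node="1"/></cluster>'

if __name__ == "__main__":
    for label, conf in (("udpu", CONF_UDPU), ("default", CONF_DEFAULT)):
        transport = cman_transport(conf)
        print(label, "->", transport if transport else "not set (assumed multicast)")
```

If this reports that no transport is set, switching to udpu takes the multicast path through the switches out of the picture, which helps narrow down whether the network is behind the repeated membership changes.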

> 
> Thanks
> Lax
> 
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Andrew Beekhof
> Sent: Wednesday, October 29, 2014 2:42 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] daemon cpg_join error retrying
> 
> 
>> On 30 Oct 2014, at 8:38 am, Lax Kota (lkota) <[email protected]> wrote:
>> 
>> Hi All,
>> 
>> In one of my setups, I keep getting 'gfs_controld[10744]: daemon 
>> cpg_join error  retrying'. I have a 2-node setup with pacemaker and corosync.
> 
> I wonder if there is a mismatch between the cluster name in cluster.conf and 
> the cluster name the GFS filesystem was created with.
> 
>> 
>> Even after I force-kill the pacemaker processes, reboot the server, and 
>> bring pacemaker back up, it keeps giving the cpg_join error. Is there any 
>> way to fix this issue?
>> 
>> 
>> Thanks
>> Lax
>> 
>> -- 
>> Linux-cluster mailing list
>> [email protected]
>> https://www.redhat.com/mailman/listinfo/linux-cluster
