> This is just weird. What exact version of corosync are you running? Do you
> have the latest Z stream?
I am running Corosync 1.4.1, and the pacemaker version is 1.1.8-7.el6.
How do I get access to the Z stream? Is there a specific directory or repository
I should pull it from?
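(For reference, I checked the exact installed build with:

  rpm -q corosync corosynclib

I am assuming the Z-stream update would simply come through the normal RHEL
errata channels, i.e. something like "yum update corosync corosynclib" on a
registered system, but please correct me if it lives in a separate repository.)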
Thanks
Lax
-----Original Message-----
From: linux-cluster-boun...@redhat.com
[mailto:linux-cluster-boun...@redhat.com] On Behalf Of Jan Friesse
Sent: Friday, October 31, 2014 9:43 AM
To: linux clustering
Subject: Re: [Linux-cluster] daemon cpg_join error retrying
Lax,
> Thanks Honza. Here is what I was doing,
>
>> usual reasons for this problem:
>> 1. mtu is too high and fragmented packets are not enabled (take a look at
>>    the netmtu configuration option)
> I am running with the default MTU of 1500, and the interface (eth1) on the
> box also reports an MTU of 1500.
>
Keep in mind that if the nodes are not directly connected, a switch in between
can still drop packets because of its own MTU.
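A quick way to test that (just a sketch; substitute your peer's address) is to
send a full-sized packet with the don't-fragment bit set and see whether it
gets through:

  # 1472 = 1500 MTU - 20 byte IP header - 8 byte ICMP header
  ping -M do -c 3 -s 1472 172.28.0.65

If that fails while smaller sizes work, something on the path is dropping large
frames and lowering netmtu is worth trying.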
>
> 2. config file on nodes are not in sync and one node may contain more node
>    entries than other nodes (this may also be the case if you have two
>    clusters and one cluster contains an entry of a node from the other cluster)
> 3. firewall is asymmetrically blocked (so a node can send but not receive).
>    Also keep in mind that ports 5404 & 5405 may not be enough for udpu,
>    because udpu uses one socket per remote node for sending.
> Verified my config files cluster.conf and cib.xml and both have the same
> number of node entries (2).
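Good. To be really sure both nodes load identical configuration, comparing the
checksum and the in-memory config version on both nodes is usually enough, for
example:

  md5sum /etc/cluster/cluster.conf
  cman_tool version
  ccs_config_validate

(ccs_config_validate is just a sanity check of the file itself; adjust the path
if your configuration lives elsewhere.)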
>
>> I would recommend disabling the firewall completely (for testing); if
>> everything works, you just need to adjust the firewall.
> I also ran tests with the firewall off on both participating nodes and still
> see the same issue.
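OK. If you re-enable the firewall later, remember that with udpu the per-node
sending sockets may not use the well-known ports, so the simplest safe rule is
to allow all UDP between the cluster interfaces rather than just 5404/5405.
Roughly (adjust addresses and interface to your setup):

  iptables -A INPUT -i eth1 -p udp -s 172.28.0.64 -j ACCEPT
  iptables -A INPUT -i eth1 -p udp -s 172.28.0.65 -j ACCEPT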
>
> In the corosync log I see the following set of messages repeating; hopefully
> they give some more pointers.
>
> Oct 29 22:11:02 corosync [SYNC ] Committing synchronization for (corosync cluster closed process group service v1.01)
> Oct 29 22:11:02 corosync [MAIN ] Completed service synchronization, ready to provide service.
> Oct 29 22:11:02 corosync [TOTEM ] waiting_trans_ack changed to 0
> Oct 29 22:11:03 corosync [TOTEM ] entering GATHER state from 11.
> Oct 29 22:11:03 corosync [TOTEM ] entering GATHER state from 10.
> Oct 29 22:11:05 corosync [TOTEM ] entering GATHER state from 0.
This is just weird. What exact version of corosync are you running? Do you have
the latest Z stream?
Regards,
Honza
> Oct 29 22:11:05 corosync [TOTEM ] got commit token
> Oct 29 22:11:05 corosync [TOTEM ] Saving state aru 1b high seq received 1b
> Oct 29 22:11:05 corosync [TOTEM ] Storing new sequence id for ring 51708
> Oct 29 22:11:05 corosync [TOTEM ] entering COMMIT state.
> Oct 29 22:11:05 corosync [TOTEM ] got commit token
> Oct 29 22:11:05 corosync [TOTEM ] entering RECOVERY state.
> Oct 29 22:11:05 corosync [TOTEM ] TRANS [0] member 172.28.0.64:
> Oct 29 22:11:05 corosync [TOTEM ] TRANS [1] member 172.28.0.65:
> Oct 29 22:11:05 corosync [TOTEM ] position [0] member 172.28.0.64:
> Oct 29 22:11:05 corosync [TOTEM ] previous ring seq 333572 rep 172.28.0.64
> Oct 29 22:11:05 corosync [TOTEM ] aru 1b high delivered 1b received flag 1
> Oct 29 22:11:05 corosync [TOTEM ] position [1] member 172.28.0.65:
> Oct 29 22:11:05 corosync [TOTEM ] previous ring seq 333572 rep 172.28.0.64
> Oct 29 22:11:05 corosync [TOTEM ] aru 1b high delivered 1b received flag 1
> Oct 29 22:11:05 corosync [TOTEM ] Did not need to originate any messages in recovery.
> Oct 29 22:11:05 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru ffffffff
> Oct 29 22:11:05 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
> Oct 29 22:11:05 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
> Oct 29 22:11:05 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
> Oct 29 22:11:05 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
> Oct 29 22:11:05 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
> Oct 29 22:11:05 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
> Oct 29 22:11:05 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
> Oct 29 22:11:05 corosync [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
> Oct 29 22:11:05 corosync [TOTEM ] Resetting old ring state
> Oct 29 22:11:05 corosync [TOTEM ] recovery to regular 1-0
> Oct 29 22:11:05 corosync [CMAN ] ais: confchg_fn called type = 1, seq=333576
> Oct 29 22:11:05 corosync [TOTEM ] waiting_trans_ack changed to 1
> Oct 29 22:11:05 corosync [CMAN ] ais: confchg_fn called type = 0, seq=333576
> Oct 29 22:11:05 corosync [CMAN ] ais: last memb_count = 2, current = 2
> Oct 29 22:11:05 corosync [CMAN ] memb: sending TRANSITION message. cluster_name = vsomcluster
> Oct 29 22:11:05 corosync [CMAN ] ais: comms send message 0x7fff8185ca00 len = 65
> Oct 29 22:11:05 corosync [CMAN ] daemon: sending reply 103 to fd 24
> Oct 29 22:11:05 corosync [CMAN ] daemon: sending reply 103 to fd 34
> Oct 29 22:11:05 corosync [SYNC ] This node is within the primary component and will provide service.
> Oct 29 22:11:05 corosync [TOTEM ] entering OPERATIONAL state.
> Oct 29 22:11:05 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Oct 29 22:11:05 corosync [CMAN ] ais: deliver_fn source nodeid = 2, len=81, endian_conv=0
> Oct 29 22:11:05 corosync [CMAN ] memb: Message on port 0 is 5
> Oct 29 22:11:05 corosync [CMAN ] memb: got TRANSITION from node 2
> Oct 29 22:11:05 corosync [CMAN ] memb: Got TRANSITION message. msg->flags=20, node->flags=20, first_trans=0
> Oct 29 22:11:05 corosync [CMAN ] memb: add_ais_node ID=2, incarnation = 333576
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 0.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [CMAN ] ais: deliver_fn source nodeid = 1, len=81, endian_conv=0
> Oct 29 22:11:05 corosync [CMAN ] memb: Message on port 0 is 5
> Oct 29 22:11:05 corosync [CMAN ] memb: got TRANSITION from node 1
> Oct 29 22:11:05 corosync [CMAN ] Completed first transition with nodes on the same config versions
> Oct 29 22:11:05 corosync [CMAN ] memb: Got TRANSITION message. msg->flags=20, node->flags=20, first_trans=0
> Oct 29 22:11:05 corosync [CMAN ] memb: add_ais_node ID=1, incarnation = 333576
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [SYNC ] Synchronization actions starting for (dummy CLM service)
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 0.
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [SYNC ] Committing synchronization for (dummy CLM service)
> Oct 29 22:11:05 corosync [SYNC ] Synchronization actions starting for (dummy AMF service)
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 0.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [SYNC ] Committing synchronization for (dummy AMF service)
> Oct 29 22:11:05 corosync [SYNC ] Synchronization actions starting for (openais checkpoint service B.01.01)
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 0.
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [SYNC ] Committing synchronization for (openais checkpoint service B.01.01)
> Oct 29 22:11:05 corosync [SYNC ] Synchronization actions starting for (dummy EVT service)
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 0.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [SYNC ] Committing synchronization for (dummy EVT service)
> Oct 29 22:11:05 corosync [SYNC ] Synchronization actions starting for (corosync cluster closed process group service v1.01)
> Oct 29 22:11:05 corosync [CPG ] got joinlist message from node 1
> Oct 29 22:11:05 corosync [CPG ] comparing: sender r(0) ip(172.28.0.65) ; members(old:2 left:0)
> Oct 29 22:11:05 corosync [CPG ] comparing: sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
> Oct 29 22:11:05 corosync [CPG ] chosen downlist: sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
> Oct 29 22:11:05 corosync [CPG ] got joinlist message from node 2
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 1
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 0.
> Oct 29 22:11:05 corosync [SYNC ] confchg entries 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier Start Received From 2
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 1 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Barrier completion status for nodeid 2 = 1.
> Oct 29 22:11:05 corosync [SYNC ] Synchronization barrier completed
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[0] group:crmd\x00, ip:r(0) ip(172.28.0.65) , pid:9198
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[1] group:attrd\x00, ip:r(0) ip(172.28.0.65) , pid:9196
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[2] group:stonith-ng\x00, ip:r(0) ip(172.28.0.65) , pid:9194
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[3] group:cib\x00, ip:r(0) ip(172.28.0.65) , pid:9193
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[4] group:pcmk\x00, ip:r(0) ip(172.28.0.65) , pid:9187
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[5] group:gfs:controld\x00, ip:r(0) ip(172.28.0.65) , pid:9111
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[6] group:dlm:controld\x00, ip:r(0) ip(172.28.0.65) , pid:9057
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[7] group:fenced:default\x00, ip:r(0) ip(172.28.0.65) , pid:9040
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[8] group:fenced:daemon\x00, ip:r(0) ip(172.28.0.65) , pid:9040
> Oct 29 22:11:05 corosync [CPG ] joinlist_messages[9] group:crmd\x00, ip:r(0) ip(172.28.0.64) , pid:14530
> Oct 29 22:11:05 corosync [SYNC ] Committing synchronization for (corosync cluster closed process group service v1.01)
> Oct 29 22:11:05 corosync [MAIN ] Completed service synchronization, ready to provide service.
>
> Thanks
> Lax
>
>
> -----Original Message-----
> From: linux-cluster-boun...@redhat.com
> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Jan Friesse
> Sent: Thursday, October 30, 2014 1:23 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] daemon cpg_join error retrying
>
>>
>>> On 30 Oct 2014, at 9:32 am, Lax Kota (lkota) <lk...@cisco.com> wrote:
>>>
>>>
>>>>> I wonder if there is a mismatch between the cluster name in cluster.conf
>>>>> and the cluster name the GFS filesystem was created with.
>>>>> How do I check the cluster name of the GFS file system? I had a similar
>>>>> configuration running fine in multiple other setups with no such issue.
>>>
>>>> I don't really recall. Hopefully someone more familiar with GFS2 can chime
>>>> in.
>>> Ok.
>>>
>>>>>
>>>>> One more issue I am seeing in another setup is a repeated flood of
>>>>> 'A processor joined or left the membership and a new membership was
>>>>> formed' messages every 4 secs. I am running with default TOTEM settings
>>>>> (token timeout 10 secs). Even after I increase the token and consensus
>>>>> values, it keeps flooding the same message at the newly defined consensus
>>>>> interval (e.g. if I increase it to 10 secs, I see the new-membership
>>>>> messages every 10 secs).
>>>>>
>>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [TOTEM ] A processor
>>>>> joined or left the membership and a new membership was formed.
>>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [CPG ] chosen downlist:
>>>>> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
>>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [MAIN ] Completed
>>>>> service synchronization, ready to provide service.
>>>>>
>>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [TOTEM ] A processor
>>>>> joined or left the membership and a new membership was formed.
>>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [CPG ] chosen downlist:
>>>>> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
>>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [MAIN ] Completed
>>>>> service synchronization, ready to provide service.
>>>
>>>> It does not sound like your network is particularly healthy.
>>>> Are you using multicast or udpu? If multicast, it might be worth
>>>> trying udpu
>>>
>>> I am using udpu and I also have the firewall opened for ports 5404 & 5405.
>>> Tcpdump looks fine too; it does not show any issues. This is a VM
>>> environment, and even if I switch to the other node within the same VM I
>>> keep getting the same failure.
>>
>> Depending on what the host and VMs are doing, that might be your problem.
>> In any case, I will defer to the corosync guys at this point.
>>
>
> Lax,
> usual reasons for this problem:
> 1. mtu is too high and fragmented packets are not enabled (take a look at the
>    netmtu configuration option)
> 2. config file on nodes are not in sync and one node may contain more node
>    entries than other nodes (this may also be the case if you have two
>    clusters and one cluster contains an entry of a node from the other cluster)
> 3. firewall is asymmetrically blocked (so a node can send but not receive).
>    Also keep in mind that ports 5404 & 5405 may not be enough for udpu,
>    because udpu uses one socket per remote node for sending.
>
> I would recommend disabling the firewall completely (for testing); if
> everything works, you just need to adjust the firewall.
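> For example, to force smaller totem packets you could set netmtu (the value
> below is only an example; pick something your network is known to pass
> unfragmented):
>
>   in corosync.conf:  totem { ... netmtu: 1200 ... }
>
> With cman the same thing should be expressible as <totem netmtu="1200"/> in
> cluster.conf, but please verify against the cluster.conf documentation for
> your release.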
>
> Regards,
> Honza
>
>
>
>>>
>>> Thanks
>>> Lax
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: linux-cluster-boun...@redhat.com
>>> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Andrew
>>> Beekhof
>>> Sent: Wednesday, October 29, 2014 3:17 PM
>>> To: linux clustering
>>> Subject: Re: [Linux-cluster] daemon cpg_join error retrying
>>>
>>>
>>>> On 30 Oct 2014, at 9:06 am, Lax Kota (lkota) <lk...@cisco.com> wrote:
>>>>
>>>>> I wonder if there is a mismatch between the cluster name in cluster.conf
>>>>> and the cluster name the GFS filesystem was created with.
>>>> How do I check the cluster name of the GFS file system? I had a similar
>>>> configuration running fine in multiple other setups with no such issue.
>>>
>>> I don't really recall. Hopefully someone more familiar with GFS2 can chime
>>> in.
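>>> (If memory serves, something like "tunegfs2 -l /dev/your_vg/your_lv" or
>>> "gfs2_edit -p sb /dev/your_vg/your_lv" prints the superblock, and the lock
>>> table field there has the form clustername:fsname - the first part needs to
>>> match <cluster name="..."> in cluster.conf. Please double-check against the
>>> gfs2-utils man pages; the device path above is just a placeholder.)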
>>>
>>>>
>>>> One more issue I am seeing in another setup is a repeated flood of
>>>> 'A processor joined or left the membership and a new membership was
>>>> formed' messages every 4 secs. I am running with default TOTEM settings
>>>> (token timeout 10 secs). Even after I increase the token and consensus
>>>> values, it keeps flooding the same message at the newly defined consensus
>>>> interval (e.g. if I increase it to 10 secs, I see the new-membership
>>>> messages every 10 secs).
>>>>
>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [TOTEM ] A processor
>>>> joined or left the membership and a new membership was formed.
>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [CPG ] chosen downlist:
>>>> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
>>>> Oct 29 14:58:10 VSM76-VSOM64 corosync[28388]: [MAIN ] Completed service
>>>> synchronization, ready to provide service.
>>>>
>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [TOTEM ] A processor
>>>> joined or left the membership and a new membership was formed.
>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [CPG ] chosen downlist:
>>>> sender r(0) ip(172.28.0.64) ; members(old:2 left:0)
>>>> Oct 29 14:58:14 VSM76-VSOM64 corosync[28388]: [MAIN ] Completed service
>>>> synchronization, ready to provide service.
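>>>> (For reference, when I raised the timeouts I put them in cluster.conf
>>>> roughly like this - I am assuming the <totem> element is the right place
>>>> for them under cman, with values in milliseconds:
>>>>   <totem token="20000" consensus="24000"/>
>>>> Please correct me if they belong somewhere else.)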
>>>
>>> It does not sound like your network is particularly healthy.
>>> Are you using multicast or udpu? If multicast, it might be worth
>>> trying udpu
>>>
>>>>
>>>> Thanks
>>>> Lax
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: linux-cluster-boun...@redhat.com
>>>> [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Andrew
>>>> Beekhof
>>>> Sent: Wednesday, October 29, 2014 2:42 PM
>>>> To: linux clustering
>>>> Subject: Re: [Linux-cluster] daemon cpg_join error retrying
>>>>
>>>>
>>>>> On 30 Oct 2014, at 8:38 am, Lax Kota (lkota) <lk...@cisco.com> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> In one of my setups, I keep getting 'gfs_controld[10744]: daemon
>>>>> cpg_join error retrying'. I have a 2-node setup with pacemaker and
>>>>> corosync.
>>>>
>>>> I wonder if there is a mismatch between the cluster name in cluster.conf
>>>> and the cluster name the GFS filesystem was created with.
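>>>> The output of "cman_tool services" (or "group_tool ls") from both nodes
>>>> might also help - it should show which group the join is hanging in,
>>>> assuming the cman/fenced stack is in use there.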
>>>>
>>>>>
>>>>> Even after I force-kill the pacemaker processes, reboot the server, and
>>>>> bring pacemaker back up, it keeps giving the cpg_join error. Is there
>>>>> any way to fix this issue?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Lax
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster