Re: [ClusterLabs] Antw: Re: Trouble with drbd/pacemaker: switch to secondary/secondary

2016-10-21 Thread Anne Nicolas


On 19/10/2016 at 08:53, Ulrich Windl wrote:
> Ken Gaillot wrote on 18.10.2016 at 17:07 in message
> <9d3b547c-6035-e41d-18ef-9950db01e...@redhat.com>:
>> On 10/14/2016 03:22 PM, Anne Nicolas wrote:
> 
> [...]
>>> The cluster logs are flooded with:
>>> Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice:
>>> attrd_trigger_update:Sending flush op to all hosts for:
>>> master-drbdserv (1)
>>> Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice:
>>> attrd_perform_update:Sent update master-drbdserv=1 failed:
>>> Transport endpoint is not connected
>>
>> This is strange, and the cause of the problem. A master/slave resource
>> agent will try to set node attributes indicating which node should
>> become the master. Here, we see that this is failing -- it appears attrd
>> (Pacemaker's node attribute daemon) is unable to talk to any other daemons.
>>
>> I'm not sure why this would happen, especially if the rest of the
>> daemons do not have a problem talking to each other. But that's where
>> you need to investigate.
> 
> In my limited experience, it's a bad idea to route I/O traffic and cluster
> communication over the same link: we had cases where cluster communication
> (especially when using SCTP) showed errors when traffic was high. Maybe that
> applies here.
> 
>>
>> One thing I would say is that 1.1.8 is really old at this point, which
>> means you're using the "legacy" attrd, which I'm not very familiar with.
> 
> I agree: Even SLES11 SP4 uses old software, but it's at 
> "pacemaker-1.1.12-13.1" at least. Things _really_ got better with later 
> releases.
> 

I finally updated the Pacemaker package to the latest version. Things are
much more responsive and all my problems are gone. Thanks a lot for your
advice. Now I just need to propose some backport packages to my
distribution :)

-- 
Anne Nicolas
http://mageia.org

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Trouble with drbd/pacemaker: switch to secondary/secondary

2016-10-18 Thread Ulrich Windl
Ken Gaillot wrote on 18.10.2016 at 17:07 in message
<9d3b547c-6035-e41d-18ef-9950db01e...@redhat.com>:
> On 10/14/2016 03:22 PM, Anne Nicolas wrote:

[...]
>> The cluster logs are flooded with:
>> Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice:
>> attrd_trigger_update:Sending flush op to all hosts for:
>> master-drbdserv (1)
>> Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice:
>> attrd_perform_update:Sent update master-drbdserv=1 failed:
>> Transport endpoint is not connected
> 
> This is strange, and the cause of the problem. A master/slave resource
> agent will try to set node attributes indicating which node should
> become the master. Here, we see that this is failing -- it appears attrd
> (Pacemaker's node attribute daemon) is unable to talk to any other daemons.
> 
> I'm not sure why this would happen, especially if the rest of the
> daemons do not have a problem talking to each other. But that's where
> you need to investigate.

In my limited experience, it's a bad idea to route I/O traffic and cluster
communication over the same link: we had cases where cluster communication
(especially when using SCTP) showed errors when traffic was high. Maybe that
applies here.
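
To check whether the failed attrd updates cluster around periods of heavy
traffic, one could tally them per minute and compare the result with I/O
statistics. The following is only a minimal Python sketch. It assumes each
log entry sits on a single line in the real log file (the excerpt quoted above
was wrapped by the mail client) and that entries follow the format shown
there. The function name and the sample list are hypothetical.

```python
import re
from collections import Counter

# Matches the failure entry quoted in this thread, assuming one entry per line.
# Captures the timestamp truncated to the minute, the attribute name, and the
# error text that follows "failed:".
FAILED_UPDATE = re.compile(
    r"^(?P<minute>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}):\d{2}\s.*"
    r"attrd_perform_update:\s*Sent update (?P<attr>[^=\s]+)=\S+ failed: (?P<error>.+)$"
)


def failed_updates_per_minute(lines):
    """Count failed attrd updates, keyed by (minute, attribute name)."""
    counts = Counter()
    for line in lines:
        match = FAILED_UPDATE.search(line.rstrip("\n"))
        if match:
            counts[(match["minute"], match["attr"])] += 1
    return counts


# Sample entries rebuilt from the excerpt in this thread (unwrapped).
sample = [
    "Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice: "
    "attrd_trigger_update:Sending flush op to all hosts for: master-drbdserv (1)",
    "Oct 14 17:42:28 [3445] bzvairsvr  attrd:   notice: "
    "attrd_perform_update:Sent update master-drbdserv=1 failed: "
    "Transport endpoint is not connected",
]

for (minute, attr), n in sorted(failed_updates_per_minute(sample).items()):
    print(f"{minute}  {attr}: {n} failed update(s)")
```

In practice the list would be replaced by an open log file object. Minutes
with unusually high counts are the ones worth comparing against replication
or other I/O load on the shared link.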

> 
> One thing I would say is that 1.1.8 is really old at this point, which
> means you're using the "legacy" attrd, which I'm not very familiar with.

I agree: Even SLES11 SP4 uses old software, but it's at "pacemaker-1.1.12-13.1" 
at least. Things _really_ got better with later releases.

[...]

Regards,
Ulrich
