Hi All,

Today I tested the two suggestions you gave me. Here is what I did in the 
script where I create my 5-machine cluster (three nodes for the Pacemaker 
PostgreSQL cluster and two nodes for the GlusterFS volume we use for database 
backups and WAL files).

FIRST TEST
——————————
I added --shm-size=512m to the "docker create" command. I noticed that as soon 
as I start the container the shm size is 512m, so I didn't need to add the 
entry in /etc/fstab. However, I did it anyway:

tmpfs      /dev/shm      tmpfs   defaults,size=512m   0   0

and then
mount -o remount /dev/shm
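For reference, the whole first-test setup looked roughly like this (the
container name and image here are placeholders, not the real ones from my
script):

```shell
# Create the container with a 512 MB /dev/shm (name/image are placeholders)
docker create --shm-size=512m --name pg-node1 my-cluster-image
docker start pg-node1

# Verify the tmpfs size from inside the container;
# the Size column for /dev/shm should read 512M
docker exec pg-node1 df -h /dev/shm
```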

Then I uninstalled all the pieces of software (crmsh, resource-agents, 
corosync and pacemaker) and installed the new versions.
I started corosync and pacemaker, but the same problem occurred.

SECOND TEST
———————————
stopped corosync and pacemaker
uninstalled corosync
built corosync with --enable-small-memory-footprint and installed it
started corosync and pacemaker
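The rebuild itself was nothing special; roughly this (the version and paths
are from my setup, so treat them as an example):

```shell
# Rebuild corosync with the reduced-memory option
cd corosync-2.4.4
./autogen.sh                                 # only needed for a git checkout
./configure --enable-small-memory-footprint
make
make install                                 # as root
```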

IT WORKED.

I would now like to understand why it didn't work in the first test but did in 
the second. Which kind of memory is being used too much here? /dev/shm does 
not seem to be the problem: I allocated 512m in all three Docker containers 
(obviously all on my single Mac) and enabled the container option as you 
suggested. Am I missing something here?
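In case it helps the diagnosis, here is what I would check next time, based on
Jan's explanation below that since the mlockall() change all of corosync's
memory counts immediately (the qb-* ring buffer files in /dev/shm are my
assumption about where libqb puts them):

```shell
# Inside the container, while corosync is running:
df -h /dev/shm                                # overall tmpfs usage
ls -lh /dev/shm                               # libqb ring buffers (qb-* files)

# How much memory corosync has actually locked, and what the limit is:
grep VmLck /proc/$(pidof corosync)/status
grep 'locked memory' /proc/$(pidof corosync)/limits
```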

For the moment I only want to use Docker for test purposes, so using 
--enable-small-memory-footprint could be OK, but is there something I can do 
to have corosync working even without this option?
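One thing I could try myself (untested, just following the locked-memory
theory from this thread) is to lift the container's locked-memory limit when
creating it, in addition to the bigger /dev/shm:

```shell
# Hypothetical: unlimited memlock for the container (name/image placeholders)
docker create --shm-size=512m --ulimit memlock=-1:-1 \
    --name pg-node1 my-cluster-image
```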


The reason I am asking is that, in the future, we may deploy our production 
cluster in a containerised way (for the moment it is just an idea). This would 
save a lot of time in developing, maintaining and deploying our patch system. 
All prerequisites and dependencies would be enclosed in the container, and if 
the IT team does some maintenance on the bare metal (e.g. installing new 
dependencies) it would not affect our containers. I do not see many 
performance drawbacks in using containers. The point is to understand whether 
a containerised approach could save us a lot of headaches in maintaining this 
cluster without affecting performance too much. I notice this approach in a 
lot of contexts in Cloud environments.


> On 2 Jul 2018, at 08:54, Christine Caulfield <ccaul...@redhat.com> wrote:
> 
> On 29/06/18 17:20, Jan Pokorný wrote:
>> On 29/06/18 10:00 +0100, Christine Caulfield wrote:
>>> On 27/06/18 08:35, Salvatore D'angelo wrote:
>>>> One thing that I do not understand is that I tried to compare corosync
>>>> 2.3.5 (the old version that worked fine) and 2.4.4 to understand
>>>> differences but I haven’t found anything related to the piece of code
>>>> that affects the issue. The quorum tool.c and cfg.c are almost the same.
>>>> Probably the issue is somewhere else.
>>>> 
>>> 
>>> This might be asking a bit much, but would it be possible to try this
>>> using Virtual Machines rather than Docker images? That would at least
>>> eliminate a lot of complex variables.
>> 
>> Salvatore, you can ignore the part below, try following the "--shm"
>> advice in other part of this thread.  Also the previous suggestion
>> to compile corosync with --small-memory-footprint may be of help,
>> but comes with other costs (expect lower throughput).
>> 
>> 
>> Chrissie, I have a plausible explanation and if it's true, then the
>> same will be reproduced wherever /dev/shm is small enough.
>> 
>> If I am right, then the offending commit is
>> https://github.com/corosync/corosync/commit/238e2e62d8b960e7c10bfa0a8281d78ec99f3a26
>> (present since 2.4.3), and while it arranges things for the better
>> in the context of prioritized, low jitter process, it all of
>> a sudden prevents as-you-need memory acquisition from the system,
>> meaning that the memory consumption constraints are checked immediately
>> when the memory is claimed (as it must fit into dedicated physical
>> memory in full).  Hence this impact we likely never realized may
>> be perceived as a sort of a regression.
>> 
>> Since we can calculate the approximate requirements statically, might
>> be worthy to add something like README.requirements, detailing how much
>> space will be occupied for typical configurations at minimum, e.g.:
>> 
>> - standard + --small-memory-footprint configuration
>> - 2 + 3 + X nodes (5?)
>> - without any service on top + teamed with qnetd + teamed with
>>  pacemaker atop (including just IPC channels between pacemaker
>>  daemons and corosync's CPG service, indeed)
>> 
> 
> That is a possible explanation I suppose, yes. It's not something we can
> sensibly revert because it was already fixing another regression!
> 
> 
> I like the idea of documenting the /dev/shm requirements - that would
> certainly help with other people using containers - Salvatore mentioned
> earlier that there was nothing to guide him about the size needed. I'll
> raise an issue in github to cover it. Your input on how to do it for
> containers would also be helpful.
> 
> Chrissie
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
