I updated libqb to 1.0.3, but the issue is the same. I know corosync also depends on nspr and nss3, so I updated them with apt-get install; these are the versions now installed:

libnspr4, libnspr4-dev 2:4.13.1-0ubuntu0.14.04.1
libnss3, libnss3-dev, libnss3-nssdb 2:3.28.4-0ubuntu0.14.04.3

but the problem is the same. I am working on an Ubuntu 14.04 image and I know the packages there can be quite old. Are there newer versions of these libraries? Where can I download them? I tried searching on Google, but the results were quite confusing.
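For libqb (and for corosync itself) the newer releases are not in the Trusty archive, so the usual route is building from the upstream ClusterLabs sources. A minimal sketch, assuming the GitHub repository and a standard autotools build; the tag and install prefix are only examples:

    apt-get install -y git build-essential autoconf automake libtool pkg-config
    git clone https://github.com/ClusterLabs/libqb.git
    cd libqb && git checkout v1.0.3
    ./autogen.sh && ./configure --prefix=/usr
    make && make install && ldconfig

After replacing libqb this way, corosync should be rebuilt against it so that both agree on the library version.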
> On 26 Jun 2018, at 12:27, Christine Caulfield <ccaul...@redhat.com> wrote:
>
> On 26/06/18 11:24, Salvatore D'angelo wrote:
>> Hi,
>>
>> I have tried with:
>> 0.16.0.real-1ubuntu4
>> 0.16.0.real-1ubuntu5
>>
>> which version should I try?
>
> Hmm both of those are actually quite old! maybe a newer one?
>
> Chrissie
>
>>> On 26 Jun 2018, at 12:03, Christine Caulfield <ccaul...@redhat.com> wrote:
>>>
>>> On 26/06/18 11:00, Salvatore D'angelo wrote:
>>>> Consider that the container is the same when corosync 2.3.5 run.
>>>> If it is something related to the container probably the 2.4.4
>>>> introduced a feature that has an impact on container.
>>>> Should be something related to libqb according to the code.
>>>> Anyone can help?
>>>
>>> Have you tried downgrading libqb to the previous version to see if it
>>> still happens?
>>>
>>> Chrissie
>>>
>>>>> On 26 Jun 2018, at 11:56, Christine Caulfield <ccaul...@redhat.com> wrote:
>>>>>
>>>>> On 26/06/18 10:35, Salvatore D'angelo wrote:
>>>>>> Sorry after the command:
>>>>>>
>>>>>> corosync-quorumtool -ps
>>>>>>
>>>>>> the error in log are still visible. Looking at the source code it seems
>>>>>> problem is at this line:
>>>>>> https://github.com/corosync/corosync/blob/master/tools/corosync-quorumtool.c
>>>>>>
>>>>>> if (quorum_initialize(&q_handle, &q_callbacks, &q_type) != CS_OK) {
>>>>>>     fprintf(stderr, "Cannot initialize QUORUM service\n");
>>>>>>     q_handle = 0;
>>>>>>     goto out;
>>>>>> }
>>>>>>
>>>>>> if (corosync_cfg_initialize(&c_handle, &c_callbacks) != CS_OK) {
>>>>>>     fprintf(stderr, "Cannot initialise CFG service\n");
>>>>>>     c_handle = 0;
>>>>>>     goto out;
>>>>>> }
>>>>>>
>>>>>> The quorum_initialize function is defined here:
>>>>>> https://github.com/corosync/corosync/blob/master/lib/quorum.c
>>>>>>
>>>>>> It seems interacts with libqb to allocate space on /dev/shm but
>>>>>> something fails. I tried to update the libqb with apt-get install but no
>>>>>> success.
>>>>>>
>>>>>> The same for second function:
>>>>>> https://github.com/corosync/corosync/blob/master/lib/cfg.c
>>>>>>
>>>>>> Now I am not an expert of libqb. I have the version 0.16.0.real-1ubuntu5.
>>>>>>
>>>>>> The folder /dev/shm has 777 permission like other nodes with older
>>>>>> corosync and pacemaker that work fine. The only difference is that I
>>>>>> only see files created by root, no one created by hacluster like other
>>>>>> two nodes (probably because pacemaker didn’t start correctly).
>>>>>>
>>>>>> This is the analysis I have done so far.
>>>>>> Any suggestion?
>>>>>
>>>>> Hmm. It seems very likely something to do with the way the container is
>>>>> set up then - and I know nothing about containers. Sorry :/
>>>>>
>>>>> Can anyone else help here?
>>>>>
>>>>> Chrissie
>>>>>
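To make the comparison with the two working nodes concrete, the /dev/shm state of each container can be captured with standard tools and compared side by side (the node names pg1/pg2/pg3 are only examples from this thread):

    df -h /dev/shm
    mount | grep '/dev/shm'
    ls -l /dev/shm | grep 'qb-'

The size and mount options show how each container provisions /dev/shm, and the file listing shows which ring buffers exist and whether they are owned by root or by hacluster.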
>>>>>>> On 26 Jun 2018, at 11:03, Salvatore D'angelo <sasadang...@gmail.com> wrote:
>>>>>>>
>>>>>>> Yes, sorry you’re right I could find it by myself.
>>>>>>> However, I did the following:
>>>>>>>
>>>>>>> 1. Added the line you suggested to /etc/fstab
>>>>>>> 2. mount -o remount /dev/shm
>>>>>>> 3. Now I correctly see /dev/shm of 512M with df -h
>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>> overlay          63G   11G   49G  19% /
>>>>>>> tmpfs            64M  4.0K   64M   1% /dev
>>>>>>> tmpfs          1000M     0 1000M   0% /sys/fs/cgroup
>>>>>>> osxfs           466G  158G  305G  35% /Users
>>>>>>> /dev/sda1        63G   11G   49G  19% /etc/hosts
>>>>>>> *shm             512M   15M  498M   3% /dev/shm*
>>>>>>> tmpfs          1000M     0 1000M   0% /sys/firmware
>>>>>>> tmpfs           128M     0  128M   0% /tmp
>>>>>>>
>>>>>>> The errors in log went away. Consider that I remove the log file
>>>>>>> before start corosync so it does not contains lines of previous
>>>>>>> executions.
>>>>>>> <corosync.log>
>>>>>>>
>>>>>>> But the command:
>>>>>>> corosync-quorumtool -ps
>>>>>>>
>>>>>>> still give:
>>>>>>> Cannot initialize QUORUM service
>>>>>>>
>>>>>>> Consider that few minutes before it gave me the message:
>>>>>>> Cannot initialize CFG service
>>>>>>>
>>>>>>> I do not know the differences between CFG and QUORUM in this case.
>>>>>>>
>>>>>>> If I try to start pacemaker the service is OK but I see only pacemaker
>>>>>>> and the Transport does not work if I try to run a cam command.
>>>>>>> Any suggestion?
>>>>>>>
>>>>>>>> On 26 Jun 2018, at 10:49, Christine Caulfield <ccaul...@redhat.com> wrote:
>>>>>>>>
>>>>>>>> On 26/06/18 09:40, Salvatore D'angelo wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Yes,
>>>>>>>>>
>>>>>>>>> I am reproducing only the required part for test. I think the original
>>>>>>>>> system has a larger shm. The problem is that I do not know exactly how
>>>>>>>>> to change it.
>>>>>>>>> I tried the following steps, but I have the impression I didn’t
>>>>>>>>> performed the right one:
>>>>>>>>>
>>>>>>>>> 1. remove everything under /tmp
>>>>>>>>> 2. Added the following line to /etc/fstab
>>>>>>>>>    tmpfs   /tmp   tmpfs   defaults,nodev,nosuid,mode=1777,size=128M   0   0
>>>>>>>>> 3. mount /tmp
>>>>>>>>> 4. df -h
>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>> overlay          63G   11G   49G  19% /
>>>>>>>>> tmpfs            64M  4.0K   64M   1% /dev
>>>>>>>>> tmpfs          1000M     0 1000M   0% /sys/fs/cgroup
>>>>>>>>> osxfs           466G  158G  305G  35% /Users
>>>>>>>>> /dev/sda1        63G   11G   49G  19% /etc/hosts
>>>>>>>>> shm              64M   11M   54M  16% /dev/shm
>>>>>>>>> tmpfs          1000M     0 1000M   0% /sys/firmware
>>>>>>>>> *tmpfs           128M     0  128M   0% /tmp*
>>>>>>>>>
>>>>>>>>> The errors are exactly the same.
>>>>>>>>> I have the impression that I changed the wrong parameter. Probably I
>>>>>>>>> have to change:
>>>>>>>>> shm              64M   11M   54M  16% /dev/shm
>>>>>>>>>
>>>>>>>>> but I do not know how to do that. Any suggestion?
>>>>>>>>
>>>>>>>> According to google, you just add a new line to /etc/fstab for /dev/shm
>>>>>>>>
>>>>>>>> tmpfs /dev/shm tmpfs defaults,size=512m 0 0
>>>>>>>>
>>>>>>>> Chrissie
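Since the nodes here are Docker containers, the same effect can be had without remounting inside the container by sizing /dev/shm when the container is created. A minimal sketch, assuming plain docker run (the image and container names are placeholders):

    # the default /dev/shm inside a container is 64M; set a larger one at creation time
    docker run -d --name pg3 --shm-size=512m my-cluster-image

docker-compose supports the same setting through the per-service shm_size option, which avoids having to edit /etc/fstab in the container at all.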
>>>>>>>>>> On 26 Jun 2018, at 09:48, Christine Caulfield <ccaul...@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On 25/06/18 20:41, Salvatore D'angelo wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Let me add here one important detail. I use Docker for my test with 5
>>>>>>>>>>> containers deployed on my Mac.
>>>>>>>>>>> Basically the team that worked on this project installed the cluster
>>>>>>>>>>> on soft layer bare metal.
>>>>>>>>>>> The PostgreSQL cluster was hard to test and if a misconfiguration
>>>>>>>>>>> occurred recreate the cluster from scratch is not easy.
>>>>>>>>>>> Test it was a cumbersome if you consider that we access to the
>>>>>>>>>>> machines with a complex system hard to describe here.
>>>>>>>>>>> For this reason I ported the cluster on Docker for test purpose. I am
>>>>>>>>>>> not interested to have it working for months, I just need a proof of
>>>>>>>>>>> concept.
>>>>>>>>>>>
>>>>>>>>>>> When the migration works I’ll port everything on bare metal where the
>>>>>>>>>>> size of resources are ambundant.
>>>>>>>>>>>
>>>>>>>>>>> Now I have enough RAM and disk space on my Mac so if you tell me what
>>>>>>>>>>> should be an acceptable size for several days of running it is ok
>>>>>>>>>>> for me.
>>>>>>>>>>> It is ok also have commands to clean the shm when required.
>>>>>>>>>>> I know I can find them on Google but if you can suggest me these info
>>>>>>>>>>> I’ll appreciate. I have OS knowledge to do that but I would like to
>>>>>>>>>>> avoid days of guesswork and try and error if possible.
>>>>>>>>>>
>>>>>>>>>> I would recommend at least 128MB of space on /dev/shm, 256MB if you can
>>>>>>>>>> spare it. My 'standard' system uses 75MB under normal running allowing
>>>>>>>>>> for one command-line query to run.
>>>>>>>>>>
>>>>>>>>>> If I read this right then you're reproducing a bare-metal system in
>>>>>>>>>> containers now? so the original systems will have a default /dev/shm
>>>>>>>>>> size which is probably much larger than your containers?
>>>>>>>>>>
>>>>>>>>>> I'm just checking here that we don't have a regression in memory usage
>>>>>>>>>> as Poki suggested.
>>>>>>>>>>
>>>>>>>>>> Chrissie
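For reference, the actual shared-memory footprint of corosync and pacemaker on a node can be checked against those numbers with standard tools, for example:

    df -h /dev/shm
    du -ch /dev/shm/qb-* 2>/dev/null | tail -n 1

The qb-* files are the libqb ring buffers; their total is what needs to fit, with headroom, in whatever /dev/shm size the container is given.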
>>>>>>>>>>>> On 25 Jun 2018, at 21:18, Jan Pokorný <jpoko...@redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 25/06/18 19:06 +0200, Salvatore D'angelo wrote:
>>>>>>>>>>>>> Thanks for reply. I scratched my cluster and created it again and
>>>>>>>>>>>>> then migrated as before. This time I uninstalled pacemaker,
>>>>>>>>>>>>> corosync, crmsh and resource agents with make uninstall
>>>>>>>>>>>>>
>>>>>>>>>>>>> then I installed new packages. The problem is the same, when I launch:
>>>>>>>>>>>>> corosync-quorumtool -ps
>>>>>>>>>>>>>
>>>>>>>>>>>>> I got: Cannot initialize QUORUM service
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here the log with debug enabled:
>>>>>>>>>>>>>
>>>>>>>>>>>>> [18019] pg3 corosyncerror   [QB    ] couldn't create circular mmap
>>>>>>>>>>>>> on /dev/shm/qb-cfg-event-18020-18028-23-data
>>>>>>>>>>>>> [18019] pg3 corosyncerror   [QB    ] qb_rb_open:cfg-event-18020-18028-23:
>>>>>>>>>>>>> Resource temporarily unavailable (11)
>>>>>>>>>>>>> [18019] pg3 corosyncdebug   [QB    ] Free'ing ringbuffer:
>>>>>>>>>>>>> /dev/shm/qb-cfg-request-18020-18028-23-header
>>>>>>>>>>>>> [18019] pg3 corosyncdebug   [QB    ] Free'ing ringbuffer:
>>>>>>>>>>>>> /dev/shm/qb-cfg-response-18020-18028-23-header
>>>>>>>>>>>>> [18019] pg3 corosyncerror   [QB    ] shm connection FAILED:
>>>>>>>>>>>>> Resource temporarily unavailable (11)
>>>>>>>>>>>>> [18019] pg3 corosyncerror   [QB    ] Error in connection setup
>>>>>>>>>>>>> (18020-18028-23): Resource temporarily unavailable (11)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried to check /dev/shm and I am not sure these are the right
>>>>>>>>>>>>> commands, however:
>>>>>>>>>>>>>
>>>>>>>>>>>>> df -h /dev/shm
>>>>>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>>>>>> shm              64M   16M   49M  24% /dev/shm
>>>>>>>>>>>>>
>>>>>>>>>>>>> ls /dev/shm
>>>>>>>>>>>>> qb-cmap-request-18020-18036-25-data    qb-corosync-blackbox-data
>>>>>>>>>>>>> qb-quorum-request-18020-18095-32-data
>>>>>>>>>>>>> qb-cmap-request-18020-18036-25-header  qb-corosync-blackbox-header
>>>>>>>>>>>>> qb-quorum-request-18020-18095-32-header
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is 64 Mb enough for /dev/shm. If no, why it worked with previous
>>>>>>>>>>>>> corosync release?
>>>>>>>>>>>>
>>>>>>>>>>>> For a start, can you try configuring corosync with
>>>>>>>>>>>> --enable-small-memory-footprint switch?
>>>>>>>>>>>>
>>>>>>>>>>>> Hard to say why the space provisioned to /dev/shm is the direct
>>>>>>>>>>>> opposite of generous (per today's standards), but may be the result
>>>>>>>>>>>> of automatic HW adaptation, and if RAM is so scarce in your case,
>>>>>>>>>>>> the above build-time toggle might help.
>>>>>>>>>>>>
>>>>>>>>>>>> If not, then exponentially increasing size of /dev/shm space is
>>>>>>>>>>>> likely your best bet (I don't recommended fiddling with mlockall()
>>>>>>>>>>>> and similar measures in corosync).
>>>>>>>>>>>>
>>>>>>>>>>>> Of course, feel free to raise a regression if you have a reproducible
>>>>>>>>>>>> comparison between two corosync (plus possibly different libraries
>>>>>>>>>>>> like libqb) versions, one that works and one that won't, in
>>>>>>>>>>>> reproducible conditions (like this small /dev/shm, VM image, etc.).
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Jan (Poki)
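To try the build-time toggle Poki mentions, corosync has to be reconfigured and rebuilt from source; a minimal sketch, assuming the same source tree used for the 2.4.4 build:

    cd corosync
    ./autogen.sh
    ./configure --enable-small-memory-footprint
    make && make install

The switch shrinks the IPC buffers libqb allocates in /dev/shm at the cost of smaller message queues, so it is mainly a workaround for when /dev/shm cannot simply be enlarged.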
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org