[ClusterLabs] Problems with pacemaker

2017-11-17 Thread Salvatore D'Angelo
happened that when I start nodes in step 9 the slaves start as well. Is this possible? In which case could this happen?

[ClusterLabs] How to declare ping primitive with rule

2018-06-08 Thread Salvatore D'angelo
Hi All, I have a PostgreSQL cluster on three nodes (Master/Sync/Async) with WAL files stored on two GlusterFS nodes. In total, 5 machines. Let's call the first three machines: pg1, pg2, pg3. The other two (without pacemaker): pgalog1, pgalog2. Now I this code works fine on some bare metal

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Salvatore D'angelo
at 17:41, Jehan-Guillaume de Rorthais wrote: > > On Tue, 29 May 2018 14:23:31 +0200 > Salvatore D'angelo wrote: > ... >> 2. I read some documentation about upgrade and since we want 0 ms downtime I >> think the Rolling Upgrade (node by node) is the better approac

Re: [ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-30 Thread Salvatore D'angelo
goes down? Thanks again for support. > On 30 May 2018, at 04:04, Ken Gaillot wrote: > > On Tue, 2018-05-29 at 22:25 +0200, Salvatore D'angelo wrote: >> Hi, >> >> Regarding last question about pacemaker dependencies for Ubuntu I >> found this for 1.1.18: >> htt

[ClusterLabs] Pacemaker PostgreSQL cluster

2018-05-29 Thread Salvatore D'angelo
Hi All, I am new to this list. I am working on a project that uses a cluster composed of 3 nodes (with Ubuntu 14.04 trusty) on which we run PostgreSQL managed as Master/slaves. We use Pacemaker/Corosync to manage this cluster. In addition, we have a two node GlusterFS where we store backups

Re: [ClusterLabs] Upgrade corosync problem

2018-06-29 Thread Salvatore D'angelo
Good to know. I'll try it. I'll try to work on VM too. On Fri 29 Jun 2018, 5:46 PM Jan Pokorný wrote: > On 26/06/18 11:03 +0200, Salvatore D'angelo wrote: > > Yes, sorry you’re right I could find it by myself. > > However, I did the following: > > > > 1. A

Re: [ClusterLabs] Antw: Re: Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
Hi again, I did another test. I modified docker container in order to be able to run strace. Running strace corosync-quorumtool -ps I got the following: corosync-quorumtool-strace.log Description: Binary data I tried to understand what happen behind the scene but it is not easy for me. Hoping

Re: [ClusterLabs] Upgrade corosync problem

2018-06-25 Thread Salvatore D'angelo
with previous corosync release? On 25 Jun 2018, at 09:09, Christine Caulfield <ccaul...@redhat.com> wrote: On 22/06/18 11:23, Salvatore D'angelo wrote: Hi, Here the log: [17323] pg1 corosyncerror   [QB    ] couldn't create circular mmap on /dev/shm/qb-cfg-event-17324-17334-23-data [17323] pg1 corosyn

Re: [ClusterLabs] Upgrade corosync problem

2018-06-25 Thread Salvatore D'angelo
required. I know I can find them on Google but if you could point me to this information I’d appreciate it. I have the OS knowledge to do that but I would like to avoid days of guesswork and trial and error if possible. > On 25 Jun 2018, at 21:18, Jan Pokorný wrote: > > On 25/06/18 19:06 +0200,

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
:49, Christine Caulfield <ccaul...@redhat.com> wrote: On 26/06/18 09:40, Salvatore D'angelo wrote: Hi, Yes, I am reproducing only the required part for test. I think the original system has a larger shm. The problem is that I do not know exactly how to change it. I tried the following steps, but

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
cluster like other two nodes (probably because pacemaker didn’t start correctly). This is the analysis I have done so far. Any suggestion? > On 26 Jun 2018, at 11:03, Salvatore D'angelo wrote: > > Yes, sorry you’re right I could find it by myself. > However, I did the following: >

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
have the impression that I changed the wrong parameter. Probably I have to change: shm 64M 11M 54M 16% /dev/shm but I do not know how to do that. Any suggestion? > On 26 Jun 2018, at 09:48, Christine Caulfield wrote: > > On 25/06/18 20:41, Salvatore D'angelo wro
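A minimal sketch, not taken from the thread, of how the size and usage of a tmpfs mount such as /dev/shm could be checked from Python with the standard library's shutil.disk_usage. The function name shm_usage and its default path are illustrative assumptions. In a Docker container the size of /dev/shm is normally fixed when the container is created, for example with the --shm-size option of docker run.

```python
import shutil


def shm_usage(path="/dev/shm"):
    """Return (total_mib, used_mib, percent_used) for the filesystem at path.

    shm_usage is a hypothetical helper name; path defaults to /dev/shm,
    the mount shown in the df output quoted above.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    mib = 1024 * 1024
    # Guard against a zero-sized filesystem to avoid division by zero.
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.total / mib, usage.used / mib, percent
```

Calling shm_usage() inside the container and comparing the total against the df line above would show whether a changed setting actually took effect.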

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
ine Caulfield wrote: > > On 26/06/18 10:35, Salvatore D'angelo wrote: >> Sorry after the command: >> >> corosync-quorumtool -ps >> >> the errors in the log are still visible. Looking at the source code it seems >> the problem is at this line: >> htt

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
corosync 2.3.5 and libqb 0.16.0 > On 26 Jun 2018, at 14:08, Christine Caulfield wrote: > > On 26/06/18 12:16, Salvatore D'angelo wrote: >> libqb update to 1.0.3 but same issue. >> >> I know corosync has also these dependencies nspr and nss3. I updated >>

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
e them. Am I correct? libcfg6 libcmap4 libcpg4 libquorum5 libsam4 libtotem-pg5 libvotequorum8 Can you tell me where these libraries come from and if I need them? > On 26 Jun 2018, at 14:08, Christine Caulfield wrote: > > On 26/06/18 12:16, Salvatore D'angelo wrote: >> libqb upd

Re: [ClusterLabs] Upgrade corosync problem

2018-06-27 Thread Salvatore D'angelo
nd cfg.c are almost the same. Probably the issue is somewhere else. > On 27 Jun 2018, at 08:34, Jan Pokorný wrote: > > On 26/06/18 17:56 +0200, Salvatore D'angelo wrote: >> I did another test. I modified docker container in order to be able to run >> strace. >> Runni

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
Hi, I have tried with: 0.16.0.real-1ubuntu4 0.16.0.real-1ubuntu5 Which version should I try? > On 26 Jun 2018, at 12:03, Christine Caulfield wrote: > > On 26/06/18 11:00, Salvatore D'angelo wrote: >> Consider that the container is the same as when corosync 2.3.5 ran. >>

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Salvatore D'angelo
ote: > > On 26/06/18 11:24, Salvatore D'angelo wrote: >> Hi, >> >> I have tried with: >> 0.16.0.real-1ubuntu4 >> 0.16.0.real-1ubuntu5 >> >> which version should I try? > > > Hmm both of those are actually quite old! maybe a newer one?

Re: [ClusterLabs] Upgrade corosync problem

2018-06-30 Thread Salvatore D'angelo
:00 +0100, Christine Caulfield wrote: >> On 27/06/18 08:35, Salvatore D'angelo wrote: >>> One thing that I do not understand is that I tried to compare corosync >>> 2.3.5 (the old version that worked fine) and 2.4.4 to understand >>> differences but I haven’t f

Re: [ClusterLabs] Upgrade corosync problem

2018-07-02 Thread Salvatore D'angelo
>> On 27/06/18 08:35, Salvatore D'angelo wrote: >>>> One thing that I do not understand is that I tried to compare corosync >>>> 2.3.5 (the old version that worked fine) and 2.4.4 to understand >>>> differences but I haven’t found anything related to the piece of c

Re: [ClusterLabs] Resource agents differences from 1.1.14 and 1.1.18

2018-06-21 Thread Salvatore D'angelo
Hi, thanks for the reply > On 21 Jun 2018, at 15:09, Jan Pokorný wrote: > > Hello Salvatore, > > On 21/06/18 12:44 +0200, Salvatore D'angelo wrote: >> I am trying to upgrade my PostgreSQL cluster managed by pacemaker >> to pacemaker 1.1.18 or 2.0.0. I have some resour

[ClusterLabs] Resource agents differences from 1.1.14 and 1.1.18

2018-06-21 Thread Salvatore D'angelo
Hi all, I am trying to upgrade my PostgreSQL cluster managed by pacemaker to pacemaker 1.1.18 or 2.0.0. I have some resource agents that I patched to have them working with my cluster. Can someone tell me whether anything has changed in the OCF interface between the 1.1.14 release and 1.1.18/2.0.0? I

[ClusterLabs] Upgrade corosync problem

2018-06-21 Thread Salvatore D'angelo
Hi, I upgraded my PostgreSQL/Pacemaker cluster with these versions. Pacemaker 1.1.14 -> 1.1.18 Corosync 2.3.5 -> 2.4.4 Crmsh 2.2.0 -> 3.0.1 Resource agents 3.9.7 -> 4.1.1 I started with the first node (I am upgrading one node at a time). On a PostgreSQL slave node I did: crm node standby

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Salvatore D'angelo
ring1_addr: pg3p nodeid: 3 } } logging { to_syslog: yes } > On 22 Jun 2018, at 09:24, Christine Caulfield wrote: > > On 21/06/18 16:16, Salvatore D'angelo wrote: >> Hi, >> >> I upgraded my PostgreSQL/Pacemaker cluster with these ve

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Salvatore D'angelo
if this is the right approach. Should I “make uninstall” old versions before installing the new one? Which is the suggested approach? Thanks in advance for your help. > On 22 Jun 2018, at 11:30, Christine Caulfield wrote: > > On 22/06/18 10:14, Salvatore D'angelo wrote: >> Hi Christine, >

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Salvatore D'angelo
Hi, Here is the log: corosync.log Description: Binary data > On 22 Jun 2018, at 12:10, Christine Caulfield wrote: > > On 22/06/18 10:39, Salvatore D'angelo wrote: >> Hi, >> >> Can you tell me exactly which log you need? I’ll provide it to you as soon as >> possibl

Re: [ClusterLabs] crm --version shows "cam dev"

2018-07-05 Thread Salvatore D'angelo
and then installed this one. I am sure uninstall removed everything related to crmsh because I ran a “find / -name crmsh*” on the container. > On 4 Jul 2018, at 21:32, Kristoffer Grönlund wrote: > > On Wed, 2018-07-04 at 17:52 +0200, Salvatore D'angelo wrote: >> Hi, >> >> With cras

[ClusterLabs] Found libqb issue that affects pacemaker 1.1.18

2018-07-05 Thread Salvatore D'angelo
Hi, I tried to build libqb 1.0.3 on a fresh machine and then corosync 2.4.4 and pacemaker 1.1.18. I found the following bug and filed it against the libqb GitHub repository: https://github.com/ClusterLabs/libqb/issues/312 For the moment I fixed it manually on my

[ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Salvatore D'angelo
Hi All, After I successfully upgraded Pacemaker from 1.1.14 to 1.1.18 and corosync from 2.3.5 to 2.4.4 on Ubuntu 14.04 I am trying to repeat the same scenario on Ubuntu 16.04. As in my previous scenario I am using Docker for testing purposes before moving to bare metal. The scenario worked properly

Re: [ClusterLabs] STONITH resources on wrong nodes

2018-07-11 Thread Salvatore D'angelo
> 2018-07-11 18:44 GMT+02:00 Salvatore D'angelo <mailto:sasadang...@gmail.com>>: > Hi all, > > in my cluster doing crm_mon -1ARrf I noticed my STONITH resources are not > correctly located: > p_ston_pg1 (stonith:external/ipmi): Started pg2 > p_ston_pg2

Re: [ClusterLabs] STONITH resources on wrong nodes

2018-07-11 Thread Salvatore D'angelo
Does this mean that even if the STONITH resource p_ston_pg1 runs on node pg2, when pacemaker sends a signal to it, pg1 is powered off and not pg2? Am I correct? > On 11 Jul 2018, at 19:10, Andrei Borzenkov wrote: > > 11.07.2018 19:44, Salvatore D'angelo wrote: >> Hi all, >> &

Re: [ClusterLabs] STONITH resources on wrong nodes

2018-07-11 Thread Salvatore D'angelo
Thank you. It's clear now. On Wed 11 Jul 2018, 7:18 PM Andrei Borzenkov wrote: > 11.07.2018 20:12, Salvatore D'angelo wrote: > > Does this mean that even if the STONITH resource p_ston_pg1 runs on > node pg2, when pacemaker sends a signal to it, pg1 is powered off and not pg2? >

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Salvatore D'angelo
distributions? If you look at the corosync code, the init/corosync file does not contain run levels in its header. So I suspect it is a code problem. Am I wrong? > On 11 Jul 2018, at 19:29, Ken Gaillot wrote: > > On Wed, 2018-07-11 at 18:43 +0200, Salvatore D'angelo wrote: >>

[ClusterLabs] STONITH resources on wrong nodes

2018-07-11 Thread Salvatore D'angelo
Hi all, in my cluster doing crm_mon -1ARrf I noticed my STONITH resources are not correctly located: p_ston_pg1 (stonith:external/ipmi): Started pg2 p_ston_pg2 (stonith:external/ipmi): Started pg1 p_ston_pg3 (stonith:external/ipmi): Started pg1 I have three
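A minimal sketch, not taken from the thread, of how the status lines above could be parsed to see which node each STONITH resource runs on. It assumes that the node to be fenced is encoded in the resource name after a p_ston_ prefix (a naming convention inferred from this listing), and the helper names are hypothetical. To my understanding, the node where a fence device shows as Started is where Pacemaker monitors it, not the node it fences, so a device running away from its own target, as here, is normally what you want.

```python
import re

# Sample lines in the shape reported in the message above.
SAMPLE = """\
p_ston_pg1 (stonith:external/ipmi): Started pg2
p_ston_pg2 (stonith:external/ipmi): Started pg1
p_ston_pg3 (stonith:external/ipmi): Started pg1
"""

# resource name, "(stonith:<agent>):", the word Started, then the node name
LINE_RE = re.compile(r"^\s*(\S+)\s+\(stonith:[^)]+\):\s*Started\s+(\S+)\s*$")


def stonith_locations(text):
    """Map each started STONITH resource to the node it is running on."""
    locations = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            resource, node = match.groups()
            locations[resource] = node
    return locations


def runs_on_own_target(locations, prefix="p_ston_"):
    """List resources running on the node they are assumed to fence.

    The target is derived from the resource name by stripping prefix,
    which is an assumption about naming, not something Pacemaker enforces.
    """
    return [
        resource
        for resource, node in locations.items()
        if resource.startswith(prefix) and resource[len(prefix):] == node
    ]
```

With the sample above, stonith_locations(SAMPLE) maps each resource to its current node and runs_on_own_target reports none of them as running on its own target.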

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Salvatore D'angelo
is one). When I run “make install” nothing is created for the systemd environment. I am not a SysV vs systemd expert, so I hope I haven’t said anything wrong. > On 11 Jul 2018, at 18:40, Andrei Borzenkov wrote: > > 11.07.2018 18:08, Salvatore D'angelo wrote: >> Hi All, >> >> Af

Re: [ClusterLabs] Upgrade corosync problem

2018-07-06 Thread Salvatore D'angelo
:143: logsys_qb_init: Assertion `"implicit callsite section is populated, otherwise target's build is at fault, preventing reliable logging" && __start___verbose != __stop___verbose' failed. ) = 207 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0cd4

Re: [ClusterLabs] Upgrade corosync problem

2018-07-06 Thread Salvatore D'angelo
one) and then corosync, when I follow this order it does not work. This is what drives me crazy. I do not understand this behavior. > On 6 Jul 2018, at 14:40, Christine Caulfield wrote: > > On 06/07/18 13:24, Salvatore D'angelo wrote: >> Hi All, >> >> The option --ulim

Re: [ClusterLabs] Upgrade corosync problem

2018-07-06 Thread Salvatore D'angelo
sync/resourceagents upgrade it works fine. On 3 Jul 2018, at 11:42, Christine Caulfield wrote: > > On 03/07/18 07:53, Jan Pokorný wrote: >> On 02/07/18 17:19 +0200, Salvatore D'angelo wrote: >>> Today I tested the two suggestions you gave me. Here what I did. >>> In th

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Salvatore D'angelo
. For the moment the only fix I see is to manipulate these init.d scripts by myself hoping they will be fixed in pacemaker/corosync. > On 11 Jul 2018, at 23:18, Salvatore D'angelo wrote: > > Hi, > > I solved the issue (I am not sure to be honest) simply removing the > update

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-11 Thread Salvatore D'angelo
not know if the pacemaker build creates these services automatically and extra work is then required to make them available at boot. > On 11 Jul 2018, at 21:07, Andrei Borzenkov wrote: > > 11.07.2018 21:01, Salvatore D'angelo wrote: >> Yes, but doing what you suggested the system

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-12 Thread Salvatore D'angelo
Hi, I have a cluster on three bare-metal machines and I use two GlusterFS nodes to keep WAL files and backups, which are stored on an object store. I use Docker for testing purposes. Here is the possible upgrade scenario you can apply:

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-12 Thread Salvatore D'angelo
t once, or perhaps > to start solving the problems from the too distant point :-) > > As mentioned, that's also fine, but let's separate them... > > On 11/07/18 18:43 +0200, Salvatore D'angelo wrote: >>>>> On Wed, 2018-07-11 at 18:43 +0200, Salvatore D'angelo w

[ClusterLabs] crm --version shows "cam dev"

2018-07-04 Thread Salvatore D'angelo
Hi, With crmsh 2.2.0 the command: crm --version works fine. I downloaded 3.0.1 and it shows: crm dev I know this is not a big issue but I just wanted to verify I installed the correct version of crmsh.

[ClusterLabs] Install fresh pacemaker + corosync fails

2018-06-28 Thread Salvatore D'angelo
Hi All, I am here again. I am still fighting against upgrade problems but now I am trying to change the approach. I now want to try a fresh install of a new version of Corosync and Postgres to get it working. For the moment I am not interested in a specific configuration, just three nodes where I