[ClusterLabs] Pacemaker build error
I am getting a compile error when building Pacemaker on Linux version
2.6.32-431.el6.x86_64.

The build commands:

  git clone git://github.com/ClusterLabs/pacemaker.git
  cd pacemaker
  ./autogen.sh && ./configure --prefix=/usr --sysconfdir=/etc
  make
  make install

The compile error:

  Making install in services
  gmake[2]: Entering directory `/tmp/software/HA_linux/pacemaker/lib/services'
  CC libcrmservice_la-services.lo
  services.c: In function 'resources_action_create':
  services.c:153: error: 'svc_action_private_t' has no member named 'pending'
  services.c: In function 'services_action_create_generic':
  services.c:340: error: 'svc_action_private_t' has no member named 'pending'
  gmake[2]: *** [libcrmservice_la-services.lo] Error 1
  gmake[2]: Leaving directory `/tmp/software/HA_linux/pacemaker/lib/services'
  gmake[1]: *** [install-recursive] Error 1
  gmake[1]: Leaving directory `/tmp/software/HA_linux/pacemaker/lib'
  make: *** [install-recursive] Error 1

The 'pending' field that services.c is attempting to set is conditioned
on the SUPPORT_DBUS flag in services_private.h:

  pacemaker/lib/services/services_private.h

  #if SUPPORT_DBUS
      DBusPendingCall* pending;
      unsigned timerid;
  #endif

Am I building Pacemaker incorrectly, or should I open a defect for this
problem?

Jim VanOosten
jimvo at us.ibm.com

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
On 11/03/2015 01:40 PM, Nuno Pereira wrote:
>> -----Original Message-----
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Tuesday, 3 November 2015 18:02
>> To: Nuno Pereira; 'Cluster Labs - All topics related to open-source
>> clustering welcomed'
>> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>>
>> On 11/03/2015 05:38 AM, Nuno Pereira wrote:
>>>> -----Original Message-----
>>>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>>>> Sent: Monday, 2 November 2015 19:53
>>>> To: users@clusterlabs.org
>>>> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>>>>
>>>> On 11/02/2015 01:24 PM, Nuno Pereira wrote:
>>>>> Hi all.
>>>>>
>>>>> We have one cluster that has 9 nodes and 20 resources.
>>>>>
>>>>> Four of those hosts are PSIP-SRV01-active, PSIP-SRV01-passive,
>>>>> PSIP-SRV02-active and PSIP-SRV02-passive.
>>>>>
>>>>> They should provide an lsb:opensips service, 2 by 2:
>>>>>
>>>>> . The SRV01-opensips and SRV01-IP resources should be active on one
>>>>> of PSIP-SRV01-active or PSIP-SRV01-passive;
>>>>>
>>>>> . The SRV02-opensips and SRV02-IP resources should be active on one
>>>>> of PSIP-SRV02-active or PSIP-SRV02-passive.
>>>>>
>>>>> Everything works fine, until the moment that one of those nodes is
>>>>> rebooted. In the last case the problem occurred with a reboot of
>>>>> PSIP-SRV01-passive, which wasn't providing the service at that moment.
>>>>>
>>>>> To be noted that all opensips nodes had the opensips service set to
>>>>> start on boot by initd, which was removed in the meantime.
>>>>>
>>>>> The problem is that the service SRV01-opensips is detected as started
>>>>> on both PSIP-SRV01-active and PSIP-SRV01-passive, and SRV02-opensips
>>>>> is detected as started on both PSIP-SRV01-active and PSIP-SRV02-active.
>>>>>
>>>>> After that and several operations done by the cluster, which include
>>>>> actions to stop SRV01-opensips on both PSIP-SRV01-active and
>>>>> PSIP-SRV01-passive, and to stop SRV02-opensips on PSIP-SRV01-active
>>>>> and PSIP-SRV02-active, which fail on PSIP-SRV01-passive, the resource
>>>>> SRV01-opensips becomes unmanaged.
>>>>>
>>>>> Any ideas on how to fix this?
>>>>>
>>>>> Nuno Pereira
>>>>> G9Telecom
>>>>
>>>> Your configuration looks appropriate, so it sounds like something is
>>>> still starting the opensips services outside cluster control.
>>>> Pacemaker recovers from multiple running instances by stopping them
>>>> all, then starting on the expected node.
>>>
>>> Yesterday I removed pacemaker from starting on boot, and tested it:
>>> the problem persists.
>>> Also, I checked the logs and opensips wasn't started on the
>>> PSIP-SRV01-passive machine, the one that was rebooted.
>>> Is it possible to change that behaviour, as it is undesirable for our
>>> environment? For example, only to stop it on one of the hosts.
>>>
>>>> You can verify that Pacemaker did not start the extra instances by
>>>> looking for start messages in the logs (they will look like
>>>> "Operation SRV01-opensips_start_0" etc.).
>>>
>>> On the rebooted node I don't see 2 starts, but only 2 stops: the
>>> first failed, for the service that wasn't supposed to run there, and
>>> a normal one for the service that was supposed to run there:
>>>
>>> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: error:
>>> process_lrm_event: Operation SRV02-opensips_stop_0
>>> (node=PSIP-SRV01-passive, call=52, status=4, cib-update=23,
>>> confirmed=true) Error
>>> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: notice:
>>> process_lrm_event: Operation SRV01-opensips_stop_0: ok
>>> (node=PSIP-SRV01-passive, call=51, rc=0, cib-update=24, confirmed=true)
>>>
>>>> The other question is why did the stop command fail. The logs should
>>>> shed some light on that too; look for the equivalent "_stop_0"
>>>> operation and the messages around it. The resource agent might have
>>>> reported an error, or it might have timed out.
>>>
>>> I see this:
>>>
>>> Nov 02 23:01:24 [1689] PSIP-SRV01-passive lrmd: warning:
>>> operation_finished: SRV02-opensips_stop_0:1983 - terminated with
>>> signal 15
>>> Nov 02 23:01:24 [1689] PSIP-BBT01-passive lrmd: info: log_finished:
>>> finished - rsc: SRV02-opensips action:stop call_id:52 pid:1983
>>> exit-code:1 exec-time:79ms queue-time:0ms
>>>
>>> As can be seen above, the call_id for the failed stop is greater than
>>> the one that succeeded, but it finishes first.
>>> Also, as both operations are stopping the exact same service, the
>>> last one fails. And in the case of the one that fails, it wasn't
>>> supposed to be stopped or started on that host, as configured.
>>
>> I think I see what's happening. I overlooked that SRV01-opensips and
>> SRV02-opensips are using the same LSB init script. That means Pacemaker
>> can't distinguish one instance from the other.
Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
> -----Original Message-----
> From: Ken Gaillot [mailto:kgail...@redhat.com]
> Sent: Tuesday, 3 November 2015 18:02
> To: Nuno Pereira; 'Cluster Labs - All topics related to open-source
> clustering welcomed'
> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>
> On 11/03/2015 05:38 AM, Nuno Pereira wrote:
>>> -----Original Message-----
>>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>>> Sent: Monday, 2 November 2015 19:53
>>> To: users@clusterlabs.org
>>> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>>>
>>> On 11/02/2015 01:24 PM, Nuno Pereira wrote:
>>>> Hi all.
>>>>
>>>> We have one cluster that has 9 nodes and 20 resources.
>>>>
>>>> Four of those hosts are PSIP-SRV01-active, PSIP-SRV01-passive,
>>>> PSIP-SRV02-active and PSIP-SRV02-passive.
>>>>
>>>> They should provide an lsb:opensips service, 2 by 2:
>>>>
>>>> . The SRV01-opensips and SRV01-IP resources should be active on one
>>>> of PSIP-SRV01-active or PSIP-SRV01-passive;
>>>>
>>>> . The SRV02-opensips and SRV02-IP resources should be active on one
>>>> of PSIP-SRV02-active or PSIP-SRV02-passive.
>>>>
>>>> Everything works fine, until the moment that one of those nodes is
>>>> rebooted. In the last case the problem occurred with a reboot of
>>>> PSIP-SRV01-passive, which wasn't providing the service at that moment.
>>>>
>>>> To be noted that all opensips nodes had the opensips service set to
>>>> start on boot by initd, which was removed in the meantime.
>>>>
>>>> The problem is that the service SRV01-opensips is detected as started
>>>> on both PSIP-SRV01-active and PSIP-SRV01-passive, and SRV02-opensips
>>>> is detected as started on both PSIP-SRV01-active and PSIP-SRV02-active.
>>>>
>>>> After that and several operations done by the cluster, which include
>>>> actions to stop SRV01-opensips on both PSIP-SRV01-active and
>>>> PSIP-SRV01-passive, and to stop SRV02-opensips on PSIP-SRV01-active
>>>> and PSIP-SRV02-active, which fail on PSIP-SRV01-passive, the resource
>>>> SRV01-opensips becomes unmanaged.
>>>>
>>>> Any ideas on how to fix this?
>>>>
>>>> Nuno Pereira
>>>> G9Telecom
>>>
>>> Your configuration looks appropriate, so it sounds like something is
>>> still starting the opensips services outside cluster control.
>>> Pacemaker recovers from multiple running instances by stopping them
>>> all, then starting on the expected node.
>>
>> Yesterday I removed pacemaker from starting on boot, and tested it:
>> the problem persists.
>> Also, I checked the logs and opensips wasn't started on the
>> PSIP-SRV01-passive machine, the one that was rebooted.
>> Is it possible to change that behaviour, as it is undesirable for our
>> environment? For example, only to stop it on one of the hosts.
>>
>>> You can verify that Pacemaker did not start the extra instances by
>>> looking for start messages in the logs (they will look like
>>> "Operation SRV01-opensips_start_0" etc.).
>>
>> On the rebooted node I don't see 2 starts, but only 2 stops: the
>> first failed, for the service that wasn't supposed to run there, and
>> a normal one for the service that was supposed to run there:
>>
>> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: error:
>> process_lrm_event: Operation SRV02-opensips_stop_0
>> (node=PSIP-SRV01-passive, call=52, status=4, cib-update=23,
>> confirmed=true) Error
>> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: notice:
>> process_lrm_event: Operation SRV01-opensips_stop_0: ok
>> (node=PSIP-SRV01-passive, call=51, rc=0, cib-update=24, confirmed=true)
>>
>>> The other question is why did the stop command fail. The logs should
>>> shed some light on that too; look for the equivalent "_stop_0"
>>> operation and the messages around it. The resource agent might have
>>> reported an error, or it might have timed out.
>>
>> I see this:
>>
>> Nov 02 23:01:24 [1689] PSIP-SRV01-passive lrmd: warning:
>> operation_finished: SRV02-opensips_stop_0:1983 - terminated with
>> signal 15
>> Nov 02 23:01:24 [1689] PSIP-BBT01-passive lrmd: info: log_finished:
>> finished - rsc: SRV02-opensips action:stop call_id:52 pid:1983
>> exit-code:1 exec-time:79ms queue-time:0ms
>>
>> As can be seen above, the call_id for the failed stop is greater than
>> the one that succeeded, but it finishes first.
>> Also, as both operations are stopping the exact same service, the
>> last one fails. And in the case of the one that fails, it wasn't
>> supposed to be stopped or started on that host, as configured.
>
> I think I see what's happening. I overlooked that SRV01-opensips and
> SRV02-opensips are using the same LSB init script. That means Pacemaker
> can't distinguish one instance from the other. If it runs "status" for
> one instance, it will return "running" if *either* instance is running.
Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
On 11/03/2015 05:38 AM, Nuno Pereira wrote:
>> -----Original Message-----
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Monday, 2 November 2015 19:53
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>>
>> On 11/02/2015 01:24 PM, Nuno Pereira wrote:
>>> Hi all.
>>>
>>> We have one cluster that has 9 nodes and 20 resources.
>>>
>>> Four of those hosts are PSIP-SRV01-active, PSIP-SRV01-passive,
>>> PSIP-SRV02-active and PSIP-SRV02-passive.
>>>
>>> They should provide an lsb:opensips service, 2 by 2:
>>>
>>> . The SRV01-opensips and SRV01-IP resources should be active on one
>>> of PSIP-SRV01-active or PSIP-SRV01-passive;
>>>
>>> . The SRV02-opensips and SRV02-IP resources should be active on one
>>> of PSIP-SRV02-active or PSIP-SRV02-passive.
>>>
>>> Everything works fine, until the moment that one of those nodes is
>>> rebooted. In the last case the problem occurred with a reboot of
>>> PSIP-SRV01-passive, which wasn't providing the service at that moment.
>>>
>>> To be noted that all opensips nodes had the opensips service set to
>>> start on boot by initd, which was removed in the meantime.
>>>
>>> The problem is that the service SRV01-opensips is detected as started
>>> on both PSIP-SRV01-active and PSIP-SRV01-passive, and SRV02-opensips
>>> is detected as started on both PSIP-SRV01-active and PSIP-SRV02-active.
>>>
>>> After that and several operations done by the cluster, which include
>>> actions to stop SRV01-opensips on both PSIP-SRV01-active and
>>> PSIP-SRV01-passive, and to stop SRV02-opensips on PSIP-SRV01-active
>>> and PSIP-SRV02-active, which fail on PSIP-SRV01-passive, the resource
>>> SRV01-opensips becomes unmanaged.
>>>
>>> Any ideas on how to fix this?
>>>
>>> Nuno Pereira
>>> G9Telecom
>>
>> Your configuration looks appropriate, so it sounds like something is
>> still starting the opensips services outside cluster control.
>> Pacemaker recovers from multiple running instances by stopping them
>> all, then starting on the expected node.
>
> Yesterday I removed pacemaker from starting on boot, and tested it:
> the problem persists.
> Also, I checked the logs and opensips wasn't started on the
> PSIP-SRV01-passive machine, the one that was rebooted.
> Is it possible to change that behaviour, as it is undesirable for our
> environment? For example, only to stop it on one of the hosts.
>
>> You can verify that Pacemaker did not start the extra instances by
>> looking for start messages in the logs (they will look like
>> "Operation SRV01-opensips_start_0" etc.).
>
> On the rebooted node I don't see 2 starts, but only 2 stops: the
> first failed, for the service that wasn't supposed to run there, and
> a normal one for the service that was supposed to run there:
>
> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: error:
> process_lrm_event: Operation SRV02-opensips_stop_0
> (node=PSIP-SRV01-passive, call=52, status=4, cib-update=23,
> confirmed=true) Error
> Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: notice:
> process_lrm_event: Operation SRV01-opensips_stop_0: ok
> (node=PSIP-SRV01-passive, call=51, rc=0, cib-update=24, confirmed=true)
>
>> The other question is why did the stop command fail. The logs should
>> shed some light on that too; look for the equivalent "_stop_0"
>> operation and the messages around it. The resource agent might have
>> reported an error, or it might have timed out.
>
> I see this:
>
> Nov 02 23:01:24 [1689] PSIP-SRV01-passive lrmd: warning:
> operation_finished: SRV02-opensips_stop_0:1983 - terminated with
> signal 15
> Nov 02 23:01:24 [1689] PSIP-BBT01-passive lrmd: info: log_finished:
> finished - rsc: SRV02-opensips action:stop call_id:52 pid:1983
> exit-code:1 exec-time:79ms queue-time:0ms
>
> As can be seen above, the call_id for the failed stop is greater than
> the one that succeeded, but it finishes first.
> Also, as both operations are stopping the exact same service, the
> last one fails. And in the case of the one that fails, it wasn't
> supposed to be stopped or started on that host, as configured.

I think I see what's happening. I overlooked that SRV01-opensips and
SRV02-opensips are using the same LSB init script. That means Pacemaker
can't distinguish one instance from the other. If it runs "status" for
one instance, it will return "running" if *either* instance is running.
If it tries to stop one instance, that will stop whichever one is
running.

I don't know what version of Pacemaker you're running, but 1.1.13 has a
feature "resource-discovery" that could be used to make Pacemaker ignore
SRV01-opensips on the nodes that run SRV02-opensips, and vice versa:

http://blog.clusterlabs.org/blog/2014/feature-spotlight-controllable-resource-discovery/

Alternatively, you could clone the LSB resource instead of ha
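For concreteness, the resource-discovery option mentioned above is set per location constraint. A hypothetical CIB fragment for the layout in this thread (constraint IDs are invented; node and resource names are taken from the thread; requires Pacemaker 1.1.13 or later and has not been tested against this cluster):

```xml
<!-- Keep Pacemaker from ever probing or running SRV01-opensips on the
     SRV02 pair, and vice versa, so the shared init script's "status"
     output on the wrong nodes is never consulted. -->
<rsc_location id="loc-srv01-not-srv02a" rsc="SRV01-opensips"
              node="PSIP-SRV02-active" score="-INFINITY"
              resource-discovery="never"/>
<rsc_location id="loc-srv01-not-srv02p" rsc="SRV01-opensips"
              node="PSIP-SRV02-passive" score="-INFINITY"
              resource-discovery="never"/>
<rsc_location id="loc-srv02-not-srv01a" rsc="SRV02-opensips"
              node="PSIP-SRV01-active" score="-INFINITY"
              resource-discovery="never"/>
<rsc_location id="loc-srv02-not-srv01p" rsc="SRV02-opensips"
              node="PSIP-SRV01-passive" score="-INFINITY"
              resource-discovery="never"/>
```

The -INFINITY score bans the resource from the node, and resource-discovery="never" additionally suppresses the probe that would otherwise misreport the twin service as running there.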
Re: [ClusterLabs] restarting resources
On 11/02/2015 12:59 PM, - - wrote:
> Is there a way to just start Website8086 or just reload it, without
> affecting the other resources.

Hi,

Try:

  crm resource restart Website8086

or:

  crm resource stop Website8086
  crm resource start Website8086

It works for me (without stopping all the other resources in the same
group).

--
Jorge
Re: [ClusterLabs] restarting resources
On Mon, Nov 2, 2015 at 7:59 PM, - - wrote:
> Hi,
> I need to be able to restart a resource (e.g. apache) whenever a
> configuration file is updated. I have been using the 'crm resource
> restart' command to do it, which does restart the resource BUT also
> restarts my other resources.
> Is this normal behaviour?

Yes. If a resource is restarted, all dependent resources are also
restarted.

> Is there a way to just/force restart only the resource whose config
> file is changed?

Set the resource to unmanaged, reload its configuration outside of
Pacemaker, then manage the resource again.
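The unmanage-then-manage approach can be sketched as a short command sequence. This assumes crmsh and the resource names from this thread, and 'apachectl graceful' stands in for whatever reload mechanism the service actually offers; the commands need a live cluster, so treat this as a sketch, not a recipe:

```shell
# Take the resource out of Pacemaker's control; Pacemaker will not
# react to its state changes while it is unmanaged.
crm resource unmanage Website8086

# Reload the service itself, outside the cluster's control.
# apachectl graceful re-reads the config without a full stop/start.
apachectl graceful

# Return control to Pacemaker; the running instance is simply adopted.
crm resource manage Website8086
```

Because the resource never reports as stopped to Pacemaker, the dependent group members after it are left alone.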
[ClusterLabs] restarting resources
Hi,

I need to be able to restart a resource (e.g. apache) whenever a
configuration file is updated. I have been using the 'crm resource
restart' command to do it, which does restart the resource BUT also
restarts my other resources. Is this normal behaviour? Is there a way
to just/force restart only the resource whose config file is changed?

I have the following resources configured in a group (ew):

 Resource Group: ew
     fs_drbd            (ocf::heartbeat:Filesystem):    Started ew-2
     VirtualIP          (ocf::heartbeat:IPaddr2):       Started ew-2
     mysqld-1           (ocf::heartbeat:mysql):         Started ew-2
     mysqld-2           (ocf::heartbeat:mysql):         Started ew-2
     Website8084        (ocf::heartbeat:apache):        Started ew-2
     Website8083        (ocf::heartbeat:apache):        Started ew-2
     Website8085        (ocf::heartbeat:apache):        Started ew-2
     Website8086        (ocf::heartbeat:apache):        Started ew-2
     tomcat8-9092       (ocf::heartbeat:tomcat):        Started ew-2
     tomcat7-9091       (ocf::heartbeat:tomcat):        Started ew-2
     tomcat5-9090       (ocf::heartbeat:tomcat):        Started ew-2
     pgsqld-1           (ocf::heartbeat:pgsql):         Started ew-2
     Website8080        (ocf::heartbeat:apache):        Started ew-2
     mail               (ocf::heartbeat:MailTo):        Started ew-2
     p_sym_cron_rsync   (ocf::heartbeat:symlink):       Started ew-2

Now, if I restart say 'Website8086', all the other resources defined
after Website8086 in the listing above (tomcat8-9092 ... to
p_sym_cron_rsync) get restarted as well. This can take a while to
complete.

Is there a way to just start Website8086 or just reload it, without
affecting the other resources?

Thanks
krishan
Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
> -----Original Message-----
> From: Ken Gaillot [mailto:kgail...@redhat.com]
> Sent: Monday, 2 November 2015 19:53
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Multiple OpenSIPS services on one cluster
>
> On 11/02/2015 01:24 PM, Nuno Pereira wrote:
>> Hi all.
>>
>> We have one cluster that has 9 nodes and 20 resources.
>>
>> Four of those hosts are PSIP-SRV01-active, PSIP-SRV01-passive,
>> PSIP-SRV02-active and PSIP-SRV02-passive.
>>
>> They should provide an lsb:opensips service, 2 by 2:
>>
>> . The SRV01-opensips and SRV01-IP resources should be active on one
>> of PSIP-SRV01-active or PSIP-SRV01-passive;
>>
>> . The SRV02-opensips and SRV02-IP resources should be active on one
>> of PSIP-SRV02-active or PSIP-SRV02-passive.
>>
>> Everything works fine, until the moment that one of those nodes is
>> rebooted. In the last case the problem occurred with a reboot of
>> PSIP-SRV01-passive, which wasn't providing the service at that moment.
>>
>> To be noted that all opensips nodes had the opensips service set to
>> start on boot by initd, which was removed in the meantime.
>>
>> The problem is that the service SRV01-opensips is detected as started
>> on both PSIP-SRV01-active and PSIP-SRV01-passive, and SRV02-opensips
>> is detected as started on both PSIP-SRV01-active and PSIP-SRV02-active.
>>
>> After that and several operations done by the cluster, which include
>> actions to stop SRV01-opensips on both PSIP-SRV01-active and
>> PSIP-SRV01-passive, and to stop SRV02-opensips on PSIP-SRV01-active
>> and PSIP-SRV02-active, which fail on PSIP-SRV01-passive, the resource
>> SRV01-opensips becomes unmanaged.
>>
>> Any ideas on how to fix this?
>>
>> Nuno Pereira
>> G9Telecom
>
> Your configuration looks appropriate, so it sounds like something is
> still starting the opensips services outside cluster control.
> Pacemaker recovers from multiple running instances by stopping them
> all, then starting on the expected node.

Yesterday I removed pacemaker from starting on boot, and tested it: the
problem persists.
Also, I checked the logs and opensips wasn't started on the
PSIP-SRV01-passive machine, the one that was rebooted.
Is it possible to change that behaviour, as it is undesirable for our
environment? For example, only to stop it on one of the hosts.

> You can verify that Pacemaker did not start the extra instances by
> looking for start messages in the logs (they will look like "Operation
> SRV01-opensips_start_0" etc.).

On the rebooted node I don't see 2 starts, but only 2 stops: the first
failed, for the service that wasn't supposed to run there, and a normal
one for the service that was supposed to run there:

Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: error:
process_lrm_event: Operation SRV02-opensips_stop_0
(node=PSIP-SRV01-passive, call=52, status=4, cib-update=23,
confirmed=true) Error
Nov 02 23:01:24 [1692] PSIP-SRV01-passive crmd: notice:
process_lrm_event: Operation SRV01-opensips_stop_0: ok
(node=PSIP-SRV01-passive, call=51, rc=0, cib-update=24, confirmed=true)

> The other question is why did the stop command fail. The logs should
> shed some light on that too; look for the equivalent "_stop_0"
> operation and the messages around it. The resource agent might have
> reported an error, or it might have timed out.

I see this:

Nov 02 23:01:24 [1689] PSIP-SRV01-passive lrmd: warning:
operation_finished: SRV02-opensips_stop_0:1983 - terminated with signal 15
Nov 02 23:01:24 [1689] PSIP-BBT01-passive lrmd: info: log_finished:
finished - rsc: SRV02-opensips action:stop call_id:52 pid:1983
exit-code:1 exec-time:79ms queue-time:0ms

As can be seen above, the call_id for the failed stop is greater than
the one that succeeded, but it finishes first.
Also, as both operations are stopping the exact same service, the last
one fails. And in the case of the one that fails, it wasn't supposed to
be stopped or started on that host, as configured.

Might it be related to any problem with the init.d script of opensips,
like an invalid result code, or something? I checked
http://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
and didn't find any problem, but I might have missed some use case.

Nuno Pereira
G9Telecom