Re: [ClusterLabs] Antw: Re: Antw: Delayed first monitoring

2015-08-13 Thread Digimer
On 13/08/15 04:38 AM, Ulrich Windl wrote:
> Miloš Kozák wrote on 13.08.2015 at 09:56 in message
> <55cc4daa.4020...@lejmr.com>:
>
>>
>> On 13.8.2015 at 09:26, Andrei Borzenkov wrote:
>>> On Thu, Aug 13, 2015 at 10:01 AM, Miloš Kozák wrote:
>>>> However, this does not make sense at all. Presumably, Pacemaker should
>>>> get along with LSB scripts that come from the system repository, right?
>>>>
>>> Let's forget about pacemaker for a moment. You have system startup
>>> where service B needs service A. initscript for service A completes
>>> and script for service B is started but service A is not yet ready to
>>> be used.
>>>
>>> This is a bug in the startup script, irrespective of whether you use it
>>> with pacemaker or not.
>>
>> I am sorry, but I didn't get the point.
>>
>> If service A is not ready then service B should not be started. 
> 
> As you seem to be ignorant for advice:

Ok, I'm starting to get annoyed now. You need to be more polite and
respectful on this list.

> Yes, you are right: Service B should check whether service A is up before
> starting itself.
> The easy change for the start script of B is to find out what command was run
> before it and then to check for itself whether that command did everything
> OK.
> 
> [...]
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?


Re: [ClusterLabs] Antw: Re: Antw: Delayed first monitoring

2015-08-13 Thread Jan Pokorný
On 13/08/15 10:38 +0200, Ulrich Windl wrote:
> Miloš Kozák wrote on 13.08.2015 at 09:56 in message
> <55cc4daa.4020...@lejmr.com>:
>
>>
>> On 13.8.2015 at 09:26, Andrei Borzenkov wrote:
>>> On Thu, Aug 13, 2015 at 10:01 AM, Miloš Kozák wrote:
>>>> However, this does not make sense at all. Presumably, Pacemaker
>>>> should get along with LSB scripts that come from the system
>>>> repository, right?
>>>>
>>> Let's forget about pacemaker for a moment. You have system startup
>>> where service B needs service A. initscript for service A completes
>>> and script for service B is started but service A is not yet ready to
>>> be used.
>>> 
>>> This is a bug in the startup script, irrespective of whether you use it
>>> with pacemaker or not.
>>
>> I am sorry, but I didn't get the point.
>> 
>> If service A is not ready then service B should not be started. 
> 
> As you seem to be ignorant for advice:
> Yes, you are right: Service B should check whether service A is up before
> starting itself.
> The easy change for the start script of B is to find out what command was run
> before it and then to check for itself whether that command did everything
> OK.
> 
> [...]

The harder task in the environment sketched above (relaxed, i.e. not
strictly serialized with respect to prerequisite ordering) is for
service B, which knows about its predecessor A, to also decide whether
A is merely still in the middle of its startup sequence. That requires
very detailed knowledge of A's internals and is prone to race
conditions anyway.

Hence reasonable, high-level init systems require such startup
sequences to be completely finished by the time they acknowledge the
service at hand as "started" and allow its prerequisite-ordered
successor to join the game too.  Consequently, the responsibility for
answering "is the startup finished (successfully or not)?" is deferred
to the lower-level, dedicated startup recipes, which should then signal
this back to the init system credibly (e.g., by returning only when
the startup is over) to prevent mess-ups.

Going full circle, if this assumption is broken in the httpd
initscript, it should be fixed.
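
To illustrate "returning only when the startup is over", here is a
minimal sketch of a polling helper that a start action could call
before it reports success. The name wait_until_ready, the timeout and
the probe command are all assumptions made up for this example; they
are not taken from the packaged httpd initscript.

```shell
#!/bin/sh
# Hypothetical helper: poll a readiness probe until it succeeds or a
# timeout (in seconds) expires.
# Usage: wait_until_ready TIMEOUT_SECONDS PROBE_COMMAND [ARGS...]
wait_until_ready() {
    timeout=$1
    shift
    elapsed=0
    # "$@" is the probe; exit status 0 means "the service is usable".
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # still not ready: report the start as failed
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Sketch of how a start action might use it (placeholder commands):
#   start() {
#       launch_the_daemon || return 1
#       wait_until_ready 30 probe_that_the_daemon_answers
#   }
```

With something like this in place, the start action returns 0 only once
the probe succeeds, so a monitor issued right after "start" no longer
races with the daemon's own startup.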

-- 
Jan (Poki)




[ClusterLabs] Antw: Re: Antw: Delayed first monitoring

2015-08-13 Thread Ulrich Windl
>>> Miloš Kozák wrote on 13.08.2015 at 09:56 in message
<55cc4daa.4020...@lejmr.com>:

>
> On 13.8.2015 at 09:26, Andrei Borzenkov wrote:
>> On Thu, Aug 13, 2015 at 10:01 AM, Miloš Kozák wrote:
>>> However, this does not make sense at all. Presumably, Pacemaker should
>>> get along with LSB scripts that come from the system repository, right?
>>>
>> Let's forget about pacemaker for a moment. You have system startup
>> where service B needs service A. initscript for service A completes
>> and script for service B is started but service A is not yet ready to
>> be used.
>>
>> This is a bug in the startup script, irrespective of whether you use it
>> with pacemaker or not.
>
> I am sorry, but I didn't get the point.
> 
> If service A is not ready then service B should not be started. 

As you seem to be ignorant for advice:
Yes, you are right: Service B should check whether service A is up before
starting itself.
The easy change for the start script of B is to find out what command was run
before it and then to check for itself whether that command did everything
OK.

[...]




[ClusterLabs] Antw: Re: Antw: Delayed first monitoring

2015-08-13 Thread Ulrich Windl
>>> Miloš Kozák wrote on 13.08.2015 at 09:01 in message
<55cc40b1.7090...@lejmr.com>:
> However, this does not make sense at all. Presumably, Pacemaker should get
> along with LSB scripts that come from the system repository, right?

I don't think pacemaker is a technology to handle broken start scripts.
Being able to use LSB scripts was probably there to manage something without
having an OCF RA for it, but we generally don't use LSB scripts for HA here.

> 
> Therefore, there is no way to modify the LSB script, because changes to
> the LSB script are erased after every package update.

That's double nonsense: Contact your support to get the scripts fixed. Apart
from that, there is absolutely no reason not to copy the script under a
different name and fix the copy.
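
A minimal sketch of that copy step, assuming the packaged script lives at
/etc/init.d/httpd. The helper name clone_initscript and the target name
httpd-ha are made up for illustration; any file name that no package owns
would do.

```shell
#!/bin/sh
# Hypothetical helper: copy an init script under a new name, keeping its
# mode and timestamps, and refuse to clobber an existing file.
# Usage: clone_initscript SOURCE DESTINATION
clone_initscript() {
    src=$1
    dst=$2
    if [ ! -f "$src" ]; then
        echo "no such script: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite: $dst" >&2
        return 1
    fi
    cp -p "$src" "$dst"
}

# On a cluster node this might be used roughly as follows:
#   clone_initscript /etc/init.d/httpd /etc/init.d/httpd-ha
#   (edit /etc/init.d/httpd-ha so "start" returns only once httpd is up)
#   pcs resource create httpd lsb:httpd-ha
```

Because the copy is not owned by the httpd package, a package update
leaves the fixed version alone.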

> 
> 
> I believe the systematic approach would be to introduce delayed
> monitoring or something like it into Pacemaker. I am rather surprised that
> nobody has run into this problem already.

Maybe because there's an apache RA?
# man -k apache | grep -i OCF
ocf_heartbeat_apache (7) - Manages an Apache Web server instance
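
A rough sketch of what a resource using that agent could look like in place
of lsb:httpd. configfile and statusurl are parameters of the apache agent,
but the values below are assumptions for a stock CentOS httpd, and the
status URL only answers if mod_status is enabled. The function merely
prints the command so it can be reviewed before it is run on a node.

```shell
#!/bin/sh
# Print a pcs command that creates an Apache resource backed by the OCF
# agent. $1 = resource name (defaults to "httpd"). Paths, URL and the
# monitor interval are illustrative assumptions, not recommended values.
apache_resource_cmd() {
    name=${1:-httpd}
    printf '%s\n' "pcs resource create $name ocf:heartbeat:apache \
configfile=/etc/httpd/conf/httpd.conf \
statusurl=http://localhost/server-status \
op monitor interval=30s"
}

# Review the output, then run the printed command on one cluster node.
apache_resource_cmd httpd
```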

Regards,
Ulrich

> 
> 
> Milos
> 
> 
> 
> 
> 
> On 13.8.2015 at 08:44, Ulrich Windl wrote:
>> I think the start script has to be fixed to return success when httpd is
>> actually running.
>>
>> Miloš Kozák wrote on 12.08.2015 at 16:03 in message
>> <55cb521a.8090...@lejmr.com>:
>>> Hi,
>>>
>>> I have set up Corosync+CMAN+Pacemaker on CentOS 6.5 in order to
>>> provide high availability for OpenNebula. However, I am facing a
>>> strange problem which arises from my lack of knowledge.
>>>
>>> In the log I can see that when I create a resource based on an init
>>> script, typically:
>>>
>>> pcs resource create httpd lsb:httpd
>>>
>>> The httpd daemon gets started, but the monitor is initiated at the same
>>> time and the resource is identified as not running. This behaviour makes
>>> sense once we realize that starting the daemon takes some time. In this
>>> particular case, I get error code 2, which means that the process is
>>> running but the environment is not locked. The effect of this is that the
>>> httpd resource gets restarted.
>>>
>>> My workaround is an extra sleep in the status function of the init script,
>>> but I don't like this solution at all! Do you have any idea how to tackle
>>> this problem in a proper way? I expected an op attribute which would specify
>>> a delay between service start and the first monitoring, but I could not
>>> find it.
>>>
>>> Thank you, Milos