Re: [Linux-HA] if promote runs into timeout

Maloja01 Tue, 10 Jan 2012 07:58:43 -0800

On 01/10/2012 04:46 PM, erkan yanar wrote:
> Hi Fabian,
> 
> with unmanaged I lose all the other nice features I would like to have and
> why I use pacemaker.

It depends. I am using pacemaker to get take-overs when hardware fails.
But when I need to "block" because a resource operation can take a (nearly)
unlimited amount of time, then unmanaged makes sense and works.
The other way would be to find out what the longest acceptable timeout
would be (from the end-customer perspective).

Example:
Let's say that a typical start takes 10 seconds. You set the timeout to
15 seconds, and so it /should/ work in most cases. Unfortunately, in your
case (as I understand it) the operation could also take 100 seconds or one
hour. What speaks against setting the timeout to 10 hours and defining that
as really the maximum? That is static and could be a problem once the
10 hours are over, but on the other hand a resource start should not
take 10 hours to get the end-customer's service working. In such a case
even pacemaker could not do anything.
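
For the unmanaged route mentioned above (as opposed to one very long static
timeout), a rough sketch of how the switch could be scripted follows. It
assumes the crm shell is installed and uses its "resource unmanage" and
"resource manage" subcommands; the helper name unmanaged() and the resource
name in the usage comment are made up for illustration. While the resource is
unmanaged, pacemaker will not react to its failures, so the window should be
kept as short as the job allows.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def unmanaged(resource, run=subprocess.run):
    """Keep `resource` unmanaged by pacemaker while the with-block runs.

    `run` is injectable so the sketch can be exercised without a cluster.
    """
    # Ask pacemaker to stop managing the resource before the long job starts.
    run(["crm", "resource", "unmanage", resource], check=True)
    try:
        yield
    finally:
        # Always hand control back to pacemaker, even if the job failed.
        run(["crm", "resource", "manage", resource], check=True)


# Usage (hypothetical resource name and job):
#
#     with unmanaged("ms_mysql"):
#         apply_replication_backlog()
```

The try/finally is the important part: if the long-running work raises, the
resource is still set back to managed instead of silently staying outside
pacemaker's control.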

Kind regards
Fabian

> 
> But nice tip!
> 
> Thx
> Erkan :)
> 
> On Sat, Jan 07, 2012 at 10:22:32AM +0100, Maloja01 wrote:
>> In another customer setup we decided to set a resource to status
>> "unmanaged" when it has to do some special work which should not be
>> interrupted. After the replication (in our case redo logs in a backup db)
>> we set the resource to be managed again.
>>
>> I have never tried to change already triggered timeouts.
>>
>> Kind regards
>> Fabian
>>
>>
>> On 01/06/2012 06:14 PM, erkan yanar wrote:
>>>
>>> Moin,
>>>
>>> I'm having the issue that promoting a master can run into the promote
>>> timeout.
>>> After that, the resource is stopped and started as a slave.
>>>
>>> In my example it is a mysql resource, where promoting waits for any
>>> replication lag to be applied. This could last a very long time.
>>>
>>> There are some thoughts on that issue:
>>> 1. Dynamically increase the timeout with cibadmin. I haven't tested that
>>> yet. Would this work?
>>> 2. on-fail=ignore
>>> With ignore, the resource is not restarted. But I don't like that approach.
>>>
>>> Is there an intelligent approach to dynamically change the timeout while 
>>> promoting?
>>> Or is there a better approach anyway?
>>>
>>>
>>> Regards
>>> Erkan
>>>
>>>
>>>
>>>
>>>
>>

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
