On Thu, Mar 5, 2009 at 15:24, Harakiri <[email protected]> wrote:
>
> --- On Thu, 3/5/09, Andrew Beekhof <[email protected]> wrote:
>
>> From: Andrew Beekhof <[email protected]>
>> Subject: Re: [Linux-HA] crm_mon vs cl_status
>> To: [email protected]
>> Cc: "Linux-HA mailing list" <[email protected]>
>> Date: Thursday, March 5, 2009, 6:46 AM
>> On Mar 5, 2009, at 12:39 PM, Harakiri wrote:
>> >>
>> >> YES it _is_.
>> >> The log messages above indicate the order heartbeat starts them in -
>> >> anything after that is up to the scheduler of your OS.
>> >>
>> >> Regardless, the crmd and cib both have loops that retry opening
>> >> connections to the services they require - with the possible exception
>> >> of the cluster itself.
>> >
>> > But these loops don't work - as I said, on other systems like Debian the
>> > processes are started in the right order, but not here.
>> >
>> > I can manually fix the opening of pipes by adding a while loop to
>> > ipcsocket.c for the case where the pipe does not exist yet. If the
>> > callers loop by themselves to try again, why isn't it working? I don't
>> > see any reference to a loop to
>> >
>> > struct IPC_CHANNEL *
>> > socket_client_channel_new(GHashTable *ch_attrs)
>> >
>> > where is it?
>>
>> The loops I'm talking about are at a much higher level - I have no
>> knowledge of how the IPC code works.
>> E.g. do_cib_control() arranges for the crmd to try connecting to the cib
>> up to 30 times before giving up.
>>
>> It sounds like the Solaris equivalent of socket_client_channel_new()
>> isn't failing properly.
>
> Yes - when I compile on sparc10 with sockets enabled instead of pipes, the
> loops work:
>
> cib[19975]: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Activation failed
> cib[19975]: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Connection failed 1 times (30 max)
> cib[19975]: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Activation failed
> cib[19975]: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Connection failed 2 times (30 max)
>
> But this never happens when pipes are used. Since pipes are also handled
> in the same socket_client_channel_new(), there is no difference: if either
> the socket or the pipe fails, NULL is returned. I found the retry code in
> crm/crmd/ccm.c, and I have no idea why it would fail - maybe an exception
> is thrown somewhere in between?!

No such thing in C.
Perhaps it's the function that calls socket_client_channel_new() that's
the problem... or the one that calls that...
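
To make the point concrete, here is a minimal sketch of the kind of
caller-side retry loop being discussed. It is not the actual crmd/cib code:
fake_channel, fake_channel_new() and connect_with_retry() are hypothetical
names, and the "server" is simulated with a counter. The loop can only retry
if the connect call returns NULL on failure. If that call blocks instead, or
if a caller further up ignores or mishandles the NULL, the retry never
happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a channel; NOT the real struct IPC_CHANNEL. */
struct fake_channel { int fd; };

/* Simulation state (assumption): the "server" accepts connections only
 * once this many attempts have been made. */
static int server_ready_after = 3;
static int attempts_seen = 0;

/* Hypothetical connect function. It returns NULL while the server is not
 * ready, which is how a connect call is expected to report failure when
 * the socket or pipe does not exist yet. */
static struct fake_channel *fake_channel_new(void)
{
    static struct fake_channel ch = { 42 };

    attempts_seen++;
    return (attempts_seen >= server_ready_after) ? &ch : NULL;
}

/* Caller-side retry: try up to max_tries times, then give up with NULL.
 * Errors travel only through the return value, so every caller in the
 * chain has to check it. */
static struct fake_channel *connect_with_retry(int max_tries)
{
    assert(max_tries > 0);

    for (int tries = 1; tries <= max_tries; tries++) {
        struct fake_channel *ch = fake_channel_new();

        if (ch != NULL) {
            return ch;
        }
        fprintf(stderr, "connection failed %d times (%d max)\n",
                tries, max_tries);
        /* A real caller would sleep or re-arm a timer here before retrying. */
    }
    return NULL;
}
```

Calling connect_with_retry(30) with the defaults above succeeds on the third
attempt. If fake_channel_new() were changed to block or to return a non-NULL
but unusable channel, the loop would never log a failure, which would match
the symptom seen with pipes.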
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
