On Thu, Mar 5, 2009 at 15:24, Harakiri <[email protected]> wrote:
>
> --- On Thu, 3/5/09, Andrew Beekhof <[email protected]> wrote:
>
>> From: Andrew Beekhof <[email protected]>
>> Subject: Re: [Linux-HA] crm_mon vs cl_status
>> To: [email protected]
>> Cc: "Linux-HA mailing list" <[email protected]>
>> Date: Thursday, March 5, 2009, 6:46 AM
>> On Mar 5, 2009, at 12:39 PM, Harakiri wrote:
>>
>> >> YES it _is_.
>> >> The log messages above indicate the order heartbeat starts
>> >> them in - anything after that is up to the scheduler of your OS.
>> >>
>> >> Regardless, the crmd and cib both have loops that retry opening
>> >> connections to the services they require - with the possible
>> >> exception of the cluster itself.
>> >
>> > But these loops dont work - as i said on other systems like debian
>> > the processes are executed in the right order but not here.
>> >
>> > I can manually fix the opening of pipes with adding a while loop
>> > ipcsocket.c when the pipe does not exist yet - if they would loop
>> > itself to try again - why isnt it working ? i dont see any
>> > reference to a loop to
>> >
>> > struct IPC_CHANNEL *
>> > socket_client_channel_new(GHashTable *ch_attrs)
>> >
>> > where is it?
>>
>> the loops i'm talking about are at a much higher level - i've no
>> knowledge of how the IPC code works.
>> eg. do_cib_control() arranges for the crmd to try connecting to the
>> cib up to 30 times before giving up.
>>
>> it sounds like the solaris equivalent of socket_client_channel_new()
>> isnt failing properly.
>
> Yes - when i compile on sparc10 with sockets enabled instead of pipes the
> loops are working :
>
> cib[19975]: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Activation failed
> cib[19975]: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Connection failed 1 times (30 max)
> cib[19975]: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Activation failed
> cib[19975]: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Connection failed 2 times (30 max)
>
> but this never happens when pipes are used, since pipes are also controlled
> in the same socket_client_channel_new there is no difference - if either
> socket or pipes fail NULL is returned - in crm/crmd/ccm.c i found the retry
> code - i have no idea why it would fail - maybe an exception is thrown
> somewhere in between?!

No such thing in C.
Perhaps it's the function that calls socket_client_channel_new() that's
the problem... or the one that calls that...
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
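
For illustration only, here is a minimal sketch of the kind of bounded
retry loop discussed in this thread. It is not the actual crmd/cib or
ipcsocket.c code. Every name in it (demo_channel, demo_connect_fn,
demo_connect_with_retry, demo_flaky_connect) is hypothetical, and it
only assumes what was said above: the connect call returns NULL on
failure and the caller retries up to a fixed maximum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a channel handle; NOT the real struct IPC_CHANNEL. */
struct demo_channel {
    int fd;
};

/* Hypothetical connect callback: returns NULL on failure, which is the
 * behaviour described for socket_client_channel_new() in this thread. */
typedef struct demo_channel *(*demo_connect_fn)(void *ctx);

/* Call connect_fn up to max_attempts times and return the first non-NULL
 * result, or NULL if every attempt failed. C has no exceptions, so a
 * failure is only visible through the return value, and every caller in
 * the chain has to check it and pass it on. */
struct demo_channel *
demo_connect_with_retry(demo_connect_fn connect_fn, void *ctx,
                        int max_attempts, int *attempts_used)
{
    int attempt;

    assert(connect_fn != NULL && max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        struct demo_channel *ch = connect_fn(ctx);

        if (ch != NULL) {
            if (attempts_used != NULL) {
                *attempts_used = attempt;
            }
            return ch;
        }
        fprintf(stderr, "demo: connection failed %d times (%d max)\n",
                attempt, max_attempts);
        /* A real daemon would wait here (timer or sleep) before retrying. */
    }

    if (attempts_used != NULL) {
        *attempts_used = max_attempts;
    }
    return NULL;
}

/* Demo callback: fails while the counter behind ctx is above zero,
 * then succeeds and hands back a static dummy channel. */
struct demo_channel *
demo_flaky_connect(void *ctx)
{
    static struct demo_channel ch = { 3 };
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return NULL;
    }
    return &ch;
}
```

A loop like this can only retry if the lower-level call actually returns
NULL. If that call blocks, or if some function in between drops the
result, the second attempt never happens and no "failed N times" message
is logged.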
