Hi,

On Tue, Aug 17, 2010 at 04:50:27PM +0900, Simon Horman wrote:
> On Wed, Jul 21, 2010 at 01:41:09AM -0600, Tim Serong wrote:
> > Hi All,
> > 
> > A while ago (April, from memory), there was an ABI change in
> > clplumbing in cluster-glue.  Presumably this went mostly unnoticed
> > in general usage, however I have twice seen systems where the cluster
> > could not run because of a missing (or incorrect) libglue2 package.
> > One was my development system, with a dodgy build, the other was
> > mentioned on #linux-ha yesterday, and was the result of ignoring a
> > conflict error when installing the pacemaker RPM on openSUSE.  So,
> > let me be clear, this is not something anyone should need to worry
> > about...  But I thought I'd mention it here, because the error
> > messages you get are, IMO, not very obvious.
> > 
> > Symptoms of a mismatched pacemaker/libglue build are errors like:
> > 
> >   lrmd: [3004]: ERROR:
> >     main: can not create wait connection for command.
> >   lrmd: [3004]: ERROR:
> >     Startup aborted (can't create comm channel).  Shutting down.
> >   ...
> >   pengine: [4011]: ERROR:
> >     init_client_ipc_comms_nodispatch: Could not access channel on:
> >     /var/run/crm/pengine
> >   corosync[4000]: [pcmk  ] ERROR:
> >     pcmk_wait_dispatch: Child process pengine exited (pid=4011, rc=1)
> >   corosync[4000]: [pcmk  ] notice:
> >     pcmk_wait_dispatch: Respawning failed child process: pengine
> > 
> > If your cluster won't start and you see this in /var/log/messages,
> > make sure libglue2 is up to date.  And now that I've mentioned this
> > here and it's made it to the mailing list archive, Google will know,
> > and nobody else will ever have this problem again.
> > 
> > This has been a public service announcement.  Thank you for reading.
> 
> Could we get the .so bumped accordingly in the next release of
> cluster glue? That would at least help in managing the problem
> once the new release has been made.

I don't think that is necessary. The ABI change in the
_released_ cluster-glue packages was made in such a way as not to
disturb existing pacemaker installations, i.e. by adding
fields to the end of the struct. Further, the library version was
bumped to 3:0:1 (with libtool's -version-info) at the time.
On Linux, libtool maps current:revision:age to the suffix
(current - age).age.revision, so that translates to so.2.1.0.
Users of the new ABI also use domain sockets of the new type if
they want the new functionality.
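
As a rough illustration of how 3:0:1 ends up as so.2.1.0, here is a
minimal sketch of the mapping, assuming the Linux (ELF) naming
convention. The function names are made up for this example and are
not part of libtool.

```python
def libtool_linux_suffix(version_info: str) -> str:
    """Map libtool's current:revision:age to the Linux file suffix.

    Assumes the Linux/ELF convention: (current - age).age.revision.
    All three numbers must be given explicitly in this sketch.
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not be greater than current")
    major = current - age
    return f"so.{major}.{age}.{revision}"


def libtool_linux_soname_suffix(version_info: str) -> str:
    """Return only the soname part (so.MAJOR), which is what the
    dynamic linker matches on at run time."""
    return ".".join(libtool_linux_suffix(version_info).split(".")[:2])


if __name__ == "__main__":
    print(libtool_linux_suffix("3:0:1"))
    print(libtool_linux_soname_suffix("3:0:1"))
```

Because age is 1, the soname major number stays at 2, which is how
binaries linked against the previous interface keep loading the new
library.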

I guess that what Tim was seeing was Pacemaker built against the
unreleased glue versions, which did have a different ABI, i.e. the
fields were inserted somewhere in the middle of the struct.
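
To show the difference between the two kinds of change, here is a
small sketch using Python's ctypes. The struct and field names are
hypothetical and do not come from clplumbing; the point is only
that appending a field leaves the offsets of the old fields alone,
while inserting one in the middle shifts them.

```python
import ctypes


class OldStruct(ctypes.Structure):
    # Hypothetical layout that existing binaries were compiled against.
    _fields_ = [("status", ctypes.c_int), ("refcount", ctypes.c_int)]


class AppendedStruct(ctypes.Structure):
    # New field added at the end: old fields keep their offsets.
    _fields_ = [
        ("status", ctypes.c_int),
        ("refcount", ctypes.c_int),
        ("new_field", ctypes.c_int),
    ]


class InsertedStruct(ctypes.Structure):
    # New field inserted in the middle: "refcount" moves.
    _fields_ = [
        ("status", ctypes.c_int),
        ("new_field", ctypes.c_int),
        ("refcount", ctypes.c_int),
    ]


def field_offsets(struct_type):
    """Return a {field name: byte offset} map for a ctypes Structure."""
    return {name: getattr(struct_type, name).offset
            for name, _ctype in struct_type._fields_}


def old_fields_unmoved(old_type, new_type):
    """True if every field of old_type sits at the same offset in new_type."""
    new_offsets = field_offsets(new_type)
    return all(new_offsets.get(name) == offset
               for name, offset in field_offsets(old_type).items())


if __name__ == "__main__":
    print("appended keeps old offsets:",
          old_fields_unmoved(OldStruct, AppendedStruct))
    print("inserted keeps old offsets:",
          old_fields_unmoved(OldStruct, InsertedStruct))
```

Note that appending still changes the size of the struct, so this
only stays compatible as long as old callers do not allocate or copy
the struct themselves based on the old size.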

Thanks,

Dejan

> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems