On Tue, Aug 17, 2010 at 06:12:04PM -0600, Tim Serong wrote:
> On 8/18/2010 at 09:03 AM, Simon Horman <[email protected]> wrote: 
> > On Tue, Aug 17, 2010 at 03:06:45PM +0200, Dejan Muhamedagic wrote: 
> > > Hi, 
> > >  
> > > On Tue, Aug 17, 2010 at 04:50:27PM +0900, Simon Horman wrote: 
> > > > On Wed, Jul 21, 2010 at 01:41:09AM -0600, Tim Serong wrote: 
> > > > > Hi All, 
> > > > >  
> > > > > A while ago (April, from memory), there was an ABI change in 
> > > > > clplumbing in cluster-glue.  Presumably this went mostly unnoticed 
> > > > > in general usage, however I have twice seen systems where the cluster 
> > > > > could not run because of a missing (or incorrect) libglue2 package. 
> > > > > One was my development system, with a dodgy build, the other was 
> > > > > mentioned on #linux-ha yesterday, and was the result of ignoring a 
> > > > > conflict error when installing the pacemaker RPM on openSUSE.  So, 
> > > > > let me be clear, this is not something anyone should need to worry 
> > > > > about...  But I thought I'd mention it here, because the error 
> > > > > messages you get are, IMO, not very obvious. 
> > > > >  
> > > > > Symptoms of a mismatched pacemaker/libglue build are errors like: 
> > > > >  
> > > > >   lrmd: [3004]: ERROR: 
> > > > >     main: can not create wait connection for command. 
> > > > >   lrmd: [3004]: ERROR: 
> > > > >     Startup aborted (can't create comm channel).  Shutting down. 
> > > > >   ... 
> > > > >   pengine: [4011]: ERROR: 
> > > > >     init_client_ipc_comms_nodispatch: Could not access channel on: 
> > > > >     /var/run/crm/pengine 
> > > > >   corosync[4000]: [pcmk  ] ERROR: 
> > > > >     pcmk_wait_dispatch: Child process pengine exited (pid=4011, rc=1) 
> > > > >   corosync[4000]: [pcmk  ] notice: 
> > > > >     pcmk_wait_dispatch: Respawning failed child process: pengine 
> > > > >  
> > > > > If your cluster won't start and you see this in /var/log/messages, 
> > > > > make sure libglue2 is up to date.  And now that I've mentioned this 
> > > > > here and it's made it to the mailing list archive, Google will know, 
> > > > > and nobody else will ever have this problem again. 
> > > > >  
> > > > > This has been a public service announcement.  Thank you for reading. 
> > > >  
> > > > Could we get the .so bumped accordingly in the next release of 
> > > > cluster glue? That would at least help in managing the problem 
> > > > once the new release has been made. 
> > >  
> > > I don't think that that is necessary. The ABI change in the 
> > > _released_ cluster-glue packages was done in such a way as not to 
> > > disturb the existing pacemaker installations, i.e. by adding 
> > > fields to the end of the struct. Further, the library version has 
> > > been bumped to 3:0:1 (with libtool's -version-info) at the time. 
> > > For whatever reason that translates to so.2.1.0. Users of the new 
> > > ABI are also using domain sockets of the new type if they want 
> > > the new functionality. 
> > >  
> > > I guess that what Tim was seeing was Pacemaker built against the 
> > > unreleased glue versions which did have different ABI, i.e. the 
> > > fields were inserted somewhere in the middle of the struct. 
> >  
> > Ok, so no ABI incompatibility was introduced in 1.0.6. Great! 
> > I will go ahead and close the related Debian bugs, 
> > #593319, #593321, #593322 and #593323. 
> 
> I was seeing Pacemaker *built* against new glue, installed on a system
> that had *old* glue installed, because both libglue2 (new glue) and
> libheartbeat2 < 3.0 (old glue) provide what looks like the same DSO;
> so when Pacemaker was upgraded on this system, libheartbeat2 was not
> automatically upgraded to libglue2.  For reference, there's an
> openSUSE 11.3 bug for this:
> 
>   https://bugzilla.novell.com/show_bug.cgi?id=628243
> 
> I believe this may only be a problem on openSUSE 11.3, where heartbeat
> 2.99.3 still exists, providing old libheartbeat2.
> 
> It shouldn't be a problem the other way around (i.e. old Pacemaker is
> meant to work with new glue, as Dejan said).

Understood.
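
To check my understanding of the distinction Dejan draws above, here
is a minimal sketch in Python using ctypes. The struct and its field
names are made up for illustration and are not the real clplumbing
layout. It compares the offsets an old binary expects with those of a
struct that had a field appended, and one that had a field inserted
in the middle.

```python
import ctypes

# Hypothetical fields, NOT the real clplumbing struct.
OLD_FIELDS = [
    ("status", ctypes.c_int),
    ("pid", ctypes.c_int),
    ("private", ctypes.c_void_p),
]


class Old(ctypes.Structure):
    """Layout that existing binaries were compiled against."""
    _fields_ = list(OLD_FIELDS)


class Appended(ctypes.Structure):
    """New field added at the end of the struct."""
    _fields_ = list(OLD_FIELDS) + [("new_flag", ctypes.c_int)]


class Inserted(ctypes.Structure):
    """New field added in the middle of the struct."""
    _fields_ = [OLD_FIELDS[0], ("new_flag", ctypes.c_int)] + OLD_FIELDS[1:]


def offsets(struct, names):
    """Return {field name: byte offset} for the given fields."""
    return {name: getattr(struct, name).offset for name in names}


if __name__ == "__main__":
    old_names = [name for name, _ in OLD_FIELDS]
    for struct in (Old, Appended, Inserted):
        print(struct.__name__, offsets(struct, old_names))
```

The old fields keep their offsets in the appended variant, so code
built against the old layout still reads the right bytes. In the
inserted variant every field after the new one moves, so old and new
code disagree about where those fields live.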

Was the new glue that you built Pacemaker against a released version
or an hg snapshot?
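
On Dejan's remark that -version-info 3:0:1 ends up as so.2.1.0: as
far as I understand libtool's naming scheme on Linux, the argument is
current:revision:age, and the file suffix is built as
(current - age).age.revision. A small sketch of that arithmetic
follows. The function names are my own, and other platforms map the
numbers differently.

```python
def parse_version_info(version_info):
    """Split libtool's current:revision:age triple into integers."""
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not be greater than current")
    return current, revision, age


def linux_so_suffix(version_info):
    """Suffix after '.so.' under the Linux scheme assumed above."""
    current, revision, age = parse_version_info(version_info)
    return f"{current - age}.{age}.{revision}"


def linux_soname_major(version_info):
    """Major number in the soname, i.e. the oldest interface supported."""
    current, _revision, age = parse_version_info(version_info)
    return current - age


if __name__ == "__main__":
    print(linux_so_suffix("3:0:1"))
    print(linux_soname_major("3:0:1"))
```

With age 1 the library says it still supports interface 2 as well as
3, so the soname major stays at 2, and binaries linked against the
previous library keep resolving to it.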

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
