libvirt on KVM

Lars Ellenberg Fri, 24 Feb 2012 02:51:51 -0800

On Fri, Feb 24, 2012 at 10:26:42AM +0100, Andreas Kurz wrote:
>On Thu, Feb 23, 2012 at 08:59:44AM -0500, Tom Hanstra wrote:
>>On Wed, Feb 22, 2012 at 11:48:17PM +0100, Andreas Kurz wrote:
>>> > But is corosync better than heartbeat?  Or am I getting into a religious 
>>> > war by asking that?
>>> 
>>> Since heartbeat is not actively developed any more, corosync is the way
>>> to go for a future proof setup.


>> Hmmm, this is something which I did not understand when starting to look 
>> into this.  If this is the case, it would be nice if the web pages were 
>> updated accordingly.

> You mean linux-ha.com? ... yeah that might be true. But looking at
> clusterlabs.org makes it quite clear, that corosync is the way to go for
> new setups ... there are also some nice faqs:
> 
> http://clusterlabs.org/wiki/FAQ


My current take on heartbeat vs corosync, even with the
heartbeat "steward and maintainer" hat on:

What is relevant for Pacemaker clusters:

Heartbeat
  * Heartbeat is no longer actively developed,
    and it does not look like that will change.
  
  * Heartbeat *is* maintained, and will stay maintained for
    the foreseeable future.
  
  * please use 3.0.5 (current mercurial)
    or you may get funny "busy loops" if you experience packet
    loss on the link.
  
  * is limited in message size; the cib can grow quite large.
    Once even the bz2 compressed cib (including the status section)
    grows beyond the payload of a single UDP packet, your
    Pacemaker on Heartbeat will break horribly.
    - That could be overcome, but that would mean development.
      And someone would need to do that.
      There would need to be a good motivation to do that...
      I don't see it happening soon, or at all.
  
  * I know of a glib callback priority inversion,
    where heartbeat, in presence of packet loss,
    may not recognize a "node dead" event, because it is too busy
    requesting retransmits of the last lost packets...
    I think I have that fixed, but it is not yet in the repo.
    Should be "soon", and released as 3.0.6
  
  * It has a strange behaviour if you ifconfig down an interface,
    then ifconfig up it again (I'm not talking about unplugging
    or switch down or anything, but really about setting the link
    as down in linux).  It may take ~20 minutes to be able to
    really use that interface again.  I know why that is, I may
    fix that too, but the short story is "don't do that".
  
  * the membership algorithm (cluster consensus membership)
    is somewhat "ad hoc-ish", but very robust.
  
  * Pacemaker on Heartbeat handles "cluster partition merge"
    as well as it gets, or "as expected".
  
  * Heartbeat supports TIPC and other "exotic" protocols,
    for those interested in running Pacemaker on a TIPC stack.
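
To make the message size point above concrete, here is a minimal
sketch (Python) that checks whether a bz2 compressed cib would still
fit into a single UDP datagram.  The 65507 byte figure is the
theoretical maximum UDP payload over IPv4 and is my assumption for
illustration; the usable limit inside heartbeat is lower because of
its own framing.  The function names are made up for this example.

```python
import bz2

# Assumption: largest payload of one UDP datagram over IPv4
# (65535 minus 20 bytes IP header minus 8 bytes UDP header).
# Heartbeat adds its own framing, so the real limit is smaller.
MAX_UDP_PAYLOAD = 65507


def compressed_cib_size(cib_xml: bytes) -> int:
    """Return the size in bytes of the bz2 compressed cib."""
    return len(bz2.compress(cib_xml))


def fits_in_one_datagram(cib_xml: bytes, limit: int = MAX_UDP_PAYLOAD) -> bool:
    """True if the compressed cib is no larger than the given limit."""
    return compressed_cib_size(cib_xml) <= limit
```

One way to get the input would be the output of "cibadmin -Q" on a
cluster node, read as bytes.  Pass a smaller limit to leave some
headroom for the status section growing over time.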

Corosync
  * Corosync is actively developed,
    and in some cases can be a "moving target".
  * has improved *a lot* in stability and features since 1.2/1.3.
    Current 1.4 is good.
  * The algorithm used is well understood and documented.
    I think it is much more sensitive to latency, packet loss or timeouts,
    so you had better make sure your network matches the requirements
    even under heavy load and memory pressure. And configure your
    timeouts on the conservative side.
  * 2.0 will of course bring "all new and improved bugs",
    that is the nature of development.  But given all the legs behind
    it, whatever issues may crop up, they will be fixed very quickly.
  * does not suffer from the single UDP message size limit; I think
    the message size limit is ~1 MByte (vs <= 64k in heartbeat)

  * is required for DLM/cLVM/cluster file systems and the like  
  
  * At this time, the only "very ugly behaviour" I know of with
    Pacemaker on Corosync is on "cluster partition merge":

    If nodes are declared dead but not fenced (because fencing is not
    enabled, or not working, or not working fast enough),
    and then see each other again, corosync and pacemaker do
    not agree on membership, and the cluster does not recover.
  
    There is also at least one bugzilla on that:
    https://bugzilla.redhat.com/show_bug.cgi?id=752477
  
    It is unclear (to me) if this is a shortcoming in the
    implementation of cluster partition merge in corosync, a bug
    in the pacemaker <-> corosync interaction, or both.

    I've seen a few commits in pacemaker lately that may be related.
  
    Pacemaker on heartbeat behaves as expected in the same situation.
  
    As long as you have tested and working fencing implemented,
    that should not affect you at all.


Portability:
I don't know about the portability of corosync. I do know that heartbeat
is (or used to be) portable to just about any unix-like thing out there.
That's probably only relevant to very few people, though.

My conclusion:
    Those building small clusters (handful of resources, small number of
    nodes), no cluster file system, no DLM involved,
    and trying to get away without stonith: please use heartbeat.
    Unless the mentioned behaviour is fixed meanwhile...

    Everyone else, **with tested and working stonith**,
    for new deployments: use corosync >= 1.4.2
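
The conclusion above, written down as a small decision function
(Python).  This is only a sketch of the reasoning, not a tool; the
function name and parameters are made up for illustration.  The last
branch (no tested stonith, but a big cluster or DLM needed) is not
spelled out above and is my reading of it: corosync is required for
DLM, so fix stonith first.

```python
def recommend_stack(small_cluster: bool,
                    needs_dlm_or_cluster_fs: bool,
                    tested_stonith: bool) -> str:
    """Suggest a messaging layer for a new Pacemaker deployment."""
    if tested_stonith:
        # Tested and working stonith: the partition merge issue
        # should not affect you.
        return "corosync >= 1.4.2"
    if small_cluster and not needs_dlm_or_cluster_fs:
        # Small cluster, no DLM, trying to get away without stonith.
        return "heartbeat"
    # Assumption: this case is not covered explicitly above.
    return "corosync >= 1.4.2, after getting stonith tested and working"
```

For example, recommend_stack(True, False, False) gives "heartbeat",
and any call with tested_stonith=True gives "corosync >= 1.4.2".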

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] stonith/fence using external/libvirt on KVM
