Also take a look at the heartbeat package at linux-ha.org. It works on Linux, *BSD, and Solaris (people were working on an AIX port, but they apparently dropped it shortly before finishing).

David Lang

On Wed, 15 Sep 2004, Jure Pečar wrote:

Date: Wed, 15 Sep 2004 17:07:20 +0200
From: "Jure Pečar" <[EMAIL PROTECTED]>
To: Paul Dekkers <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Cyrus crashed on redundant platform - need better availability?

On Wed, 15 Sep 2004 13:38:43 +0200
Paul Dekkers <[EMAIL PROTECTED]> wrote:

But I suppose RH's cluster manager takes care of mounting the partitions
and checking them if there are any errors.

Not really, at least not by itself. See http://people.redhat.com/jrfuller/cms/ for detailed documentation of what is included with RH AS 2.1 (it's some $500 extra for AS 3). I had to write some pretty paranoid scripts that take care of assembling the software RAIDs, checking the fs and mounting it, while keeping an eye on the other machine to prevent problems.
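As a rough illustration of the kind of "paranoid" takeover logic described above, here is a minimal Python sketch. It is not the script from this thread: the function name, the peer check, the device /dev/md0, the mount point and the exact commands are all assumptions made for the example. The idea is simply to refuse to touch shared storage while the other machine may still be using it, and to stop at the first step that fails.

```python
import subprocess

# Hypothetical step list: (command, acceptable exit codes).
# Device, mount point and commands are illustrative assumptions.
EXAMPLE_STEPS = [
    (["mdadm", "--assemble", "--scan"], (0,)),
    # fsck exits with 1 when it found and corrected errors, so accept 0 and 1.
    (["fsck", "-p", "/dev/md0"], (0, 1)),
    (["mount", "/dev/md0", "/var/spool/imap"], (0,)),
]


def take_over(steps, peer_is_alive, run=subprocess.call):
    """Run the takeover steps in order and report success.

    peer_is_alive is a caller-supplied function (hypothetical here) that
    returns True if the other node might still have the storage in use.
    run executes one command and returns its exit code.
    """
    if peer_is_alive():
        # Never assemble or mount shared storage under a live peer.
        return False
    for command, ok_codes in steps:
        if run(command) not in ok_codes:
            # Stop immediately; later steps depend on this one.
            return False
    return True
```

A caller would pass its own peer check, for example something that probes a serial or network heartbeat, and only start Cyrus if take_over(...) returns True.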

Of course all this would be much easier with some kind of clustered fs, but
a clustered fs brings a new problem: locking. Almost all I've seen so far have
an external 'locking manager' on a separate box, which adds Ethernet
latency to every lock operation, which I'm sure is very noticeable in
lock-heavy usage patterns such as mail. But this is just my feeling, I haven't
benchmarked any yet :)

Do you think using RH's cluster software is a valuable consideration for
this kind of clustering setup? Using FreeBSD there are not that many
clustering solutions for now, and if it's advisable to at least consider
using RH here (although I have no experience with RH) we can certainly
look at it. (Any idea how fast RH would "recover services"?)

This RH cluster software is nothing fancy; I'm sure equivalents exist for the BSDs. See the documentation link above. Actually it is just Kimberlite (http://oss.missioncriticallinux.com/projects/kimberlite/), sold with RedHat support. "Speed" of recovery is almost completely out of the cluster's control. The only thing that matters to the cluster is what your cyrus init script returns when called with the 'status' parameter. Everything else is up to your init scripts. Of course, if one box dies completely, the other takes over after a configurable time.
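Since the cluster only looks at what the 'status' call returns, a small sketch of such a check may help. This one is written in Python and is an assumption-laden example, not the actual Cyrus init script: the function name and the pid file argument are hypothetical, and the exit codes follow the LSB convention for init script status actions (0 running, 1 dead but pid file exists, 3 not running, 4 unknown), which the cluster software in this thread may or may not interpret the same way.

```python
import os


def status_exit_code(pid_file):
    """Return an LSB-style status code for the daemon named in pid_file."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except OSError:
        return 3  # no readable pid file: treat as not running
    except ValueError:
        return 4  # pid file content is not a number: status unknown
    if pid <= 0:
        return 4  # not a valid process id
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return 1  # pid file exists but the process is gone
    except PermissionError:
        return 0  # process exists but belongs to another user
    return 0
```

An init script wrapper would call this for the master process pid file and exit with the returned value, so that a dead master is reported as a failure and triggers the failover.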


--

Jure Pečar
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


-- There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. -- C.A.R. Hoare
