On Fri, Sep 05, 2008, Ante Karamatic wrote:
> On Fri, 5 Sep 2008 12:51:42 +0200
> Chris Joelly <[EMAIL PROTECTED]> wrote:
>
> Moving services isn't an issue here (you could remove all services from
> a node with /etc/init.d/rgmanager stop). This problem is related to
> cluster membership. I don't know exactly where the problem is (I'm
> just a user, not a developer :).
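
Side note, mostly for my own records: if I understand the tools correctly, a
single service can also be moved or disabled with clusvcadm instead of
stopping rgmanager completely. The service and node names here are only
placeholders:

  clusvcadm -r web -m node2    # relocate service "web" to the other node
  clusvcadm -d web             # or disable the service entirely
  /etc/init.d/rgmanager stop   # removes all services from this node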
Unfortunately there are no developers out there reading our posts :-) I posted
to the linux-cluster list too, but no recommendations yet. I'm very
enthusiastic about tracking down problems, but I'm mainly used to tracking
down Java-related problems, as that's my main occupation ;-)

> I'll repeat once more, having only two nodes in a cluster is the worst
> possible scenario for RHCS.

But then you have to use some other shared storage; DRBD won't work with more
than 2 nodes, and that's too expensive for this project ...

> I wouldn't use it on a two-node cluster if I really didn't have to (but I
> do in one case), but it's far from useless. It's great :) The same
> problem exists on all distributions (FWIW my crappy two-node cluster is
> on RedHat and all others are on Ubuntu).

Does this mean it's better to switch to heartbeat-managed services in an
active/passive manner, at least for a two-node setup?

> Since RHCS isn't aware of DRBD, you can't really rely on it to handle the
> GFS mount. This is why I don't manage GFS mounts with RHCS. I rather
> mount GFS on both machines and then let the services read it when they
> need to. For example:
>
> If I have two apache nodes, then I mount /var/www as GFS on both
> (underneath this GFS is a DRBD device with both nodes in
> primary-primary). As soon as the first node dies, the service is started
> on the other node. RHCS doesn't manage my /var/www mount.

Ok. So you define the services as "no autostart" in cluster.conf so that you
are able to bring up the underlying DRBD/cLVM/GFS stack first.

As I understand it: if the node a service runs on fails, the failover node
(in a 2-node scenario) fences the failed node and takes over its service.
Then, when the failed node recovers, either from a reboot or from manual
intervention, you first bring up the GFS mounts and then move the service
back to the re-joined node? Sounds reasonable ... ;) and it avoids the need
for rgmanager to check whether a 'shared' resource (GFS in this case) is
already activated by another service on the same node ...

But one thing is still open, at least in my head: how do I safely remove one
node from a running cluster so that the services on the remaining node keep
running?

Chris
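
P.S.: Just to write down the recovery sequence I have in mind, so somebody can
correct me if it is wrong. The DRBD resource, volume, service and node names
are only examples, and the init script names may differ between distributions:

  # on the re-joined node, once it is back in the cluster (cman is up):
  drbdadm primary r0                  # make the DRBD resource primary again (primary/primary setup)
  /etc/init.d/clvmd start             # clustered LVM, so the logical volume shows up
  mount -t gfs /dev/vg0/www /var/www  # re-mount the shared GFS filesystem
  /etc/init.d/rgmanager start         # re-join rgmanager
  clusvcadm -r web -m node1           # and finally move the service back to this node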
