Re: [Linux-HA] explain the difference between servers?

2010-05-31 Thread Nikita Michalko
Hi mike, it seems to be no HA-problem anymore though, but: On Monday, 31 May 2010 01:29, mike wrote: So I've got ldirector up and running just fine and providing ldap high availability to 2 backend real servers on port 389. Here is the output of netstat on both real servers: tcp

Re: [Linux-HA] Colocation, location, auto-failback=off

2010-05-31 Thread Andrew Beekhof
On Sat, May 29, 2010 at 3:54 AM, Diego Woitasen dieg...@xtech.com.ar wrote: Hi, * I have three nodes: ha1, ha2 and ha3. * Three resources: sfex, xfs_fs, ip. * sfex and xfs_fs are members of a group called xfs_grp. * xfs_grp can run on any node but ip resource can run on ha1 or ha2 only.

Re: [Linux-HA] Problem with migration on a nfs/exportfs setup while copying via rsync

2010-05-31 Thread Dejan Muhamedagic
Hi, On Fri, May 28, 2010 at 02:16:22PM +0200, RaSca wrote: On Fri 28 May 2010 12:34:06 CET, RaSca wrote: [...] Note that the nfs-kernel-server isn't connected to the exportfs, but is only a cloned resource, so it isn't touched by the migration process. [...] Ok Dejan, I've

Re: [Linux-HA] FSCK Error

2010-05-31 Thread Dejan Muhamedagic
Hi, On Thu, May 27, 2010 at 11:59:01AM -0400, Chris May wrote: I have a basic 2 node cluster running Apache / Drbd / . When the Filesystem Resource attempts to start the error that comes through the logs is Couldnt sucessfully fsck filesystem for /dev/mapper/VolGroup-drbd . By the way this

Re: [Linux-HA] explain the difference between servers?

2010-05-31 Thread mike
Nikita Michalko wrote: Hi mike, it seems to be no HA-problem anymore though, but: On Monday, 31 May 2010 01:29, mike wrote: So I've got ldirector up and running just fine and providing ldap high availability to 2 backend real servers on port 389. Here is the output of netstat on

Re: [Linux-HA] Colocation, location, auto-failback=off

2010-05-31 Thread Diego Woitasen
On Mon, May 31, 2010 at 3:58 AM, Andrew Beekhof and...@beekhof.net wrote: On Sat, May 29, 2010 at 3:54 AM, Diego Woitasen dieg...@xtech.com.ar wrote: Hi, * I have three nodes: ha1, ha2 and ha3. * Three resources: sfex, xfs_fs, ip. * sfex and xfs_fs are members of a group called

Re: [Linux-HA] Problem with migration on a nfs/exportfs setup while copying via rsync

2010-05-31 Thread RaSca
On Mon 31 May 2010 13:14:17 CET, Dejan Muhamedagic wrote: [...] My guess is that the timeout you set is too short. Not sure, but I think that somebody mentioned that it takes at least 80 seconds for the nfsd v4 to really stop. Was nfsd being stopped here at all? Thanks, As I

[Linux-HA] Active-Active nfs storage

2010-05-31 Thread RaSca
Hi all, I have a cluster with two nodes configured to mount two drbd, with LVM and filesystem. I need to put each drbd on a different node, for an active-active setup, like a storage, so I have two groups like these: group share-a share-a-ip share-a-LVM share-a-fs group share-b share-b-ip

Re: [Linux-HA] explain the difference between servers?

2010-05-31 Thread mike
mike wrote: Nikita Michalko wrote: Hi mike, it seems to be no HA-problem anymore though, but: On Monday, 31 May 2010 01:29, mike wrote: So I've got ldirector up and running just fine and providing ldap high availability to 2 backend real servers on port 389. Here is

Re: [Linux-ha-dev] Monitoring Process Death

2010-05-31 Thread Lars Ellenberg
On Fri, May 28, 2010 at 04:09:03PM -0700, Bob Schatz wrote: Thanks Lars and Dejan for your feedback. I have started reading the lrmd source. Good old inittab, anyone? One thing I am worried about is that if I give a PID to lrmd, how will lrmd monitor it? My RA is a shell script that

Re: [Linux-ha-dev] Monitoring Process Death

2010-05-31 Thread Lars Marowsky-Bree
On 2010-05-28T16:09:03, Bob Schatz bsch...@yahoo.com wrote: I have started reading the lrmd source. One thing I am worried about is that if I give a PID to lrmd, how will lrmd monitor it? My RA is a shell script that forks off a daemon. If I give this daemon PID to lrmd does lrmd
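
The question of how a forked daemon's PID could be monitored can be illustrated with a minimal sketch. It is not how lrmd works, and nothing in the thread confirms this approach; it only shows a common POSIX liveness check that a monitor action might poll. It assumes Python on a POSIX system, and `pid_is_alive` is a hypothetical helper name.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only).

    Hypothetical helper for illustration; not part of lrmd or any resource agent.
    """
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        # Signal 0 delivers nothing; the kernel only performs the
        # existence and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A check like this only says that some process holds the PID. It cannot tell whether the daemon is healthy, and a recycled PID would give a false positive, so polling it is a weaker guarantee than being notified of the child's exit.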

Re: [Linux-ha-dev] Upstart RA

2010-05-31 Thread Lars Marowsky-Bree
On 2010-05-17T08:40:51, Andrew Beekhof and...@beekhof.net wrote: Exit codes weren't implemented since upstart knows a bit more states than just 'running' or 'not running', i.e. it knows distinction between running, but stopping and running. Which is still no excuse for them not doing exit

Re: [Linux-ha-dev] Monitoring Process Death

2010-05-31 Thread Lars Ellenberg
On Mon, May 31, 2010 at 04:47:43PM +0200, Lars Marowsky-Bree wrote: On 2010-05-31T11:45:37, Lars Ellenberg lars.ellenb...@linbit.com wrote: Use the anything resource agent, and define a monitor action script of your choice? or put a loop in your script, and restart whatever is necessary

Re: [Linux-ha-dev] Monitoring Process Death

2010-05-31 Thread Lars Marowsky-Bree
On 2010-05-31T18:16:30, Lars Ellenberg lars.ellenb...@linbit.com wrote: There are several flavors of overhead. One underestimated is programming and code maintenance overhead ;-) Why would we register pid with lrm and duplicate code from heartbeat proper to lrmd and whatnot, or even rewrite

Re: [Linux-ha-dev] Monitoring Process Death

2010-05-31 Thread Lars Ellenberg
On Mon, May 31, 2010 at 10:48:02PM +0200, Lars Marowsky-Bree wrote: ... You're embedding policy into code here. Who said anything about the proper response being the daemon being restarted being the right response? Maybe the whole point is to initiate full PE-level recovery? Of course you