Luke Pascoe wrote:
Err, false alarm.

I've discovered it's not HA rebooting the box, but ocfs2 instead. :(
That is more likely. Is ocfs2 started when heartbeat starts on NODE02? I think not, because you would need HA v2 to do that, no? If you use HA v2, you will need constraints to migrate the resources, and you will need to write those constraints carefully to get a normal start. See: http://www.linux-ha.org/v2/Concepts/MultiState


Now I've gotta figure out what's going on with that.

On the off chance, has anyone here done HA with OCFS2? Any gotchas?
Sorry, I haven't implemented anything with OCFS2, only a few setups with DRBD.

Regards

Luke Pascoe
Linux Systems Engineer
Asterisk

T 09 366 8835
F 09 302 1772
M 0274 266649
E [EMAIL PROTECTED]
W www.asterisk.co.nz

Level 9, Gen-i Tower, 66 Wyndham Street
PO Box 8804, Auckland, New Zealand


Luke Pascoe wrote:
 > How do you induce a failover?

Right now just the simplest kind, a network failure. I'm simply disconnecting the LAN interface on one of the hosts.

It doesn't matter which host I disconnect, 02 reboots.

 > Anything in the logs? Perhaps post them.

Nothing obvious. Attached are the ha-log and ha-debug files from NODE01 and NODE02. NODE02 is primary and has the resources. I disconnect eth0 on NODE01, NODE02 acquires its resources (there are none) and then promptly reboots.

:(

Regards

Luke Pascoe


Dejan Muhamedagic wrote:
Hi,

On Tue, Feb 26, 2008 at 04:35:06PM +1300, Luke Pascoe wrote:
Hello

I'm trying to do NFS failover in a test environment with an underlying OCFS2 filesystem. This is something that's apparently been done before and certainly HA NFS isn't new.

I've followed several HowTos, all of which seem to suggest pretty much the same setup, but I seem to get the same problem no matter how I configure it.

Here's the setup:

2 VMWare VMs (NODE01 and NODE02) running RHEL4 U5 x86_64 with a shared fibre channel SAN volume. That volume is OCFS2 formatted and mounted as /data on both hosts.

Each host has 2 interfaces: an external-facing one on 10.0.0.0/24 and an internal one on 192.168.100.0/30.

Heartbeat is installed on both nodes and configured identically as follows:

=ha.cf=
keepalive 2
deadtime 30
warntime 10
initdead 120
bcast eth0
ucast eth1 192.168.100.1 # Obviously on NODE02 this is 192.168.100.2
auto_failback on
node NODE02 NODE01
ping 10.0.0.133 # this is an unrelated LAN host
respawn hacluster /usr/lib64/heartbeat/ipfail
use_logd yes
crm off
=/ha.cf=
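
For illustration only, here is a minimal Python sketch of how the timing directives in the ha.cf above relate to one another. It is not part of Heartbeat: the function names (parse_ha_cf, timing, check_timing) are hypothetical, and it assumes every timing value is a plain integer number of seconds, as in this config. The one rule it borrows from the ha.cf documentation is that initdead should be at least twice deadtime.

```python
# Hypothetical helper, not shipped with Heartbeat: parse ha.cf-style text
# and sanity-check the timing directives. Values are assumed to be whole
# seconds (no "ms" suffixes), as in the config quoted above.

HA_CF = """\
keepalive 2
deadtime 30
warntime 10
initdead 120
bcast eth0
ucast eth1 192.168.100.1 # Obviously on NODE02 this is 192.168.100.2
auto_failback on
node NODE02 NODE01
ping 10.0.0.133 # this is an unrelated LAN host
respawn hacluster /usr/lib64/heartbeat/ipfail
use_logd yes
crm off
"""


def parse_ha_cf(text):
    """Map each directive name to a list of its argument lists."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, *args = line.split()
        directives.setdefault(key, []).append(args)
    return directives


def timing(directives):
    """Extract the four timing values as integers (seconds assumed)."""
    names = ("keepalive", "warntime", "deadtime", "initdead")
    return {name: int(directives[name][0][0]) for name in names}


def check_timing(t):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if not t["keepalive"] < t["warntime"] < t["deadtime"]:
        problems.append("expected keepalive < warntime < deadtime")
    if t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead should be at least twice deadtime")
    return problems


cfg = parse_ha_cf(HA_CF)
t = timing(cfg)
print(t)
print(check_timing(t) or "timing values look consistent")
```

With the values above the check passes, so the timing directives themselves are at least internally consistent.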

=haresources=
NODE01 10.199.133.90 nfslock nfs_wrapper
=/haresources=
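
As a rough sketch of how a Heartbeat v1 haresources line is read: the first field is the preferred node, a bare IP address is shorthand for the IPaddr resource, and the remaining fields are resource scripts started left to right (and stopped in reverse). The Python below only illustrates that reading; parse_haresources_line is a hypothetical name, not a Heartbeat API.

```python
import ipaddress


def parse_haresources_line(line):
    """Split one haresources line into (preferred_node, [(resource, args), ...]).

    Illustrative only: resources keep their left-to-right start order, and
    "name::arg1::arg2" is split into the script name and its arguments.
    """
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for res in resources:
        name, *args = res.split("::")
        try:
            # A bare address (optionally followed by /netmask...) is
            # shorthand for the IPaddr resource.
            ipaddress.ip_address(name.split("/")[0])
            parsed.append(("IPaddr", [name] + args))
        except ValueError:
            parsed.append((name, args))
    return node, parsed


node, resources = parse_haresources_line("NODE01 10.199.133.90 nfslock nfs_wrapper")
print(node)
for name, args in resources:
    print(name, args)
```

Read this way, the line above defines three resources for NODE01: the service IP, nfslock, and nfs_wrapper.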

BTW, crm is off because I tried it with it on and got EXACTLY the same result.

Here's the problem:

Both hosts start up just fine, NODE01 picks up all 3 resources and everything's roses. If I induce a failure on NODE01, NODE02 correctly acquires the resources and everything is still roses. HOWEVER, ~30 seconds later NODE02 reboots. Now the odd thing is, it doesn't matter which is the primary node, or which host has the failure or even which has the resources, NODE02 always reboots when there's a failure.

How do you induce a failover?

Even if the resources are started on NODE02 and NODE01 has a failure (ie, everything should stay as it is, no failover required) 30 seconds after the failure NODE02 reboots!!!

I've got NTP syncing the time, so it's not a clock issue, and I've tried twiddling just about every setting in the config, to no avail.

Any help? Please?

Anything in the logs? Perhaps post them.

Thanks,

Dejan

--
Regards

Luke Pascoe
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems