Again, create a bug on oss.oracle.com/bugzilla and upload
the messages files from both nodes. It is hard to state anything
with incomplete information.

[EMAIL PROTECTED] wrote:
I decided to rebuild this from scratch today and got the same result.

Two-node cluster; both boxes remain connected to the shared storage
throughout the tests.

I unplug network connection from node0 and get e1000 driver "Tx Unit Hang"
messages on node0 console
node1 console displays "o2net_idle_timer:1309 here are some times to help
debug the situation" followed by additional output
node1 sits for a while and eventually displays "o2quo_make_decision:143
error: fencing this node because it is connected to a half-quorum of one of
two nodes which doesn't include the lowest active node 0"
node 0 replays node 1's journal, too bad it still isn't on the network

This is in node 1's /var/log/messages after the reboot:

Nov 14 23:55:56 FTP02 kernel: o2net: connection to node FTP01.mydomain.net (num 0) at 10.xxx.0.45:7777 has been idle for 10 seconds, shutting it down.
Nov 14 23:55:56 FTP02 kernel: (0,0):o2net_idle_timer:1309 here are some times that might help debug the situation: (tmr 1163570146.656474 now 1163570156.655334 dr 1163570146.656446 adv 1163570146.656476:1163570146.656478 func (3a33f0f8:505) 1163570057.403947:1163570057.403950)
Nov 14 23:55:56 FTP02 kernel: o2net: no longer connected to node FTP01.mydomain.net (num 0) at 10.xxx.0.45:7777
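
As a quick sanity check on the o2net_idle_timer line, the gap between its
"tmr" and "now" values can be computed directly. This is only a sketch: the
variable names are made up for illustration, and it makes no claim about
what each field means inside o2net beyond the arithmetic itself. Decimal is
used so the microsecond digits are not lost to floating-point rounding.

```python
from decimal import Decimal

# Values copied from the o2net_idle_timer log line above.
tmr = Decimal("1163570146.656474")
now = Decimal("1163570156.655334")
# Second timestamp of the "func" pair in the same line.
func_end = Decimal("1163570057.403950")

idle_gap = now - tmr          # time between "tmr" and "now"
since_func = now - func_end   # time between the "func" stamp and "now"

print(f"now - tmr  = {idle_gap} s")
print(f"now - func = {since_func} s")
```

The first difference comes out just under 10 seconds, which is consistent
with the "has been idle for 10 seconds" message logged at the same time.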

I'm confused by this.  Shouldn't node 0 have eventually rebooted since it
lost network connectivity and node 1 replayed node 0's journal and kept
going?  As it is right now we are left with no IP reachable box.

If I do this same test but unplug node 1 instead of node 0, it works as it
should: node 1 will fence and node 0 will replay the journal and stay
online.

Any input is greatly appreciated.

Thanks,

Colin Farley
Network Administrator
E-Care Contact Center Services
Phone:(204) 940-6244
Fax:(204) 940-7394


Sunil Mushran <[EMAIL PROTECTED] acle.com>
To: [EMAIL PROTECTED]
cc: [email protected]
Date: 11/13/2006 08:23 PM
Subject: Re: [Ocfs2-users] ESX and Unbreakable 2.0 OCFS2 problem


Considering o2net only cares whether it is connected to the other node
or not, it should not make a difference whether one unplugs node 0 or
node 1. The result should be the same. Node 1 should fence in both cases.
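
The tie-break named in the o2quo message quoted earlier in this thread
("a half-quorum ... which doesn't include the lowest active node") can be
sketched roughly as follows. This is a simplified illustration in Python,
not the kernel code: the function name and its two parameters are
hypothetical, and both sets are assumed to include the deciding node
itself.

```python
def should_fence(heartbeating, connected):
    """Decide whether the local node should fence itself.

    heartbeating: node numbers currently heartbeating (assumed to include self)
    connected:    node numbers reachable over the network (assumed to include self)
    """
    lowest = min(heartbeating)
    total = len(heartbeating)
    reachable = len(connected & heartbeating)

    if total % 2 == 1:
        # Odd node count: fence unless we can reach a strict majority.
        return reachable < (total + 1) // 2

    half = total // 2
    if reachable < half:
        return True
    # Even split: only the half holding the lowest node number survives.
    return reachable == half and lowest not in connected


# Two-node cluster with the interconnect cut: each node reaches only itself.
print(should_fence({0, 1}, {0}))  # node 0's view
print(should_fence({0, 1}, {1}))  # node 1's view
```

Under these assumptions node 0 keeps running and node 1 fences, whichever
cable was pulled, because neither node can tell which side lost the link.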

Do you see messages indicating that the node(s) have lost connectivity?
If so, could you share them?

It would be easiest if you could file a bug on oss.oracle.com/bugzilla
with the messages file and a listing of the course of events... as in,
unplugged cable on node 0 at time x, etc.

[EMAIL PROTECTED] wrote:
I'm testing a 2 node cluster in a VMWare ESX environment for use as a
high availability FTP server to support a CRM application.  Both nodes
run Unbreakable 2.0 x86_64.  They access a 300GB OCFS2 volume on an RDM
LUN on an HP EVA.  All disk connectivity is fine and I haven't seen any
problems there.  The problem comes when doing some IP failover testing.
The IP failover is done using UCARP, so to test failover I tried
unplugging one node's virtual network cable to see what happens.

If I unplug node 1 everything is fine, node 1 eventually panics and
reboots while node 0 chugs along fine.  The problem comes when unplugging
node 0.  When node 0 loses network connectivity it does not panic and
eventually node 1 panics and reboots.  Is there a reason why the lower
node does not panic if it loses network connectivity?

Heartbeat thresholds are the same on each node at 31 and both nodes are
set to reboot on panic; node 0 just never panics.  All installed software
is at the versions that come with Unbreakable 2.0.
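
For reference, a heartbeat threshold of 31 can be turned into seconds with
a little arithmetic. The sketch below assumes the relationship commonly
documented for the O2CB_HEARTBEAT_THRESHOLD setting (threshold = timeout in
seconds / 2 + 1, i.e. one disk heartbeat every 2 seconds); the function
name is made up for illustration.

```python
# Assumed: one disk heartbeat iteration every 2 seconds.
HEARTBEAT_INTERVAL_SECS = 2


def threshold_to_timeout(threshold):
    """Convert a heartbeat threshold into a disk heartbeat timeout in seconds."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


print(threshold_to_timeout(31))  # prints 60
```

Under that assumption the setting corresponds to a 60-second disk heartbeat
timeout, which is a separate timer from the 10-second network idle timeout
that shows up in the o2net log messages.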

I didn't do the config on these boxes so the first thing I'm going to do
on Tuesday when I work on this is rebuild both nodes from scratch, but I
figured I would ask first to see if it was an easy question for someone
on the list to answer.

Thanks,

Colin Farley
Network Administrator
E-Care Contact Center Services
Phone:(204) 940-6244
Fax:(204) 940-7394


_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users



