Alexei,
Actually, your log seems to show that CSSD (Oracle CRS) rebooted the node
before OCFS2 got a chance to do it.
On a RAC cluster, if the interconnect is interrupted, all the nodes hang
until split-brain resolution is complete and the recovery of all the crashed
nodes has finished. This is needed because every read of an Oracle data block
needs a ping to the other nodes.
The view of the data must be consistent: when one node reads a particular
data block, the Oracle Database first pings the other nodes to ensure that none
of them holds a modified copy of the block that has not yet been flushed to
disk. Another node may even forward a reply with the block, avoiding the disk
access (Cache Fusion).
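To make the read path concrete, here is a toy sketch in Python. It is only an
illustration of the idea described above, not Oracle's implementation: the
names Node and read_block are made up, and each node's cache is modelled as a
plain dictionary whose entries may be newer than the copy on disk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; cache maps block id -> in-memory block data."""
    name: str
    cache: dict = field(default_factory=dict)


def read_block(peers, disk, block_id):
    """Return (data, source) for block_id, asking the other nodes before disk."""
    # Ping every other node first: one of them may hold a copy that is
    # newer than what is on disk (modified but not yet flushed).
    for peer in peers:
        if block_id in peer.cache:
            # The peer forwards its copy, so no disk access is needed.
            return peer.cache[block_id], "forwarded by " + peer.name
    # No peer holds the block in memory, so the on-disk copy is current.
    return disk[block_id], "disk"


disk = {7: "old contents"}
node_b = Node("b", {7: "new contents"})  # b modified block 7, not flushed yet
node_c = Node("c")                       # c has nothing cached

print(read_block([node_b, node_c], disk, 7))
print(read_block([node_c], disk, 7))
```

The point of the sketch is the failure case: if node b became unreachable, the
reader could not tell whether the disk copy of block 7 is still current.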
When a split brain occurs, the blocks that had not been flushed to disk are
lost, and they are rebuilt using the redo threads of the particular nodes
that crashed. During this interval all the database instances "freeze", since
until the node recovery is complete there is no way to guarantee that a block
read from disk has not been altered on the crashed node.
So the fencing is needed even if there is no disk activity, as the entire
cluster hangs the moment the interconnect goes down. And the timeout for
the fencing must be as small as possible to prevent a long cluster
reconfiguration delay. Of course the timeout must be tuned so that it is larger
than the failover times of Ethernet switches, storage controllers, and disk
multipathing. Or, if possible, the failover times should be reduced.
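As a rough sketch of that tuning rule, the following Python snippet compares a
fencing timeout against a set of failover times. The function names, the
safety margin, and all the numbers are hypothetical and for illustration only;
real failover times have to be measured on the actual switches and storage.

```python
def components_exceeding_timeout(timeout_s, failover_times_s, margin_s=0.0):
    """Return the components whose failover (plus margin) is not shorter
    than the fencing timeout, i.e. those that would trigger a false fence."""
    return [name for name, failover_s in failover_times_s.items()
            if failover_s + margin_s >= timeout_s]


def minimum_safe_timeout(failover_times_s, margin_s=5.0):
    """Smallest timeout that still outlasts the slowest failover plus margin."""
    return max(failover_times_s.values()) + margin_s


# Hypothetical measured failover times, in seconds.
failovers = {"switch trunk": 30.0, "disk multipath": 20.0}

print(components_exceeding_timeout(10.0, failovers))
print(minimum_safe_timeout(failovers, margin_s=5.0))
```

With these made-up numbers a 10 second timeout is shorter than both failovers,
so either the timeout has to grow or the failovers have to get faster.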
Now, on the other hand, I too am having problems with OCFS2. It seems much
less robust than ASM and the previous version, OCFS, especially under heavy disk
activity. But I do expect these problems to get solved in the near future, as
the 2.4 kernel VM problems were.
Regards,
Luis
Alexei_Roudnev <[EMAIL PROTECTED]> wrote:
Additional info - the node did not have ANY active OCFSv2 operations (OCFSv2
is used for backups only, and from another node only). So, if the system just
SUSPENDED all FS operations and tried to rejoin the cluster, it all could work
(moreover, the connection to the disk system was intact, so it could close the
file system gracefully).
It reveals 3 problems at once:
- single heartbeat link (instead of multiple links)
- timeout too short (ethernet can't guarantee 10 seconds, it can guarantee 1
minute minimum);
- fencing even if system is passive and can remount / reconnect instead of
rebooting.
All we did in the lab was _disconnect one of the trunks between switches for a
few seconds, then insert it back into the socket_. No other application failed
(including heartbeat clusters). The database cluster was not doing anything on
OCFS at the time of the failure (not even backups).
I will try heartbeat between loopback interfaces (and OCFS protocol) next
time (I am just curious whether it can provide 10 seconds for network
reconfiguration).
...
Feb 1 12:19:13 testrac12 kernel: o2net: connection to node testrac11 (num 0)
at 10.254.32.111:7777 has been idle for 10 seconds, shutting it down.
Feb 1 12:19:13 testrac12 kernel: (13,3):o2net_idle_timer:1310 here are some
times that might help debug the situation: (tmr 1170361135.521061 now
1170361145.520476 dr 1170361141.852795 adv 1170361135.521063:1170361135.521064
func (c4378452:505) 1170361067.762941:1170361067.762967)
Feb 1 12:19:13 testrac12 kernel: o2net: no longer connected to node testrac11
(num 0) at 10.254.32.111:7777
Feb 1 12:19:13 testrac12 kernel: (1855,3):dlm_send_remote_convert_request:398
ERROR: status = -107
Feb 1 12:19:13 testrac12 kernel: (1855,3):dlm_wait_for_node_death:371
5AECFF0BBCF74F069A3B8FF79F09FB5A: waiting 5000ms for notification of death of
node 0
Feb 1 12:19:13 testrac12 kernel: (1855,1):dlm_send_remote_convert_request:398
ERROR: status = -107
Feb 1 12:19:13 testrac12 kernel: (1855,1):dlm_wait_for_node_death:371
5AECFF0BBCF74F069A3B8FF79F09FB5A: waiting 5000ms for notification of death of
node 0
Feb 1 12:22:22 testrac12 kernel: (1855,2):dlm_send_remote_convert_request:398
ERROR: status = -107
Feb 1 12:22:22 testrac12 kernel: (1855,2):dlm_wait_for_node_death:371
5AECFF0BBCF74F069A3B8FF79F09FB5A: waiting 5000ms for notification of death of
node 0
Feb 1 12:22:27 testrac12 kernel: (13,3):o2quo_make_decision:144 ERROR: fencing
this node because it is connected to a half-quorum of 1 out of 2 nodes which
doesn't include the lowest active node 0
Feb 1 12:22:27 testrac12 kernel: (13,3):o2hb_stop_all_regions:1889 ERROR:
stopping heartbeat on all active regions.
Feb 1 12:22:27 testrac12 kernel: Kernel panic: ocfs2 is very sorry to be
fencing this system by panicing
Feb 1 12:22:27 testrac12 kernel:
Feb 1 12:22:28 testrac12 su: pam_unix2: session finished for user oracle,
service su
Feb 1 12:22:29 testrac12 logger: Oracle CSSD failure. Rebooting for cluster
integrity.
Feb 1 12:22:32 testrac12 su: pam_unix2: session finished for user oracle,
service su
...
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users