AFAIK:
a. Linux has no forced umount that works for a busy local filesystem (umount -f is meant for unreachable NFS mounts).
b. A node has no way to know whether a local fs such as ext3 is mounted on another node.
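For the check Luis asks about in the quoted message below, the local half is at least possible: a failover script can read the kernel's mount table on the node it runs on. The following is only a minimal sketch under that assumption. The function names and the sample paths are hypothetical, and it says nothing about whether the filesystem is mounted on the other node.

```python
import re


def _unescape(field):
    # /proc/mounts writes space, tab, newline and backslash as octal
    # escapes such as \040, so decode them before comparing paths.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_mounts(text):
    """Return (device, mount_point, fs_type) tuples from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], _unescape(fields[1]), fields[2]))
    return entries


def is_mounted(target, mounts_text):
    """True if target equals a device or a mount point in the given mount table."""
    return any(target in (dev, mnt) for dev, mnt, _ in parse_mounts(mounts_text))


def read_local_mounts(path="/proc/mounts"):
    # Reads the mount table of the local node only.
    with open(path) as handle:
        return handle.read()
```

A script would call is_mounted("/u01/oradata", read_local_mounts()) before attempting the mount, where /u01/oradata stands in for the real mount point. The comparison is an exact string match, so a device given as a symlink (for example under /dev/disk/by-id) will not match the name the kernel reports.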
Luis Freitas wrote:
> Alexandra,
>
> You could use only CRS and ext3 instead of ocfs2 for this kind of
> use. You would need to register a script to force umount the
> filesystem on the primary node and mount it on the node you are
> failing over to. (It would be nice to be able to check whether the
> filesystem is mounted before attempting to mount it, but I am not
> sure how to do this.)
>
> Are you using a cross-over cable for the private interconnect?
>
> Regards,
> Luis
>
> --- On *Fri, 6/27/08, [EMAIL PROTECTED] /<[EMAIL PROTECTED]>/* wrote:
>
> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> Subject: [Ocfs2-users] CRS/CSS and OCFS2
> To: [email protected]
> Date: Friday, June 27, 2008, 10:41 AM
>
> Hello,
>
> I am writing in the hope that you can help me with my problem. We
> have an issue here and opened an SR at Metalink, but so far we have
> received no useful information toward solving it. The SR number is
> 6855815.994...
>
> We want to protect 9i single-instance databases with 10g
> Clusterware, following the third-party-tool approach. There are no
> RAC databases involved, but we want to achieve high availability
> because the databases are business-critical systems. We want the
> systems to be able to relocate to another machine in case of
> failure, to keep downtime low. To achieve this we want to use OCFS2
> for the filesystem. Relocation is done by a script with the help of
> CRS.
>
> So we took two systems (byaz05 and byaz10) and installed the
> following software: 10g CRS (10.2.0.3), Oracle Software 9.2.0.8
> and OCFS2 1.2.8.
>
> We found the following Metalink notes and adjusted the heartbeat
> and timeouts for OCFS2:
>
>   Metalink Note 395878.1: Heartbeat/Voting/Quorum Related Timeout
>     Configuration for Linux, OCFS2, RAC Stack to avoid unnecessary
>     node fencing, panic and reboot
>   Metalink Note 391771.1: OCFS2 - FREQUENTLY ASKED QUESTIONS
>     (in particular the section on fencing and quorum)
>   Metalink Note 434255.1: Common reasons for OCFS2 Kernel Panic or
>     Reboot Issues
>   Metalink Note 457423.1: OCFS2 Fencing, Network, and Disk Heartbeat
>     Timeout Configuration
>
> We have not changed the CRS/CSS default settings so far.
>
> During HA testing we observed unexpected behaviour of the system.
> We deactivated the bond for the private interconnect and expected
> only one node to go down, but both nodes went down. It seems to me
> that one node was rebooted by OCFS2 and the other one by CRS/CSS.
>
> Timestamp
> ----------------------------------------------------------------------
>
> 10:21:06 bond1 disabled (eth1)
> */var/log/messages byaz05*
> Apr 25 10:21:06 byaz05 kernel: bonding: bond1: link status definitely down for interface eth1, disabling it
> Apr 25 10:21:06 byaz05 kernel: bonding: bond1: making interface eth5 the new active one.
>
> 10:21:09 bond1 disabled (eth5)
> */var/log/messages byaz05*
> Apr 25 10:21:09 byaz05 kernel: bonding: bond1: link status definitely down for interface eth5, disabling it
> Apr 25 10:21:09 byaz05 kernel: bonding: bond1: now running without any active interface !
>
> 10:21:23 o2net - no longer connected
> */var/log/messages byaz05*
> Apr 25 10:21:23 byaz05 kernel: o2net: no longer connected to node byaz10.bayer-ag.com (num 1) at 10.190.59.6:7777
> */var/log/messages byaz10*
> Apr 25 10:21:23 byaz10 kernel: o2net: no longer connected to node byaz05.bayer-ag.com (num 0) at 10.190.59.5:7777
>
> 10:21:27 CSSD failure 134
> 10:21:29 Reboot initiated by CRS
> */var/log/messages byaz05*
> Apr 25 10:21:27 byaz05 logger: Oracle clsomon failed with fatal status 12.
> Apr 25 10:21:27 byaz05 logger: Oracle CSSD failure 134.
> Apr 25 10:21:27 byaz05 su(pam_unix)[25839]: session closed for user oracle
> Apr 25 10:21:27 byaz05 logger: Oracle CRS failure. Rebooting for cluster integrity.
> Apr 25 10:21:27 byaz05 kernel: md: stopping all md devices.
> Apr 25 10:21:27 byaz05 kernel: md: md0 switched to read-only mode.
> Apr 25 10:21:29 byaz05 logger: Oracle CRS failure. Rebooting for cluster integrity.
> Apr 25 10:21:29 byaz05 kernel: e1000: eth2: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
> Apr 25 10:21:29 byaz05 logger: Oracle init script ceding reboot to sibling 27383.
>
> 10:21:58 Reboot initiated by OCFS2(?)
> */var/log/messages byaz10*
> Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session opened for user oracle by (uid=0)
> Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session closed for user oracle
> Apr 25 10:25:58 byaz10 syslogd 1.4.1: restart.
> Apr 25 10:25:58 byaz10 syslog: syslogd startup succeeded
> Apr 25 10:25:58 byaz10 kernel: klogd 1.4.1, log source = /proc/kmsg started.
> Apr 25 10:25:58 byaz10 kernel: Bootdata ok (command line is ro root=/dev/vgroot/_)
>
> We have assumed all along that this is a timing problem, but we
> don't know which settings cause it or which steps to take to
> correct them. Otherwise we will have to rework the complete concept
> for the business-critical systems.
> Can anyone help me?
>
> Regards,
> Alexandra
>
> Best Regards
>
> Alexandra Strauss
> _________________________________________
>
> Opitz Consulting
> Web: http://www.BayerBBS.com
>
> Management: Chairman Andreas Resch | Labor Director Norbert Fieseler
> Chairman of the Supervisory Board: Klaus Kühn
> Registered office: Leverkusen | Local court Cologne (Amtsgericht Köln), HRB 49895
>
> _______________________________________________
> Ocfs2-users mailing list
> [email protected]
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
