Upgrade to OCFS2 1.2.9-1 shipping with the latest SLES9 SP4 kernel (2.6.5-7.312). http://download.novell.com/Download?buildid=27kCZ1qWwWo~
You are most likely hitting bug#6680001, as mentioned here:
http://oss.oracle.com/projects/ocfs2/news/article_17.html

Also, you might want to tone down the heartbeat threshold from 10 mins to 1 or 2 mins.
http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#TIMEOUT

If you still have problems, log a bug with Novell. We may need more details.

Sunil

Kuang, Howard [WHQKT] wrote:
> Hi, Sunil or Tao,
>
> I have a 4 nodes OCFS2 cluster running OCFS2 1.2.8 on SuSE 9 SP4. When
> I tried to do failover testing (shutting down one node), the whole
> cluster hung (I can not even login to any server in the cluster). I
> have to bring all of them up and then be able to use the system. What
> kind of behavior is it? Is it the fence of OCFS2? Below is my
> configuration.
>
> aopcer13:~ # /etc/init.d/o2cb status
> Module "configfs": Loaded
> Filesystem "configfs": Mounted
> Module "ocfs2_nodemanager": Loaded
> Module "ocfs2_dlm": Loaded
> Module "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster ERP_UAT_APPS_OCFS2: Online
>   Heartbeat dead threshold: 300
>   Network idle timeout: 60000
>   Network keepalive delay: 2000
>   Network reconnect delay: 2000
> Checking O2CB heartbeat: Active
>
> Please help. If you need more information, please let me know.
>
> Regards,
>
> Howard(Kuiyang) Kuang
> Principal Systems Engineer
> Linux Engineering - WHQKT
> United Airlines
>
> _______________________________________________
> Ocfs2-users mailing list
> [email protected]
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
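
For reference, here is a rough sketch of how the "Heartbeat dead threshold" value relates to the "10 mins" figure mentioned in the reply. It assumes the formula I recall from the OCFS2 1.2 FAQ linked above, timeout in seconds = (threshold - 1) * 2, with a 2-second heartbeat interval. The function names are made up for illustration and are not part of any OCFS2 tool, so please check the result against the FAQ before changing anything.

```python
# Sketch only: assumes the OCFS2 1.2 FAQ formula
#   timeout_seconds = (threshold - 1) * 2
# i.e. one disk heartbeat every 2 seconds. Function names are hypothetical.

HEARTBEAT_INTERVAL_SECS = 2  # assumed heartbeat interval


def threshold_to_seconds(threshold: int) -> int:
    """Seconds a node may miss heartbeats before being declared dead."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


def seconds_to_threshold(seconds: int) -> int:
    """Threshold value that yields roughly the requested timeout."""
    return seconds // HEARTBEAT_INTERVAL_SECS + 1


if __name__ == "__main__":
    # Value from the o2cb status output in the quoted message.
    print(threshold_to_seconds(300))   # 598 seconds, just under 10 minutes
    # Candidate values for the suggested 1 and 2 minute timeouts.
    print(seconds_to_threshold(60))    # 31
    print(seconds_to_threshold(120))   # 61
```

Under that assumption, the configured threshold of 300 works out to 598 seconds, and a 1 or 2 minute timeout would correspond to a threshold of 31 or 61. As far as I know the value is set via O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb and should be the same on every node, but treat that as something to confirm in the FAQ.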
