On Mon, Sep 20, 2010 at 5:55 PM, Alain.Moulle <[email protected]> wrote:
> Hi
>
> I have a "philosophic" question about two nodes with FS under OCFS2
> and Pacemaker/corosync for the HA of both nodes.
>
> My choice was to let OCFS2 stack out of Pacemaker configuration,
> so I let the services o2cb and ocfs2 started at boot time.

Terrible idea, sorry. You _really_ don't want two parts of the same cluster using different membership information.

>
> And I only configure some FS of type OCFS2 as clone resources in Pacemaker,
> and some other resources have collocation on these clones FS OCFS2.
>
> It seemed to me that it should work , and that I don't have to set the management
> of OCFS2 "in" Pacemaker , with the pcmk stack instead of the o2cb stack.
>
> And it works fine ... except if I kill one node (by fence, or reboot -f), then I have
> dlm errors on remaining nodes, and some clone FS OCFS2 become failed
> and of course collocated resources are stopped.
> Errors are likewise :
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -112
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_purge_lockres:205 ERROR: status = -112
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -107
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_purge_lockres:205 ERROR: status = -107
> 1284997247 2010 Sep 20 17:40:47 node0 kern info kernel ocfs2: Unmounting device (8,112) on (node 1)
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2508,3):dlm_do_master_request:1333 ERROR: link to 2 went down!
> 1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2508,3):dlm_get_lock_resource:916 ERROR: status = -107
> 1284997247 2010 Sep 20 17:40:47 node0 syslog err syslog-ng Initiating connection failed, reconnecting; time_reopen='10'
> 1284997267 2010 Sep 20 17:41:07 node0 syslog err syslog-ng Error resolving hostname; host='syslog-server'
> 1284997267 2010 Sep 20 17:41:07 node0 kern err kernel (22095,6):dlm_send_proxy_ast_msg:456 ERROR: status = -107
> 1284997267 2010 Sep 20 17:41:07 node0 kern err kernel (22095,6):dlm_flush_asts:603 ERROR: status = -107
> 1284997267 2010 Sep 20 17:41:07 node0 syslog err syslog-ng Initiating connection failed, reconnecting; time_reopen='10'
> etc.
>
> And in fact, I have this type of errors even /without/ Pacemaker started
> on any node when I also kill one node.
>
> So dlm/ocfs2 errors in syslog seem "normal" , but my clone-fs in
> Pacemaker do not "take them as normal" as some become "Failed" for
> "unknown error" .
>
> So my question is :
> is my configuration expected to work ?
> (and if so, how could I workaround this problem ?)
> or
> is pcmk stack really mandatory when we have ocfs2 and Pacemaker
> together on two nodes ?
>
> Thanks for your responses.
> Alain
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
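
For anyone reading the quoted dlm errors: the "status = -NNN" values look like negated Linux errno codes, which is the usual kernel convention, though I have not checked this against the ocfs2 source. Under that assumption, here is a minimal Python sketch that pulls the function name and status out of such a line and names the code. The table is hardcoded from the Linux errno numbering and only covers the two values seen above; decode_status and LINUX_ERRNO are names I made up for illustration.

```python
import re

# Linux errno numbers for the two statuses seen in the quoted log.
# Hardcoded (assumption: the kernel reports negated errno values) so the
# result does not depend on the platform running this script.
LINUX_ERRNO = {
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    112: ("EHOSTDOWN", "Host is down"),
}

# Matches e.g. "(2191,39):dlm_purge_lockres:205 ERROR: status = -107"
STATUS_RE = re.compile(
    r"\):(?P<func>\w+):(?P<line>\d+) ERROR: status = -(?P<code>\d+)"
)


def decode_status(log_line):
    """Return (function, code, errno name, description), or None if the
    line carries no 'status = -N' field."""
    match = STATUS_RE.search(log_line)
    if match is None:
        return None
    code = int(match.group("code"))
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in table"))
    return match.group("func"), code, name, text


if __name__ == "__main__":
    sample = (
        "1284997247 2010 Sep 20 17:40:47 node0 kern err kernel "
        "(2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -112"
    )
    print(decode_status(sample))
```

Read that way, both codes say the surviving node lost its connection to the killed peer, which would fit the "link to 2 went down!" line in the same log.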
