[Linux-HA] About OCFS2 and Pacemaker

Alain.Moulle Mon, 20 Sep 2010 08:59:54 -0700

Hi

I have a "philosophical" question about a two-node cluster with file
systems under OCFS2 and Pacemaker/corosync providing HA for both nodes.


My choice was to leave the OCFS2 stack out of the Pacemaker configuration,
so I let the o2cb and ocfs2 services start at boot time.

I only configure some file systems of type OCFS2 as clone resources in
Pacemaker, and some other resources are colocated with these OCFS2 FS clones.

It seemed to me that this should work, and that I don't have to put the
management of OCFS2 "in" Pacemaker, with the pcmk stack instead of the
o2cb stack.

And it works fine ... except that if I kill one node (by fencing, or
reboot -f), I get dlm errors on the remaining nodes, some of the OCFS2 FS
clones become failed, and of course the colocated resources are stopped.
The errors look like this:
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -112
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_purge_lockres:205 ERROR: status = -112
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -107
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2191,39):dlm_purge_lockres:205 ERROR: status = -107
1284997247 2010 Sep 20 17:40:47 node0 kern info kernel ocfs2: Unmounting device (8,112) on (node 1)
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2508,3):dlm_do_master_request:1333 ERROR: link to 2 went down!
1284997247 2010 Sep 20 17:40:47 node0 kern err kernel (2508,3):dlm_get_lock_resource:916 ERROR: status = -107
1284997247 2010 Sep 20 17:40:47 node0 syslog err syslog-ng Initiating connection failed, reconnecting; time_reopen='10'
1284997267 2010 Sep 20 17:41:07 node0 syslog err syslog-ng Error resolving hostname; host='syslog-server'
1284997267 2010 Sep 20 17:41:07 node0 kern err kernel (22095,6):dlm_send_proxy_ast_msg:456 ERROR: status = -107
1284997267 2010 Sep 20 17:41:07 node0 kern err kernel (22095,6):dlm_flush_asts:603 ERROR: status = -107
1284997267 2010 Sep 20 17:41:07 node0 syslog err syslog-ng Initiating connection failed, reconnecting; time_reopen='10'
etc.
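
For reference, the negative status values in these dlm messages are
kernel errno codes. Below is a minimal Python sketch for decoding them,
assuming the standard Linux numbering (107 = ENOTCONN, 112 = EHOSTDOWN).
The helper name decode_status and the small lookup table are made up for
illustration and are not part of any OCFS2 tool:

```python
import re

# Linux errno values (asm-generic numbering); the kernel logs them negated.
# Only the two codes seen in the log above are listed (illustrative table).
LINUX_ERRNO = {
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    112: ("EHOSTDOWN", "Host is down"),
}

# Matches e.g. "dlm_purge_lockres:205 ERROR: status = -112"
STATUS_RE = re.compile(r"(\w+):\d+ ERROR: status = -(\d+)")

def decode_status(line):
    """Return (function, errno_name, description) for a dlm error line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    func, code = match.group(1), int(match.group(2))
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in table"))
    return func, name, text

sample = [
    "(2191,39):dlm_drop_lockres_ref:2210 ERROR: status = -112",
    "(2508,3):dlm_get_lock_resource:916 ERROR: status = -107",
]
for line in sample:
    print(decode_status(line))
```

Both codes are consistent with the peer node having gone away, which
matches the "link to 2 went down!" message in the same log.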

In fact, I get this type of error even /without/ Pacemaker started
on any node when I kill one node.

So the dlm/ocfs2 errors in syslog seem "normal", but my FS clones in
Pacemaker do not "take them as normal", since some become "Failed" with
"unknown error".

So my question is:
   is my configuration expected to work?
   (and if so, how could I work around this problem?)
or
   is the pcmk stack really mandatory when OCFS2 and Pacemaker are used
   together on two nodes?

Thanks for your responses.
Alain
