Re: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful

Eric Ren Tue, 17 May 2016 23:54:13 -0700

Hi David,

Ken Gaillot got me with this question:
Since corosync/pcmk can be healed from such a case, why not DLM?
Please see the detailed discussion here:
       [1] https://github.com/ClusterLabs/pacemaker/pull/839


Here are my thoughts, but I'm not sure; CMIIW please:

Time: T; cluster: A, B, C. Suppose we have a lockspace named after $uuid for a shared disk volume, and a CPG for lockspace $uuid. The $uuid CPG has members A, B and C when things are OK, but:

T: quorum lost; cluster partitions into 3 parts; lockspace $uuid cannot perform any lockspace operations because the cluster is not quorate;

T+1: quorum regained; the dlm_controld daemon CPG has not done its merging/fencing stuff; so here are 2 questions:

Q1: what is a stateful merged node?

I've seen the comments in the code ;-) Does it mean a lockspace has already been on the node before it sends the protocol message?

Q2: what if we add the stateful merged nodes to the dlm_controld daemon CPG instead of fencing them?

If so, CPG $uuid now, e.g. from the perspective of A, may have only one member: A itself. A can perform lockspace operations now because the cluster is quorate again (and if we skip fencing); B and C do likewise. Then to each node it looks as if that node alone owns this volume, so corruption may happen?
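
To make the concern in Q2 concrete, here is a toy model of it. This is only a sketch under my own assumptions: the class and function names (LockspaceView, request_ex, simulate) are made up for illustration and do not come from dlm_controld, corosync or the kernel DLM. It only shows that if each node keeps a private view of the $uuid CPG containing just itself, every node can be granted an exclusive lock on the same resource.

```python
# Toy model only: none of these names exist in dlm_controld or corosync.

class LockspaceView:
    """One membership view of the lockspace CPG, with a single EX lock."""

    def __init__(self, members):
        self.members = set(members)
        self.holder = None  # node that this view believes holds the EX lock

    def request_ex(self, requester):
        """Grant the exclusive lock if nobody in *this view* holds it."""
        if self.holder is None:
            self.holder = requester
            return True
        return False


def simulate(merged):
    """Return the list of nodes that were granted the EX lock."""
    nodes = ["A", "B", "C"]
    if merged:
        # Healthy case: all nodes share one view with members A, B and C.
        shared = LockspaceView(nodes)
        views = {n: shared for n in nodes}
    else:
        # Hypothetical case from Q2: each node sees only itself in the CPG.
        views = {n: LockspaceView([n]) for n in nodes}
    return [n for n in nodes if views[n].request_ex(n)]


print(simulate(merged=True))   # ['A']
print(simulate(merged=False))  # ['A', 'B', 'C']
```

In the second case all three nodes believe they hold the exclusive lock at the same time, which is the situation where the shared volume could get corrupted.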


Thanks a lot,
Eric

On 05/17/2016 08:10 PM, Eric Ren wrote:

Hi David,
This is just a draft patch for you to review ;-) There's one issue I'm
not sure about: where should we clear "stateful_merge_wait"?

I also need more communication with the pacemaker guys and more time for testing.
I will send you the formal patch once things are done ;-)
