[Cluster-devel] [PATCH] dlm: make dlm_posix_lock comply with posix file lock semanteme

2015-10-14 Thread Eric Ren
t_killable with wait_event_interruptible can fix this issue. Signed-off-by: Eric Ren <z...@suse.com> Acked-by: David Teigland <teigl...@redhat.com> --- fs/dlm/plock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 5532f09..88f1036 10064

[Cluster-devel] dlm: make dlm_posix_lock comply with posix file lock semanteme

2015-10-14 Thread Eric Ren
Hi David and all, After applying this patch, the deadlock test for POSIX file locks, in both local and cluster mode on ocfs2, now behaves the same as on other local filesystems such as ext4. However, I have no gfs2 environment to verify whether this issue occurs there and whether this patch fixes it. Hope anyone working on

Re: [Cluster-devel] problem about dlm posix file lock (sorry for missing subject)

2015-10-13 Thread Eric Ren
Hi David, Please see my comments inline ;-) David Teigland wrote: > >> On Tue, Oct 13, 2015 at 04:30:53AM -0600, Zhen Ren wrote: >> It expects the alarm timeout to send SIGALRM and wake up the sleeping process, >> as "man fcntl" says: "If a signal is caught while waiting,
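
The behaviour quoted from "man fcntl" can be reproduced on a local filesystem with a small C sketch. Everything in it is an assumption made for illustration: the helper name blocked_lock_errno(), the temporary file under /tmp, and the one-second alarm do not come from the thread. A child process holds a write lock, and the parent blocks in F_SETLKW until SIGALRM arrives. The handler is installed without SA_RESTART so that the call is not restarted.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: its only job is to make the blocked fcntl() return. */
static void on_alarm(int sig) { (void)sig; }

/*
 * Hypothetical demo helper (not from the thread). Returns the errno left by
 * a blocked F_SETLKW, 0 if the lock was unexpectedly granted, or -1 if the
 * setup failed.
 */
int blocked_lock_errno(void)
{
    char path[] = "/tmp/plock-demo-XXXXXX";
    struct flock fl;
    struct sigaction sa;
    int ready[2], fd, saved = 0;
    char byte;
    pid_t child;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;          /* l_start = 0, l_len = 0: whole file */

    fd = mkstemp(path);
    if (fd < 0 || pipe(ready) < 0)
        return -1;

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: take the lock, report readiness, then hold the lock. */
        if (fcntl(fd, F_SETLK, &fl) < 0 || write(ready[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    close(ready[1]);
    if (read(ready[0], &byte, 1) == 1) {
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_alarm;    /* sa_flags stays 0: no SA_RESTART */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGALRM, &sa, NULL);

        alarm(1);                    /* SIGALRM after one second */
        if (fcntl(fd, F_SETLKW, &fl) < 0)
            saved = errno;           /* interrupted wait */
        alarm(0);
    } else {
        saved = -1;                  /* child never acquired the lock */
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    close(ready[0]);
    close(fd);
    unlink(path);
    return saved;
}
```

Calling blocked_lock_errno() from a main() and printing strerror() of the result is enough to compare filesystems. To test a cluster filesystem, the temporary file path would have to point into its mount instead of /tmp.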

Re: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful

2016-05-18 Thread Eric Ren
like every node owns this volume; so corruption may happen? Thanks a lot, Eric On 05/17/2016 08:10 PM, Eric Ren wrote: Hi David, This is just a draft patch for you to review;-) There's an issue I'm not sure about: where should we clear "stateful_merge_wait"? And I need more comm

Re: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful

2016-05-20 Thread Eric Ren
Hi David, On 05/19/2016 02:50 AM, David Teigland wrote: On Wed, May 18, 2016 at 02:53:00PM +0800, Eric Ren wrote: Q1: what's stateful merged node? Q2: what if we add the stateful merged nodes to dlm_controld daemon cpg instead of fencing them? The details here are fundamental to the way

[Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful merging

2016-05-20 Thread Eric Ren
" becomes true. David advised me to do the right thing;-) Thanks a lot! Signed-off-by: Eric Ren <z...@suse.com> --- dlm_controld/daemon_cpg.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/dlm_controld/daemon_cpg.c b/dlm_controld/daemon_cpg.c index 356

[Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful merging

2016-05-17 Thread Eric Ren
" becomes true. David advised me to do the right thing;-) Thanks a lot! Signed-off-by: Eric Ren <z...@suse.com> --- dlm_controld/daemon_cpg.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/dlm_controld/daemon_cpg.c b/dlm_controld/daemon_cpg.c index 356

[Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful

2016-05-17 Thread Eric Ren
Hi David, This is just a draft patch for you to review;-) There's an issue I'm not sure about: where should we clear "stateful_merge_wait"? I also need more communication with the pacemaker guys and more time for testing. I will send you the formal patch once things are done;-) Thanks, Eric

Re: [Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick

2016-05-17 Thread Eric Ren
Hello David: On 05/17/2016 01:12 AM, David Teigland wrote: This looks good. Would you still use this patch if we add the new dlm_tool output from the other email? Please hold this back for now;-) I prefer to drop this method if the latter one works better. And I'm trying to work this out

[Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick

2016-05-16 Thread Eric Ren
merged partitions are kicking the other out of the cluster at the same time. Signed-off-by: Eric Ren <z...@suse.com> --- dlm_controld/daemon_cpg.c | 6 +- dlm_controld/dlm.conf.5 | 2 ++ dlm_controld/dlm_controld.8 | 5 + dlm_controld/dlm_daemon.h | 1 + dlm_controld/main.c

Re: [Cluster-devel] [DLM PATCH] dlm_controld: handle the case of network transient disconnection

2016-05-16 Thread Eric Ren
Hi David, On 05/13/2016 11:49 PM, David Teigland wrote: If both sides of the merged partition are kicking the other out of the cluster at the same time, it's hard to predict which nodes will remain (and it could be none). To resolve an even partition merge, you need to remove/restart the nodes

Re: [Cluster-devel] [DLM PATCH] dlm_controld: handle the case of network transient disconnection

2016-05-12 Thread Eric Ren
Hi David, Thanks very very much for explaining this to me in such a nice way;-) On 05/13/2016 12:51 AM, David Teigland wrote: T = time in seconds, A,B,C = cluster nodes. At T=1 A,B,C become members and have quorum. At T=10 a partition creates A,B | C. At T=11 it merges and creates A,B,C. At
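
The quorum state at each step of this timeline can be checked with a small C sketch. It assumes a plain majority rule with one vote per node, which the excerpt does not state; has_quorum() is a hypothetical helper, not a corosync or dlm_controld function.

```c
#include <stdbool.h>

/*
 * Assumed majority rule, one vote per node: a partition has quorum when it
 * holds more than half of all votes. Hypothetical helper for illustration.
 */
bool has_quorum(int members, int total_votes)
{
    return 2 * members > total_votes;
}
```

Under that assumption, the A,B side of the T=10 partition holds 2 of 3 votes and keeps quorum, while C holds 1 of 3 and loses it.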

[Cluster-devel] [DLM PATCH] dlm_controld: handle the case of network transient disconnection

2016-05-12 Thread Eric Ren
me". We now skip this chance of telling corosync to kill cluster for stateful merge. As a result, any fencing cannot proceed further. Signed-off-by: Eric Ren <z...@suse.com> --- dlm_controld/daemon_cpg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dlm_controld/daemon_

Re: [Cluster-devel] [PATCH -next] dlm: fix error return code in sctp_accept_from_sock()

2016-10-26 Thread Eric Ren
Hi, On 10/25/2016 01:03 PM, weiyongjun (A) wrote: Hi Eric, -Original Message- From: Eric Ren [mailto:z...@suse.com] Sent: Tuesday, October 25, 2016 10:52 AM To: Wei Yongjun <weiyj...@gmail.com>; Christine Caulfield <ccaul...@redhat.com>; David Teigland <teigl...@redha

Re: [Cluster-devel] [PATCH -next] dlm: fix error return code in sctp_accept_from_sock()

2016-10-24 Thread Eric Ren
Hi, Coding style patch is always rejected;-) Eric On 10/22/2016 10:37 PM, Wei Yongjun wrote: From: Wei Yongjun Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun
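
The class of bug described in the quoted commit message can be shown with a generic C sketch. This is not the code of sctp_accept_from_sock(): the function names, the peer_known parameter, and the -EINVAL error code are all assumptions made for illustration.

```c
#include <errno.h>

/* Hypothetical accept-like function: the failure path forgets the error. */
int accept_buggy(int peer_known)
{
    int result = 0;              /* earlier steps succeeded */

    if (!peer_known)
        goto out;                /* bug: result is still 0 here */
    /* ... normal processing would go here ... */
out:
    return result;
}

/* Same flow, with a negative error code set before the jump. */
int accept_fixed(int peer_known)
{
    int result = 0;

    if (!peer_known) {
        result = -EINVAL;        /* illustrative error code only */
        goto out;
    }
    /* ... normal processing would go here ... */
out:
    return result;
}
```

A caller that only checks for a negative return value treats the buggy version's failure as success, which is why such a change affects behaviour and is more than a style fix.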

Re: [Cluster-devel] Question on LVB when the node that held EX lock crash

2016-11-17 Thread Eric Ren
Hi! On 11/16/2016 11:08 PM, David Teigland wrote: convert(R1, EX) get LVB Question: what is the LVB then? x or y? == Is this a valid question? or am I missing something? It's a good question, and it's been enough years that the details are now hazy. I think the current behavior

Re: [Cluster-devel] Question on LVB when the node that held EX lock crash

2016-11-16 Thread Eric Ren
On 11/16/2016 04:29 PM, Eric Ren wrote: Hi David and all, I am debugging an issue of ocfs2 that relates to LVB value. I will try to make it a pure DLM question: Two nodes (N1, N2) try to truncate the same file(R1) concurrently. N1 N2 lock(R1, EX

[Cluster-devel] Question on LVB when the node that held EX lock crash

2016-11-16 Thread Eric Ren
Hi David and all, I am debugging an issue of ocfs2 that relates to LVB value. I will try to make it a pure DLM question: Two nodes (N1, N2) try to truncate the same file(R1) concurrently. N1 N2 lock(R1, EX) changing LVB: x

Re: [Cluster-devel] Question on LVB when the node that held EX lock crash

2016-11-30 Thread Eric Ren
Hi David, On 11/16/2016 11:08 PM, David Teigland wrote: convert(R1, EX) get LVB Question: what is the LVB then? x or y? == Is this a valid question? or am I missing something? It's a good question, and it's been enough years that the details are now hazy. I think the current behavior

Re: [Cluster-devel] Question on LVB when the node that held EX lock crash

2016-12-01 Thread Eric Ren
Hi David, On 12/01/2016 12:16 AM, David Teigland wrote: On Wed, Nov 30, 2016 at 05:07:22PM +0800, Eric Ren wrote: a. Should we put recover_lvb() even before recover_conversion()? if not, why? Yes, I think you're right. The lvb decision should be made using the original lock modes