Re: [Cluster-devel] FS/DLM module triggered kernel BUG

2021-08-24 Thread Gang He
On 2021/8/23 21:49, Alexander Aring wrote: Hi Gang He, On Mon, Aug 23, 2021 at 1:43 AM Gang He wrote: Hello Guys, I am using kernel 5.13.8 and sometimes see the dlm module trigger a kernel BUG. What exactly do you do? I would like to test it on a recent upstream version, or you can

[Cluster-devel] FS/DLM module triggered kernel BUG

2021-08-23 Thread Gang He
Hello Guys, I am using kernel 5.13.8 and sometimes see the dlm module trigger a kernel BUG. Since the dlm kernel module is not built from the latest source code, I am not sure whether this problem has already been fixed. The backtrace is as below: [Fri Aug 20 16:24:14 2021] dlm:

Re: [Cluster-devel] Why does dlm_lock function fails when downconvert a dlm lock?

2021-08-13 Thread Gang He
Hi David, On 2021/8/13 1:45, David Teigland wrote: On Thu, Aug 12, 2021 at 01:44:53PM +0800, Gang He wrote: In fact, I can reproduce this problem reliably. I want to know whether this error is expected, since no extreme stress test is involved. Second, how should we handle

Re: [Cluster-devel] Why does dlm_lock function fails when downconvert a dlm lock?

2021-08-12 Thread Gang He
Hi Alexander, On 2021/8/12 4:35, Alexander Aring wrote: Hi, On Wed, Aug 11, 2021 at 6:41 AM Gang He wrote: Hello List, I am using kernel 5.13.4 (some older kernel versions have the same problem). When node A acquired a dlm (EX) lock, node B tried to get the dlm lock, node A got a BAST

[Cluster-devel] Why does dlm_lock function fails when downconvert a dlm lock?

2021-08-11 Thread Gang He
Hello List, I am using kernel 5.13.4 (some older kernel versions have the same problem). Node A acquired a dlm (EX) lock; when node B tried to get the same lock, node A received a BAST message and downconverted the lock to NL, but the dlm_lock function failed with error -16. The function

Re: [Cluster-devel] Interest in DAX for OCFS2 and/or GFS2?

2019-10-11 Thread Gang He
Hello hayes, > -Original Message- > From: cluster-devel-boun...@redhat.com > [mailto:cluster-devel-boun...@redhat.com] On Behalf Of Hayes, Bill > Sent: October 11, 2019 0:42 > To: ocfs2-de...@oss.oracle.com; cluster-devel@redhat.com > Cc: Rocky (The good-looking one) Craig > Subject:

[Cluster-devel] [PATCH] dlm: remove O_NONBLOCK flag in sctp_connect_to_sock

2018-05-28 Thread Gang He
mount+0x59a/0xc40 [<0>] SyS_mount+0x80/0xd0 [<0>] do_syscall_64+0x76/0x140 [<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [<0>] 0x So, I think we should remove the O_NONBLOCK flag here, since the DLM kernel module cannot properly handle a non-blocking socket in connect(). Signed-off-by: Gang

[Cluster-devel] [PATCH] dlm: fix a clerical error when set SCTP_NODELAY

2018-05-01 Thread Gang He
There is a clerical error in turning off Nagle's algorithm in the sctp_connect_to_sock() function, which causes the attempt to turn off Nagle's algorithm to fail. After this correction, DLM performance improves noticeably when using the SCTP protocol. Signed-off-by: Gang He <g...@suse.com> Sign

[Cluster-devel] [PATCH] dlm: make sctp_connect_to_sock() return in specified time

2018-04-26 Thread Gang He
, the time is spent in the sock->ops->connect() function before it returns the ETIMEDOUT (-110) error. Since the O_NONBLOCK argument to connect() does not work here, we should make sock->ops->connect() return within a specified time by setting the socket's SO_SNDTIMEO attribute. Signed-off-by
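The idea in this snippet can be illustrated with a minimal user-space sketch. It is not the code from the patch, which works on a kernel socket inside the DLM module; the helper names set_connect_timeout() and get_connect_timeout() are hypothetical. It only shows the standard SO_SNDTIMEO socket option, which on Linux also bounds how long a blocking connect() may wait.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (not from the patch): set SO_SNDTIMEO so that a
 * blocking connect() on this socket gives up after `seconds` seconds.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int set_connect_timeout(int fd, long seconds)
{
    struct timeval tv;

    assert(seconds >= 0);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper: read the send timeout back in whole seconds.
 * Returns -1 if getsockopt fails. */
long get_connect_timeout(int fd)
{
    struct timeval tv;
    socklen_t len = sizeof(tv);

    if (getsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, &len) < 0)
        return -1;
    return (long)tv.tv_sec;
}
```

A caller would create the socket, call set_connect_timeout(fd, 5), and then call connect(); a timeout of zero means no timeout, which is the default.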

Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-09 Thread Gang He
Hi Steven and David, >>> > Hi, > > > On 09/04/18 06:02, Gang He wrote: >> Hello David, >> >> If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf > under two-rings cluster environment, >> DLM kernel module wi

Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-08 Thread Gang He
if (sock) {
        add_sock(sock, con);
        result = 0;
}
else {
        result = -EADDRINUSE;
}
Thanks Gang >>> > On Mon, Apr 02, 2018 at 08:01:24PM -0600, Gang He wrote: >> OK, I got you

Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-02 Thread Gang He
Hi David, >>> > On Thu, Mar 22, 2018 at 10:27:56PM -0600, Gang He wrote: >> Hello David, >> >> Do you agree to add this prompt to the user? >> Since sometimes customers attempted to setup SCTP protocol with two rings, >> but they could not get the

Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-03-22 Thread Gang He
ut 5mins hang in case > one connection(ring) is broken. > Then, I suggest to add a kernel print, which prompts the user SCTP > protocol for DLM should be considered experimental, it is not > recommended in production environment. > > Signed-off-by: Gang He <g...@suse.com>

Re: [Cluster-devel] [ClusterLabs] DLM connection channel switch take too long time (> 5mins)

2018-03-08 Thread Gang He
/sle_ha/book_sleha/data/sec_ha_installation_terms.html > > That fixes I saw in 4.14.* > >> On 8 Mar 2018, at 09:12, Gang He <g...@suse.com> wrote: >> >> Hi Feldhost, >> >> >>>>> >>> Hello Gang He, >>> >>

Re: [Cluster-devel] [ClusterLabs] DLM connection channel switch take too long time (> 5mins)

2018-03-08 Thread Gang He
Hi Feldhost, >>> > Hello Gang He, > > which type of corosync rrp_mode you use? Passive or Active? clvm1:/etc/corosync # cat corosync.conf | grep rrp_mode rrp_mode: passive Did you try test both? No, only this mode. Also, what kernel version you use? I s

[Cluster-devel] [PATCH v2] dlm: Make dismatch error message more clear

2017-05-17 Thread Gang He
this line error message is not obvious enough. Signed-off-by: Gang He <g...@suse.com> --- fs/dlm/lockspace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 91592b7..b03d808 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/locks

[Cluster-devel] [PATCH] dlm: Make dismatch error message more clear

2017-05-16 Thread Gang He
this line error message is not obvious enough. Signed-off-by: Gang He <g...@suse.com> --- fs/dlm/lockspace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 91592b7..b03d808 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/locks

Re: [Cluster-devel] GFS2 file system does not invalidate page cache after direct IO write

2017-05-04 Thread Gang He
Hello Andreas, >>> > Gang, > > On Thu, May 4, 2017 at 5:33 AM, Gang He <g...@suse.com> wrote: >> Hello Guys, >> >> I found an interesting thing on the GFS2 file system: after I did a direct IO > write for a whole file, I still saw there were some page c

[Cluster-devel] GFS2 file system does not invalidate page cache after direct IO write

2017-05-03 Thread Gang He
Hello Guys, I found an interesting thing on the GFS2 file system: after I did a direct IO write for a whole file, I still saw some page cache pages in this inode. It looks like this GFS2 behavior does not follow file system POSIX semantics. I just want to know whether this problem is a known issue

[Cluster-devel] inconsistent dlm_new_lockspace LVB_LEN size from ocfs2 user-space tool and ocfs2 kernel module

2016-05-13 Thread Gang He
Hello Guys, There is an inconsistent LVB_LEN size problem when creating a new lockspace from a user-space tool (e.g. fsck.ocfs2) and from the kernel module (e.g. ocfs2/stack_user.c). From the user-space tool, the LVB size is DLM_USER_LVB_LEN (32 bytes, defined in /include/linux/dlm_device.h). From the kernel