[Ocfs2-devel] [RFC] Should we revert commit "ocfs2: do not lock/unlock() inode DLM lock"

2017-02-09 Thread Changwei Ge
Hi, We encountered a LIVELOCK problem while performing a test case of high-concurrency read between different nodes. It's very easy to reproduce this issue. Just perform high-concurrency read operation against one single large file (say named test_file) from one physical node, meanwhile, perform

[Ocfs2-devel] [PATCH v3] ocfs2/journal: fix umount hang after flushing journal failure

2017-01-12 Thread Changwei Ge
: Changwei Ge <ge.chang...@h3c.com> Date: Wed, 11 Jan 2017 09:05:35 +0800 Subject: [PATCH] fix umount hang after journal flushing failure Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/journal.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/fs/ocfs2/jo

Re: [Ocfs2-devel] [Bug Report] crash in the path of direct IO

2017-04-10 Thread Changwei Ge
Hi Gang, More log before crash is pasted. Also, I gathered some structure's content, which may be useful to analyze this issue. Paste them here too. struct dio_submit { bio = 0x88035690b800, blkbits = 0x9, blkfactor = 0x3, start_zero_done = 0x1, pages_in_io = 0x34,

[Ocfs2-devel] [Bug Report] crash in the path of direct IO

2017-04-10 Thread Changwei Ge
Hi, We encountered a crash issue days ago. The call trace follows as below: From the call trace, we can see that a direct read request caused this crash issue, which triggered a BUG_ON check point. With the help of debugfs.ocfs2 tool, I can see that clusters owned by the target file are

Re: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-08-10 Thread Changwei Ge
Hi Joseph, On 2017/8/10 17:53, Joseph Qi wrote: > Hi Changwei, > > On 17/8/9 23:24, ge changwei wrote: >> Hi >> >> >> On 2017/8/9 7:32 PM, Joseph Qi wrote: >>> Hi, >>> >>> On 17/8/7 15:13, Changwei Ge wrote: >>>> Hi, >

Re: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-08-08 Thread Changwei Ge
On 2017/8/8 4:20, Mark Fasheh wrote: > On Mon, Aug 7, 2017 at 2:13 AM, Changwei Ge <ge.chang...@h3c.com> wrote: >> Hi, >> >> In current code, while flushing AST, we don't handle an exception that >> sending AST or BAST is failed. >> But it is indeed possib

[Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-08-07 Thread Changwei Ge
. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/dlm/dlmrecovery.c | 51 ++-- fs/ocfs2/dlm/dlmthread.c | 39 +++-- 2 files changed, 81 insertions(+), 9 deletions(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.

Re: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-08-07 Thread Changwei Ge
> And the re-queuing AST or BAST will be dropped if the requesting node is >> dead! >> >> It will improve the reliability a lot. >> >> >> Thanks. >> >> Changwei. >> >> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> >> --- >> fs

Re: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-08-23 Thread Changwei Ge
On 2017/8/23 12:48, Gang He wrote: > > >> On 17/8/23 10:23, Junxiao Bi wrote: >>> On 08/10/2017 06:49 PM, Changwei Ge wrote: >>>> Hi Joseph, >>>> >>>> >>>> On 2017/8/10 17:53, Joseph Qi wrote: >>

Re: [Ocfs2-devel] Mixed mounts w/ different physical block sizes (long post)

2017-09-19 Thread Changwei Ge
On 2017/9/19 14:47, Michael Ulbrich wrote: > Hi Changwei, > > thanks for looking into this! > > On 19/09/17 05:32, Changwei Ge wrote: > >> Could you please also provide information about *slot_map*, just type >> "slotmap" in debugfs.ocfs2 tool. Th

Re: [Ocfs2-devel] Mixed mounts w/ different physical block sizes (long post)

2017-09-18 Thread Changwei Ge
Hi Michael, On 2017/9/18 23:45, Michael Ulbrich wrote: > Hi again, > > chatting with a helpful person on #ocfs2 IRC channel this morning I got > encouraged to cross-post to ocfs2-devel. For historic background and > further details pls. see my two previous posts to ocfs2-users from last > week

[Ocfs2-devel] [PATCH] ocfs2: fix cluster hang after a node dies

2017-10-17 Thread Changwei Ge
...@gmail.com> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/dlm/dlmrecovery.c |1 + 1 file changed, 1 insertion(+) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 74407c6..ec8f758 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dl

Re: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

2017-09-13 Thread Changwei Ge
/23 10:23, Junxiao Bi wrote: >>>> On 08/10/2017 06:49 PM, Changwei Ge wrote: >>>>> Hi Joseph, >>>>> >>>>> >>>>> On 2017/8/10 17:53, Joseph Qi wrote: >>>>>> Hi Changwei, >>>>>> >>>>>

Re: [Ocfs2-devel] [PATCH] ocfs2: remove unused function ocfs2_publish_get_mount_state.

2017-09-28 Thread Changwei Ge
Acked-by: Changwei Ge <ge.chang...@h3c.com> On 2017/9/28 19:16, Guozhonghua wrote: > > Remove unused function ocfs2_publish_get_mount_state. > > Signed-off-by: guozhonghua <guozhong...@h3c.com> > --- > fs/ocfs2/super.h |3 --- > 1 file changed, 3 deleti

[Ocfs2-devel] [PATCH] ocfs2/cluster: trim a trailing white space

2017-09-01 Thread Changwei Ge
Hi, We'd better not add a white space at the tail of a line, so trim it! Thanks, Changwei Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/cluster/tcp.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) mode change 100644 => 100755 fs/ocfs2/cluster/tcp.c d

Re: [Ocfs2-devel] ocfs2-test supports Python 3

2017-09-27 Thread Changwei Ge
On 2017/9/27 14:40, Gang He wrote: > ** PRIVATE ** > > Hello guys, > > As you know, some Linux distributions(e.g. SUSE Enterprise Linux 15) will > introduce Python3 as the default, > our Python scripts in ocfs2-test still use Python2, we will have to do proper > modifications to migration to

Re: [Ocfs2-devel] mmotm 2016-08-02-15-53 uploaded

2017-10-10 Thread Changwei Ge
Hi Andrew and Vitaly, I do agree that patch ee8f7fcbe638 ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down", 2016-08-02) introduced an issue. It prevents DLM recovery from picking a new master for an existing lock resource whose owner died seconds ago. But this patch

Re: [Ocfs2-devel] [PATCH] ocfs2: fix cluster hang after a node dies

2017-10-18 Thread Changwei Ge
n provide some better or more detailed clues. Thanks, Changwei. > > On 2017/10/17 14:48, Changwei Ge wrote: >> When a node dies, other live nodes have to choose a new master >> for an existed lock resource mastered by the dead node. >> >> As for ocfs2/dlm i

Re: [Ocfs2-devel] [patch] ocfs2: fix qs_holds may could not be zero

2017-10-17 Thread Changwei Ge
On 2017/10/18 7:21, Andrew Morton wrote: > On Thu, 21 Sep 2017 02:09:33 + Zhangyang wrote: > >> In our test, We found that, when the network down, qs->qs_holds could not be >> reduce to zero, it will lead to the node can't do fence. >> >> >> >> o2net_idle_timer ->

[Ocfs2-devel] [PATCH] ocfs2/dlm: clean dead code up

2017-11-23 Thread Changwei Ge
This code snippet is no longer used. So trim it, thus to make code neat. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/dlm/dlmmaster.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c index 3e04279446e8..93b61c

Re: [Ocfs2-devel] [PATCH] Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

2017-11-23 Thread Changwei Ge
Hi Alex, I just reviewed your patch and a few questions came up. On 2017/11/24 13:49, alex chen wrote: > Hi John, > > I think a better method to solve this problem. > > On 2017/11/22 5:05, John Lightsey wrote: >> On Tue, 2017-11-21 at 05:58 +, Changwei Ge w

Re: [Ocfs2-devel] [PATCH v2 2/3] ocfs2: add ocfs2_overwrite_io function

2017-11-29 Thread Changwei Ge
It looks fine to me. On 2017/11/29 16:39, Gang He wrote: > Add ocfs2_overwrite_io function, which is used to judge if > overwrite allocated blocks, otherwise, the write will bring extra > block allocation overhead. > > Signed-off-by: Gang He <g...@suse.com> Reviewed-by:

Re: [Ocfs2-devel] [PATCH v2 1/3] ocfs2: add ocfs2_try_rw_lock and ocfs2_try_inode_lock

2017-11-29 Thread Changwei Ge
On 2017/11/29 16:38, Gang He wrote: > Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which > will be used in non-block IO scenarios. > > Signed-off-by: Gang He > --- > fs/ocfs2/dlmglue.c | 21 + > fs/ocfs2/dlmglue.h | 4 > 2 files changed,

Re: [Ocfs2-devel] [PATCH v2 1/3] ocfs2: add ocfs2_try_rw_lock and ocfs2_try_inode_lock

2017-11-29 Thread Changwei Ge
Hi Gang, On 2017/11/30 10:45, Gang He wrote: > Hello Changwei, > > >> On 2017/11/29 16:38, Gang He wrote: >>> Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which >>> will be used in non-block IO scenarios. >>> >>> Signed-off-by: Gang He >>> --- >>>

Re: [Ocfs2-devel] [patch 08/11] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-11-30 Thread Changwei Ge
I NACK to this patch since I don't think it can solve the issue Jun reported completely. Thanks, Changwei On 2017/12/1 6:26, a...@linux-foundation.org wrote: > From: piaojun > Subject: ocfs2/dlm: wait for dlm recovery done when migrating all lockres > > wait for dlm

[Ocfs2-devel] [PATCH resend] renew changelog and title Re: [patch 07/11] ocfs2: fix qs_holds may could not be zero

2017-11-30 Thread Changwei Ge
.ya...@h3c.com> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/cluster/quorum.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/cluster/quorum.c b/fs/ocfs2/cluster/quorum.c index 62e8ec619b4c..af2e7473956e 100644 --- a/fs/ocfs2/cluster/quoru

Re: [Ocfs2-devel] [patch 01/11] ocfs2: remove ocfs2_is_o2cb_active()

2017-11-30 Thread Changwei Ge
Acked-by: Changwei Ge <ge.chang...@h3c.com> On 2017/12/1 6:25, a...@linux-foundation.org wrote: > From: Gang He <g...@suse.com> > Subject: ocfs2: remove ocfs2_is_o2cb_active() > > Remove ocfs2_is_o2cb_active(). We have similar functions to identify > which cluster

Re: [Ocfs2-devel] [PATCH 2/3] ocfs2: add ocfs2_overwrite_io function

2017-11-27 Thread Changwei Ge
On 2017/11/28 13:44, Gang He wrote: > Hi Changwei, > > >> Hi, >> Gang >> >> On 2017/11/27 17:48, Gang He wrote: >>> Add ocfs2_overwrite_io function, which is used to judge if >>> overwrite allocated blocks, otherwise, the write will bring extra >>> block allocation overhead. >>> >> >> Can

Re: [Ocfs2-devel] [PATCH 1/3] ocfs2: add ocfs2_try_rw_lock and ocfs2_try_inode_lock

2017-11-27 Thread Changwei Ge
Hi Gang, On 2017/11/27 17:48, Gang He wrote: > Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which > will be used in non-block IO scenarios. > > Signed-off-by: Gang He > --- > fs/ocfs2/dlmglue.c | 22 ++ > fs/ocfs2/dlmglue.h | 4 > 2

Re: [Ocfs2-devel] [PATCH 2/3] ocfs2: add ocfs2_overwrite_io function

2017-11-27 Thread Changwei Ge
On 2017/11/28 9:52, piaojun wrote: > Hi Gang, > > If ocfs2_overwrite_io is only called in 'nowait' scenarios, I wonder if > we can discard 'int wait' just as ext4 does: > > static bool ext4_overwrite_io(struct inode *inode, loff_t pos, loff_t len); Yes, Jun has a point. It seems that

Re: [Ocfs2-devel] [PATCH 2/3] ocfs2: add ocfs2_overwrite_io function

2017-11-27 Thread Changwei Ge
Hi, Gang On 2017/11/27 17:48, Gang He wrote: > Add ocfs2_overwrite_io function, which is used to judge if > overwrite allocated blocks, otherwise, the write will bring extra > block allocation overhead. > Can you elaborate how this overhead is introduced? Forgive me, I don't figure it. Thanks,

[Ocfs2-devel] [PATCH] ocfs2/cluster: neaten a member of o2net_msg_handler

2017-12-04 Thread Changwei Ge
It's odd that o2net_msg_handler::nh_func_data is declared as type o2net_msg_handler_func*. So neaten it. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/cluster/tcp_internal.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ocfs2/cluster/tcp_interna

[Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/aops.c | 44 ++-- 1 file changed, 42 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index d151632..a982cf6 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/

Re: [Ocfs2-devel] [PATCH] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()

2017-12-18 Thread Changwei Ge
On 2017/12/18 19:52, Joseph Qi wrote: > > > On 17/12/18 18:22, alex chen wrote: >> In dlm_reset_mleres_owner(), we will lock >> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock, >> which breaks the spinlock lock ordering: >> dlm_domain_lock >> struct dlm_ctxt->spinlock >>

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
So introduce file hole check function back into ocfs2. >> Once ocfs2 is doing dio upon a file hole with append-dio disabled, it will >> fall >> back to buffer IO to allocate clusters. >> >> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> >> --- >>fs

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
On 2017/12/19 11:41, piaojun wrote: > Hi Changwei, > > On 2017/12/19 11:05, Changwei Ge wrote: >> Hi Jun, >> >> On 2017/12/19 9:48, piaojun wrote: >>> Hi Changwei, >>> >>> On 2017/12/18 20:06, Changwei Ge wrote: >>>> Befor

Re: [Ocfs2-devel] [PATCH v2] ocfs2: check the metadate alloc before marking extent written

2017-12-13 Thread Changwei Ge
On 2017/12/13 15:25, alex chen wrote: > Hi Changwei, > > On 2017/12/12 9:05, Changwei Ge wrote: >> Hi Alex, >> >> On 2017/12/5 11:31, alex chen wrote: >>> We need to check the free number of the records in each loop to mark >>> extent written, because

Re: [Ocfs2-devel] [PATCH v2] ocfs2: check the metadate alloc before marking extent written

2017-12-13 Thread Changwei Ge
On 2017/12/14 11:04, alex chen wrote: > Hi Changwei, > > On 2017/12/14 8:56, Changwei Ge wrote: >> On 2017/12/13 15:25, alex chen wrote: >>> Hi Changwei, >>> >>> On 2017/12/12 9:05, Changwei Ge wrote: >>>> Hi Alex, >>>> >>

[Ocfs2-devel] [PATCH] ocfs2: clean dead code in suballoc.c

2017-12-12 Thread Changwei Ge
Stack variable fe is no longer used, so trim it to save some cpu cycles and stack space. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/suballoc.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 71f22c8fbffd..a74108

[Ocfs2-devel] [PATCH] ocfs2/cluster: clean up unused function declaration in heartbeat.h

2017-12-12 Thread Changwei Ge
Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/cluster/heartbeat.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/ocfs2/cluster/heartbeat.h b/fs/ocfs2/cluster/heartbeat.h index 3ef5137dc362..a9e67efc0004 100644 --- a/fs/ocfs2/cluster/heartbeat.h +++ b/fs/ocfs2/c

[Ocfs2-devel] [PATCH v2] ocfs2: clean dead code in suballoc.c

2017-12-12 Thread Changwei Ge
Stack variable fe is no longer used, so trim it to save some cpu cycles and stack space. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/suballoc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
On 2017/12/19 9:24, Joseph Qi wrote: > > On 17/12/19 05:53, Andrew Morton wrote: >> On Mon, 18 Dec 2017 12:06:21 +0000 Changwei Ge <ge.chang...@h3c.com> wrote: >> >>> Before ocfs2 supporting allocating clusters while doing append-dio, all >>>

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
On 2017/12/19 9:39, Joseph Qi wrote: > > > On 17/12/19 09:27, Andrew Morton wrote: >> On Tue, 19 Dec 2017 09:24:17 +0800 Joseph Qi wrote: >> --- a/fs/ocfs2/aops.c~ocfs2-fall-back-to-buffer-io-when-append-dio-is-disabled-with-file-hole-existing-fix +++

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
On 2017/12/19 5:54, Andrew Morton wrote: > On Mon, 18 Dec 2017 12:06:21 +0000 Changwei Ge <ge.chang...@h3c.com> wrote: > >> Before ocfs2 supporting allocating clusters while doing append-dio, all >> append >> dio will fall back to buffer io to allocate clusters f

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
Hi Jun, On 2017/12/19 9:48, piaojun wrote: > Hi Changwei, > > On 2017/12/18 20:06, Changwei Ge wrote: >> Before ocfs2 supporting allocating clusters while doing append-dio, all >> append >> dio will fall back to buffer io to allocate clusters firstly. Also, when

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-19 Thread Changwei Ge
Hi Junxiao, On 2017/12/19 16:15, Junxiao Bi wrote: > Hi Changwei, > > On 12/19/2017 02:02 PM, Changwei Ge wrote: >> On 2017/12/19 11:41, piaojun wrote: >>> Hi Changwei, >>> >>> On 2017/12/19 11:05, Changwei Ge wrote: >>>> Hi Jun, >>

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-18 Thread Changwei Ge
; not >>>> right, since whether append-io is enabled tells the capability whether >>>> ocfs2 >>>> can >>>> allocate space while doing dio. >>>> So introduce file hole check function back into ocfs2. >>>> Once ocfs2 is doi

Re: [Ocfs2-devel] [PATCH] ocfs2: fall back to buffer IO when append dio is disabled with file hole existing

2017-12-19 Thread Changwei Ge
On 2017/12/19 17:30, Junxiao Bi wrote: > On 12/19/2017 05:11 PM, Changwei Ge wrote: >> Hi Junxiao, >> >> On 2017/12/19 16:15, Junxiao Bi wrote: >>> Hi Changwei, >>> >>> On 12/19/2017 02:02 PM, Changwei Ge wrote: >>>> On 2017/12/19 11:41,

Re: [Ocfs2-devel] [PATCH] ocfs2/cluster: clean up unused function declaration in heartbeat.h

2017-12-12 Thread Changwei Ge
Yes, Jun has cleaned up those declarations. I will rebase my tree. Thanks, Changwei On 2017/12/13 12:06, Joseph Qi wrote: > These have already been cleaned up in commit > 98d6c09ec2899a9a601b16ec7ae31d54e6b100b9. > > On 17/12/13 11:19, Changwei Ge wrote: >> Signed-off-by: Cha

[Ocfs2-devel] [PATCH v3] ocfs2: clean dead code in suballoc.c

2017-12-12 Thread Changwei Ge
Stack variable fe is no longer used, so trim it to save some CPU cycles and stack space. Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/suballoc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c

Re: [Ocfs2-devel] [PATCH v2] ocfs2: clean dead code in suballoc.c

2017-12-12 Thread Changwei Ge
On 2017/12/13 12:03, Joseph Qi wrote: > > > On 17/12/13 11:51, Changwei Ge wrote: >> Stack variable fe is no longer used, so trim it to save some cpu cycles >> and stack space. >> >> -BUG_ON(start_blk != ocfs2_clusters_to_blocks(bitmap_inode->i_sb, >

[Ocfs2-devel] [PATCH] ocfs2/dlm: get mle inuse only when it is initialized

2017-11-13 Thread Changwei Ge
When dlm_add_migration_mle returns -EEXIST, previously input mle will not be initialized. So we can't use its associated dlm object. And we truly don't need this mle for already launched migration progress, since oldmle has taken this role. Signed-off-by: Changwei Ge <ge.chang...@h3c.

[Ocfs2-devel] [PATCH v2] ocfs2/dlm: get mle inuse only when it is initialized

2017-11-14 Thread Changwei Ge
. Thanks, Changwei Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/dlm/dlmmaster.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c index 3e04279446e8..9c3e0f13ca87 100644 --- a/fs/ocfs2/dlm/dlmmaster.c ++

Re: [Ocfs2-devel] [PATCH] Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

2017-11-20 Thread Changwei Ge
Hi John, It's better to paste your patch directly into message body. It's easy for reviewing. So I copied your patch below: > The dw_zero_count tracking was assuming that w_unwritten_list would > always contain one element. The actual count is now tracked whenever > the list is extended. > ---

Re: [Ocfs2-devel] [PATCH] Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

2017-11-20 Thread Changwei Ge
On 2017/11/21 10:45, John Lightsey wrote: > On Tue, 2017-11-21 at 00:58 +0000, Changwei Ge wrote: >>> @@ -873,6 +875,7 @@ static int ocfs2_alloc_write_ctxt(struct >>> ocfs2_write_ctxt **wcp, >>> >>> ocfs2_init_dealloc_ctxt(&wc->w_dealloc); >>>

Re: [Ocfs2-devel] [RFC] make ocfs2/o2net reliable

2017-11-16 Thread Changwei Ge
Hi Yiwen, On 2017/11/17 11:06, jiangyiwen wrote: > On 2017/11/16 17:49, Changwei Ge wrote: >> Hi all, >> As far as we know, ocfs2/o2net is not a reliable message mechanism. >> Messages might get lost due to a sudden TCP socket connection shutdown. > Hi Changwei, >

Re: [Ocfs2-devel] [RFC] make ocfs2/o2net reliable

2017-11-16 Thread Changwei Ge
Hi Gang On 2017/11/17 10:24, Gang He wrote: > > > >> On 2017/11/16 18:05, Gang He wrote: >>> Hello Changwei, >>> >>> Base on your description, it looks make sense. >>> Since I uses fs/dlm kernel module, it looks stable. >>> Do you compare both dlm implementation? maybe can learn from each

Re: [Ocfs2-devel] [PATCH] ocfs2: The goto is not useful in the function ocfs2_reserve_cluster_bitmap_bits, so remove it.

2017-11-15 Thread Changwei Ge
Hi Zhonghua, On 2017/11/15 20:04, Guozhonghua wrote: > The goto is not useful anymore, removed from the context. Perhaps we can make this change-log more clear like: The bail declare is not necessary any more, so trim it. If code path falls into error branch, ocfs2_reserve_cluster_bitmap_bits

Re: [Ocfs2-devel] [RFC] make ocfs2/o2net reliable

2017-11-16 Thread Changwei Ge
Hi Wengang, Thanks for your comments and inspiration. On 2017/11/17 7:05, Wengang Wang wrote: > > > On 2017/11/16 1:49, Changwei Ge wrote: >> Hi all, >> As far as we know, ocfs2/o2net is not a reliable message mechanism. >> Messages might get lost due to a sudden TCP

Re: [Ocfs2-devel] [RFC] make ocfs2/o2net reliable

2017-11-16 Thread Changwei Ge
On 2017/11/16 18:05, Gang He wrote: > Hello Changwei, > > Base on your description, it looks make sense. > Since I uses fs/dlm kernel module, it looks stable. > Do you compare both dlm implementation? maybe can learn from each other. > > > Thanks > Gang Hi Gang, Actually , I have studied some

Re: [Ocfs2-devel] [RFC] make ocfs2/o2net reliable

2017-11-16 Thread Changwei Ge
On 2017/11/17 13:51, jiangyiwen wrote: > On 2017/11/17 11:53, Changwei Ge wrote: >> Hi Yiwen, >> >> On 2017/11/17 11:06, jiangyiwen wrote: >>> On 2017/11/16 17:49, Changwei Ge wrote: >>>> Hi all, >>>> As far as we know, ocfs2/o2net is not a re

Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-11-01 Thread Changwei Ge
, I suppose. Please take a review. Thanks, Changwei Subject: [PATCH] ocfs2/dlm: a node can't be involved in recovery if it is being shutdown Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/dlm/dlmdomain.c | 4 fs/ocfs2/dlm/dlmrecovery.c | 3 +++ 2 files chan

Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-11-01 Thread Changwei Ge
figure out a better way to solve this, as your patch can't clear DLM RECOVERING flag on lock resource. I am not sure if it is reasonable, I suppose this may violate ocfs2/dlm design philosophy. Thanks, Changwei > > thanks, > Jun > > On 2017/11/1 16:11, Changwei Ge wrote: >> H

Re: [Ocfs2-devel] [PATCH] ocfs2/cluster: unlock the o2hb_live_lock before the o2nm_depend_item()

2017-11-01 Thread Changwei Ge
Hi Alex, On 2017/11/1 15:05, alex chen wrote: > Hi Joseph and Changwei, > > It's our basic principle that the function in which may sleep can't be called > within spinlock hold. I suppose this principle is a suggestion not a restriction. > > On 2017/11/1 9:03, Joseph Qi wrote: >> Hi Alex, >>

Re: [Ocfs2-devel] [PATCH] ocfs2/cluster: unlock the o2hb_live_lock before the o2nm_depend_item()

2017-11-01 Thread Changwei Ge
Hi Alex, On 2017/11/1 17:52, alex chen wrote: > Hi Changwei, > > Thanks for you reply. > > On 2017/11/1 16:28, Changwei Ge wrote: >> Hi Alex, >> >> On 2017/11/1 15:05, alex chen wrote: >>> Hi Joseph and Changwei, >>> >>> It's ou

Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-11-01 Thread Changwei Ge
our patch. :-( So no DLM_FINALIZE_RECO_MSG message will be sent out to other nodes, thus DLM_LOCK_RES_RECOVERING can't be cleared. As I know, if DLM_LOCK_RES_RECOVERING is set, all lock and unlock requests will be *hang*. Thanks, Changwei > > thanks, > Jun > > On 2017/11/1 17

Re: [Ocfs2-devel] [PATCH] ocfs2/cluster: unlock the o2hb_live_lock before the o2nm_depend_item()

2017-11-02 Thread Changwei Ge
On 2017/11/2 11:45, alex chen wrote: > Hi Changwei, > > On 2017/11/1 17:59, Changwei Ge wrote: >> Hi Alex, >> >> On 2017/11/1 17:52, alex chen wrote: >>> Hi Changwei, >>> >>> Thanks for you reply. >>> >>> On 2017/11/1 16:28,

Re: [Ocfs2-devel] ocfs2-test result on 4.14.0-rc7-1.gdbf3e9b-vanilla kernel

2017-11-06 Thread Changwei Ge
Hi Eric, On 2017/11/7 10:33, Eric Ren wrote: > Hi, > > The testing result against the recent kernel looks good. The attachments are > overall results. If the detailed logs are needed, please let me know. > > Pattern Failed Passed Skipped Total >

Re: [Ocfs2-devel] [PATCH v2] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-11-02 Thread Changwei Ge
"dlm: allow dlm do recovery during shutdown") > > Signed-off-by: Jun Piao <piao...@huawei.com> > Reviewed-by: Alex Chen <alex.c...@huawei.com> > Reviewed-by: Yiwen Jiang <jiangyi...@huawei.com> > Acked-by: Changwei Ge <ge.chang...@h3c.com> Hi

Re: [Ocfs2-devel] [PATCH] ocfs2: fix a potential 'ABBA' deadlock caused by 'l_lock' and 'dentry_attach_lock'

2017-12-08 Thread Changwei Ge
On 2017/12/7 20:37, piaojun wrote: > Hi Changwei, > > On 2017/12/7 19:59, Changwei Ge wrote: >> Hi Jun, >> >> On 2017/12/7 19:30, piaojun wrote: >>> CPUA CPUB >>> >>> ocfs2_dentry_c

Re: [Ocfs2-devel] [PATCH] ocfs2: fix a potential 'ABBA' deadlock caused by 'l_lock' and 'dentry_attach_lock'

2017-12-07 Thread Changwei Ge
Hi Jun, On 2017/12/7 19:30, piaojun wrote: > CPUA CPUB > > ocfs2_dentry_convert_worker > get 'l_lock' This lock belongs to ocfs2_dentry_lock::ocfs2_lock_res::l_lock > > get 'dentry_attach_lock' > >

Re: [Ocfs2-devel] [PATCH] ocfs2: fix a potential 'ABBA' deadlock caused by 'l_lock' and 'dentry_attach_lock'

2017-12-08 Thread Changwei Ge
On 2017/12/8 18:17, piaojun wrote: > Hi Changwei, > > On 2017/12/8 17:09, Changwei Ge wrote: >> On 2017/12/7 20:37, piaojun wrote: >>> Hi Changwei, >>> >>> On 2017/12/7 19:59, Changwei Ge wrote: >>>> Hi Jun, >>>

Re: [Ocfs2-devel] [PATCH v2] ocfs2: check the metadate alloc before marking extent written

2017-12-11 Thread Changwei Ge
Hi Alex, On 2017/12/5 11:31, alex chen wrote: > We need to check the free number of the records in each loop to mark > extent written, because the last extent block may be changed through > many times marking extent written and the 'num_free_extents' also be > changed. In the worst case, the

Re: [Ocfs2-devel] [PATCH] ocfs2: the ip_alloc_sem should be taken in ocfs2_get_block()

2017-10-20 Thread Changwei Ge
ccess to extent tree with > ocfs2_dio_end_io_write(), which may cause BUGON in > ocfs2_get_clusters_nocache()->BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos)) > > Signed-off-by: Alex Chen <alex.c...@huawei.com> > Reviewed-by: Jun Piao <piao...@huawei.com> Acke

Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres

2017-10-31 Thread Changwei Ge
Hi Jun, Thanks for reporting. I am very interested in this issue. But, first of all, I want to make this issue clear, so that I might be able to provide some comments. On 2017/11/1 9:16, piaojun wrote: > wait for dlm recovery done when migrating all lockres in case of new > lockres to be left

Re: [Ocfs2-devel] [PATCH] ocfs2/cluster: unlock the o2hb_live_lock before the o2nm_depend_item()

2017-10-31 Thread Changwei Ge
On 2017/11/1 9:05, Joseph Qi wrote: > Hi Alex, > > On 17/10/31 20:41, alex chen wrote: >> In the following situation, the down_write() will be called under >> the spin_lock(), which may lead a soft lockup: >> o2hb_region_inc_user >> spin_lock(&o2hb_live_lock) >>o2hb_region_pin >>

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-14 Thread Changwei Ge
ber of bio we need before calling bio_add_page()? > > Thanks > Jun > > On 2018/5/14 11:21, Changwei Ge wrote: >> Hi Jun, >> >> Right now, I am afraid that the easiest and fastest way to fix this issue >> is to revert your patch. >> >> From comm

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-09 Thread Changwei Ge
Hi Jun, On 2018/5/9 16:50, piaojun wrote: > Hi Changwei, > > On 2018/5/8 23:57, Changwei Ge wrote: >> Hi Jun, >> >> Sorry for this so late reply since I was very busy those days. >> >> >> On 04/16/2018 11:44 AM, piaojun wrote: >>> Hi Chang

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-09 Thread Changwei Ge
at if slot number exceeds 16. Thanks, Changwei > > thanks, > Jun > > On 2018/5/9 17:06, Changwei Ge wrote: >> Hi Jun, >> >> >> On 2018/5/9 16:50, piaojun wrote: >>> Hi Changwei, >>> >>> On 2018/5/8 23:57, Changwei Ge

Re: [Ocfs2-devel] Reply: [PATCH] ocfs2: Correct the offset comments of the structure ocfs2_dir_block_trailer.

2018-05-08 Thread Changwei Ge
Hi Zhonghua and Joseph, I'm afraid that it can't be easy to find the offset 0x20 since ::db_signature only occupies 8 bytes. Thanks, Changwei On 2018/5/9 10:50, Guozhonghua wrote: > Good Idea, I will send patch v2 for review. > > Thanks. > > Guozhonghua. > > -Original Message- > From: Joseph Qi

Re: [Ocfs2-devel] [PATCH] ocfs2: don't use iocb when EIOCBQUEUED returns

2018-05-08 Thread Changwei Ge
figured out why iocb > was freed. Though your fix won't bring any side effect, it looks like a > workaround. > That means, the freed iocb may still have risk in other place. > > Thanks, > Joseph > > On 18/5/8 23:23, Changwei Ge wrote: >> Hi Gang, >> >> I don

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-09 Thread Changwei Ge
Hi Jun, On 2018/5/9 18:08, piaojun wrote: > Hi Changwei, > > On 2018/4/13 13:51, Changwei Ge wrote: >> If cluster scale exceeds 16 nodes, bio will be full and bio_add_page() >> returns 0 when adding pages to bio. Returning -EIO to o2hb_read_slots() >> from o2h

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-09 Thread Changwei Ge
On 2018/5/10 8:24, piaojun wrote: > > On 2018/5/9 20:01, Changwei Ge wrote: >> Hi Jun, >> >> >> On 2018/5/9 18:08, piaojun wrote: >>> Hi Changwei, >>> >>> On 2018/4/13 13:51, Changwei Ge wrote: >>>> If cluster scale exceeds 16

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-13 Thread Changwei Ge
value is zero or not. Thanks, Changwei On 2018/5/10 9:13, Changwei Ge wrote: > > > On 2018/5/10 8:24, piaojun wrote: >> >> On 2018/5/9 20:01, Changwei Ge wrote: >>> Hi Jun, >>> >>> >>> On 2018/5/9 18:08, piaojun wrote: >>

Re: [Ocfs2-devel] [PATCH] ocfs2: Correct the offset comments of the structure ocfs2_dir_block_trailer.

2018-05-08 Thread Changwei Ge
It looks good to me. On 2018/5/8 5:46 PM, Guozhonghua wrote: > Correct the offset comments of the structure ocfs2_dir_block_trailer. Reviewed-by: Changwei Ge <ge.chang...@h3c.com> > > Signed-off-by: guozhonghua <guozhong...@h3c.com> > --- > fs/ocfs2/ocfs2_fs.h |4

Re: [Ocfs2-devel] [PATCH] ocfs2: don't use iocb when EIOCBQUEUED returns

2018-05-08 Thread Changwei Ge
like a code bug, and 'iocb' should not be freed at this place. >>> Could this BUG be reproduced easily? >> Actually, it's not easy to reproduce since IO is much slower than the CPU >> executing instructions. But the logic here is broken, so we'd better fix this. >> >> Thanks

Re: [Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

2018-05-08 Thread Changwei Ge
ks, Changwei > > thanks, > Jun > > On 2018/4/13 13:51, Changwei Ge wrote: >> If cluster scale exceeds 16 nodes, bio will be full and bio_add_page() >> returns 0 when adding pages to bio. Returning -EIO to o2hb_read_slots() >> from o2hb_setup_one_bio() will l

Re: [Ocfs2-devel] [PATCH] fix a compiling warning

2018-05-08 Thread Changwei Ge
Hi Larry, It would be better for you to paste the compilation warning and gcc version, so we can judge whether it deserves to be fixed. Thanks, Changwei On 2018/5/7 11:06 AM, Larry Chen wrote: > Hi Jun > > Yeah, I know your logic is right. > It's just a compile warning that made me feel

Re: [Ocfs2-devel] [PATCH] ocfs2: don't put and assign null to bh allocated outside

2018-05-08 Thread Changwei Ge
Friendly ping after one month of silence on this patch. Andrew has picked this patch into the -mm tree. Can anyone help review my patch or give some comments? :-) Thanks, Changwei On 04/10/2018 07:35 PM, Changwei Ge wrote: > ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to r

Re: [Ocfs2-devel] [PATCH] ocfs2: the ip_alloc_sem should be taken in ocfs2_get_block()

2017-10-20 Thread Changwei Ge
Hi Alex, Are you able to provide a way to reproduce this issue? I'm very interested in it. Thanks, Changwei. On 2017/10/20 17:08, alex chen wrote: > The ip_alloc_sem should be taken in ocfs2_get_block() when reading file > in DIRECT mode to prevent concurrent access to extent tree with >

Re: [Ocfs2-devel] [PATCH] ocfs2: should wait dio before inode lock in ocfs2_setattr()

2017-10-27 Thread Changwei Ge
Hi Alex, Thanks for reporting. I probably get your point. You mean that a lock resource (say A) is used to protect metadata changes among nodes in the cluster. Unfortunately, it was marked as BLOCKED since it was granted an EX lock, and the lock can't be unblocked since it has more or

Re: [Ocfs2-devel] [PATCH] ocfs2: should wait dio before inode lock in ocfs2_setattr()

2017-10-28 Thread Changwei Ge
Hi Alex, On 2017/10/28 10:35, alex chen wrote: > Hi Changwei, > > Thanks for your reply. > > On 2017/10/27 18:21, Changwei Ge wrote: >> Hi Alex, >> >> Thanks for reporting. >> I probably get your point. You mean that for a lock resource (say A), it >>

Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()

2017-12-21 Thread Changwei Ge
On 2017/12/21 14:36, alex chen wrote: > Hi Joseph, > > On 2017/12/21 9:30, Joseph Qi wrote: >> Hi Alex, >> >> On 17/12/21 08:55, alex chen wrote: >>> In dlm_reset_mleres_owner(), we will lock >>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock, >>> which breaks the spinlock lock

Re: [Ocfs2-devel] [PATCH] ocfs2: don't merge rightmost extent block if it was locked

2017-12-24 Thread Changwei Ge
will post it. Thanks, Changwei On 2017/12/22 20:25, alex chen wrote: > Hi Changwei, > > On 2017/12/22 14:41, Changwei Ge wrote: >> A crash issue was reported by John. >> >> The call trace follows: >> ocfs2_split_extent+0x1ad3/0x1b40 [ocfs2] >> ocfs

[Ocfs2-devel] [PATCH RESEND v3 1/2] ocfs2: make metadata estimation accurate and clear

2018-01-08 Thread Changwei Ge
For Andrew's convenience, resend this patch. The current code assumes that ::w_unwritten_list always has only one item on it. This is not right and is hard to understand. So improve how unwritten items are counted. Reported-by: John Lightsey <j...@nixnuts.net> Signed-off-by: Changwei Ge <ge.chang..

[Ocfs2-devel] [PATCH RESEND v3 2/2] ocfs2: try to reuse extent block in dealloc without meta_alloc

2018-01-08 Thread Changwei Ge
ace of local slot at higher priority. Reported-by: John Lightsey <j...@nixnuts.net> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/alloc.c | 208 --- fs/ocfs2/alloc.h | 1 + fs/ocfs2/aops.c | 6 ++ 3 files changed,

Re: [Ocfs2-devel] [PATCH v2] ocfs2: try to reuse extent block in dealloc without meta_alloc

2018-01-07 Thread Changwei Ge
Hi John, Sorry for replying so late; I have been too busy these days. Thanks for your contribution to this issue. Thanks to the reproducer you provided, I have reproduced the crash issue you reported. The backtrace was found: ocfs2_mark_extent_written ocfs2_change_extent_flag

[Ocfs2-devel] [PATCH 1/2] ocfs2: make metadata estimation accurate and clear

2018-01-07 Thread Changwei Ge
The current code assumes that ::w_unwritten_list always has only one item on it. This is not right and is hard to understand. So improve how unwritten items are counted. Reported-by: John Lightsey <j...@nixnuts.net> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/aops.c | 4 +

[Ocfs2-devel] [PATCH 2/2 v3] ocfs2: try to reuse extent block in dealloc without

2018-01-07 Thread Changwei Ge
-by: John Lightsey <j...@nixnuts.net> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> --- fs/ocfs2/alloc.c | 208 --- fs/ocfs2/alloc.h | 1 + fs/ocfs2/aops.c | 6 ++ 3 files changed, 205 insertions(+), 10 deletions(-) dif

Re: [Ocfs2-devel] [PATCH v2 2/2] ocfs2: add trimfs lock to avoid duplicated trims in cluster

2018-01-10 Thread Changwei Ge
On 2018/1/11 11:33, Gang He wrote: > Hi Changwei, > > >> On 2018/1/11 10:07, Gang He wrote: >>> Hi Changwei, >>> >>> >> On 2018/1/10 18:14, Gang He wrote: > Hi Changwei, > > >> On 2018/1/10 17:05, Gang He wrote: >>> Hi Changwei, >>> >>>
