Hi,
We encountered a LIVELOCK problem while performing a test case of
high-concurrency read between different nodes.
It's very easy to reproduce this issue.
Just perform high-concurrency read operation against one single large
file (say named test_file) from one physical node,
meanwhile, perform
: Changwei Ge <ge.chang...@h3c.com>
Date: Wed, 11 Jan 2017 09:05:35 +0800
Subject: [PATCH] fix umount hang after journal flushing failure
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/journal.c | 18 ++
1 file changed, 18 insertions(+)
diff --git a/fs/ocfs2/jo
Hi Gang,
More log before crash is pasted.
Also, I gathered the contents of some structures, which may be useful for
analyzing this issue.
Paste them here too.
struct dio_submit {
bio = 0x88035690b800,
blkbits = 0x9,
blkfactor = 0x3,
start_zero_done = 0x1,
pages_in_io = 0x34,
Hi,
We encountered a crash issue days ago.
The call trace is as follows:
From the call trace, we can see that a direct read request caused this
crash issue, which triggered a BUG_ON check point.
With the help of debugfs.ocfs2 tool, I can see that clusters owned by
the target file are
Hi Joseph,
On 2017/8/10 17:53, Joseph Qi wrote:
> Hi Changwei,
>
> On 17/8/9 23:24, ge changwei wrote:
>> Hi
>>
>>
>> On 2017/8/9 7:32 PM, Joseph Qi wrote:
>>> Hi,
>>>
>>> On 17/8/7 15:13, Changwei Ge wrote:
>>>> Hi,
>
On 2017/8/8 4:20, Mark Fasheh wrote:
> On Mon, Aug 7, 2017 at 2:13 AM, Changwei Ge <ge.chang...@h3c.com> wrote:
>> Hi,
>>
>> In current code, while flushing AST, we don't handle an exception that
>> sending AST or BAST is failed.
>> But it is indeed possib
.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/dlm/dlmrecovery.c | 51
++--
fs/ocfs2/dlm/dlmthread.c | 39 +++--
2 files changed, 81 insertions(+), 9 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmrecovery.
> And the re-queuing AST or BAST will be dropped if the requesting node is
>> dead!
>>
>> It will improve the reliability a lot.
>>
>>
>> Thanks.
>>
>> Changwei.
>>
>> Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
>> ---
>> fs
On 2017/8/23 12:48, Gang He wrote:
>
>
>> On 17/8/23 10:23, Junxiao Bi wrote:
>>> On 08/10/2017 06:49 PM, Changwei Ge wrote:
>>>> Hi Joseph,
>>>>
>>>>
>>>> On 2017/8/10 17:53, Joseph Qi wrote:
>>
On 2017/9/19 14:47, Michael Ulbrich wrote:
> Hi Changwei,
>
> thanks for looking into this!
>
> On 19/09/17 05:32, Changwei Ge wrote:
>
>> Could you please also provide information about *slot_map*, just type
>> "slotmap" in debugfs.ocfs2 tool. Th
Hi Michael,
On 2017/9/18 23:45, Michael Ulbrich wrote:
> Hi again,
>
> chatting with a helpful person on #ocfs2 IRC channel this morning I got
> encouraged to cross-post to ocfs2-devel. For historic background and
> further details pls. see my two previous posts to ocfs2-users from last
> week
...@gmail.com>
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/dlm/dlmrecovery.c |1 +
1 file changed, 1 insertion(+)
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 74407c6..ec8f758 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dl
/23 10:23, Junxiao Bi wrote:
>>>> On 08/10/2017 06:49 PM, Changwei Ge wrote:
>>>>> Hi Joseph,
>>>>>
>>>>>
>>>>> On 2017/8/10 17:53, Joseph Qi wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>
Acked-by: Changwei Ge <ge.chang...@h3c.com>
On 2017/9/28 19:16, Guozhonghua wrote:
>
> Remove unused function ocfs2_publish_get_mount_state.
>
> Signed-off-by: guozhonghua <guozhong...@h3c.com>
> ---
> fs/ocfs2/super.h |3 ---
> 1 file changed, 3 deleti
Hi,
We'd better not add a white space at the tail of a line, so trim it!
Thanks,
Changwei
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/cluster/tcp.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
mode change 100644 => 100755 fs/ocfs2/cluster/tcp.c
d
On 2017/9/27 14:40, Gang He wrote:
> ** PRIVATE **
>
> Hello guys,
>
> As you know, some Linux distributions(e.g. SUSE Enterprise Linux 15) will
> introduce Python3 as the default,
> our Python scripts in ocfs2-test still use Python2, we will have to do proper
> modifications to migration to
Hi Andrew and Vitaly,
I do agree that patch ee8f7fcbe638 ("ocfs2/dlm: continue to purge
recovery lockres when recovery master goes down", 2016-08-02) introduced
an issue. It prevents DLM recovery from picking up a new master for an
existing lock resource whose owner died seconds ago.
But this patch
n provide some better
or more detailed clues.
Thanks,
Changwei.
>
> On 2017/10/17 14:48, Changwei Ge wrote:
>> When a node dies, other live nodes have to choose a new master
>> for an existed lock resource mastered by the dead node.
>>
>> As for ocfs2/dlm i
On 2017/10/18 7:21, Andrew Morton wrote:
> On Thu, 21 Sep 2017 02:09:33 + Zhangyang wrote:
>
>> In our test, we found that when the network is down, qs->qs_holds cannot be
>> reduced to zero, which prevents the node from fencing.
>>
>>
>>
>> o2net_idle_timer ->
This code snippet is no longer used, so trim it to keep the code neat.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/dlm/dlmmaster.c | 7 ---
1 file changed, 7 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 3e04279446e8..93b61c
Hi Alex,
I just reviewed your patch and a few questions came up.
On 2017/11/24 13:49, alex chen wrote:
> Hi John,
>
> I think a better method to solve this problem.
>
> On 2017/11/22 5:05, John Lightsey wrote:
>> On Tue, 2017-11-21 at 05:58 +, Changwei Ge w
It looks fine to me.
On 2017/11/29 16:39, Gang He wrote:
> Add ocfs2_overwrite_io function, which is used to judge if
> overwrite allocated blocks, otherwise, the write will bring extra
> block allocation overhead.
>
> Signed-off-by: Gang He <g...@suse.com>
Reviewed-by:
On 2017/11/29 16:38, Gang He wrote:
> Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which
> will be used in non-block IO scenarios.
>
> Signed-off-by: Gang He
> ---
> fs/ocfs2/dlmglue.c | 21 +
> fs/ocfs2/dlmglue.h | 4
> 2 files changed,
Hi Gang,
On 2017/11/30 10:45, Gang He wrote:
> Hello Changwei,
>
>
>> On 2017/11/29 16:38, Gang He wrote:
>>> Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which
>>> will be used in non-block IO scenarios.
>>>
>>> Signed-off-by: Gang He
>>> ---
>>>
I NACK this patch since I don't think it can solve the issue Jun
reported completely.
Thanks,
Changwei
On 2017/12/1 6:26, a...@linux-foundation.org wrote:
> From: piaojun
> Subject: ocfs2/dlm: wait for dlm recovery done when migrating all lockres
>
> wait for dlm
.ya...@h3c.com>
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/cluster/quorum.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/cluster/quorum.c b/fs/ocfs2/cluster/quorum.c
index 62e8ec619b4c..af2e7473956e 100644
--- a/fs/ocfs2/cluster/quoru
Acked-by: Changwei Ge <ge.chang...@h3c.com>
On 2017/12/1 6:25, a...@linux-foundation.org wrote:
> From: Gang He <g...@suse.com>
> Subject: ocfs2: remove ocfs2_is_o2cb_active()
>
> Remove ocfs2_is_o2cb_active(). We have similar functions to identify
> which cluster
On 2017/11/28 13:44, Gang He wrote:
> Hi Changwei,
>
>
>> Hi,
>> Gang
>>
>> On 2017/11/27 17:48, Gang He wrote:
>>> Add ocfs2_overwrite_io function, which is used to judge if
>>> overwrite allocated blocks, otherwise, the write will bring extra
>>> block allocation overhead.
>>>
>>
>> Can
Hi Gang,
On 2017/11/27 17:48, Gang He wrote:
> Add ocfs2_try_rw_lock and ocfs2_try_inode_lock functions, which
> will be used in non-block IO scenarios.
>
> Signed-off-by: Gang He
> ---
> fs/ocfs2/dlmglue.c | 22 ++
> fs/ocfs2/dlmglue.h | 4
> 2
On 2017/11/28 9:52, piaojun wrote:
> Hi Gang,
>
> If ocfs2_overwrite_io is only called in 'nowait' scenarios, I wonder if
> we can discard 'int wait' just as ext4 does:
>
> static bool ext4_overwrite_io(struct inode *inode, loff_t pos, loff_t len);
Yes, Jun has a point.
It seems that
Hi,
Gang
On 2017/11/27 17:48, Gang He wrote:
> Add ocfs2_overwrite_io function, which is used to judge if
> overwrite allocated blocks, otherwise, the write will bring extra
> block allocation overhead.
>
Can you elaborate how this overhead is introduced?
Forgive me, I can't figure it out.
Thanks,
It's odd that o2net_msg_handler::nh_func_data is declared as type
o2net_msg_handler_func*.
So neaten it.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/cluster/tcp_internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/cluster/tcp_interna
.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/aops.c | 44 ++--
1 file changed, 42 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index d151632..a982cf6 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/
On 2017/12/18 19:52, Joseph Qi wrote:
>
>
> On 17/12/18 18:22, alex chen wrote:
>> In dlm_reset_mleres_owner(), we will lock
>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>> which breaks the spinlock lock ordering:
>> dlm_domain_lock
>> struct dlm_ctxt->spinlock
>>
So introduce file hole check function back into ocfs2.
>> Once ocfs2 is doing dio upon a file hole with append-dio disabled, it will
>> fall
>> back to buffer IO to allocate clusters.
>>
>> Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
>> ---
>>fs
On 2017/12/19 11:41, piaojun wrote:
> Hi Changwei,
>
> On 2017/12/19 11:05, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2017/12/19 9:48, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2017/12/18 20:06, Changwei Ge wrote:
>>>> Befor
On 2017/12/13 15:25, alex chen wrote:
> Hi Changwei,
>
> On 2017/12/12 9:05, Changwei Ge wrote:
>> Hi Alex,
>>
>> On 2017/12/5 11:31, alex chen wrote:
>>> We need to check the free number of the records in each loop to mark
>>> extent written, because
On 2017/12/14 11:04, alex chen wrote:
> Hi Changwei,
>
> On 2017/12/14 8:56, Changwei Ge wrote:
>> On 2017/12/13 15:25, alex chen wrote:
>>> Hi Changwei,
>>>
>>> On 2017/12/12 9:05, Changwei Ge wrote:
>>>> Hi Alex,
>>>>
>>
Stack variable fe is no longer used, so trim it to save some cpu cycles
and stack space.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/suballoc.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 71f22c8fbffd..a74108
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/cluster/heartbeat.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/ocfs2/cluster/heartbeat.h b/fs/ocfs2/cluster/heartbeat.h
index 3ef5137dc362..a9e67efc0004 100644
--- a/fs/ocfs2/cluster/heartbeat.h
+++ b/fs/ocfs2/c
Stack variable fe is no longer used, so trim it to save some cpu cycles
and stack space.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/suballoc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
On 2017/12/19 9:24, Joseph Qi wrote:
>
> On 17/12/19 05:53, Andrew Morton wrote:
>> On Mon, 18 Dec 2017 12:06:21 +0000 Changwei Ge <ge.chang...@h3c.com> wrote:
>>
>>> Before ocfs2 supporting allocating clusters while doing append-dio, all
>>>
On 2017/12/19 9:39, Joseph Qi wrote:
>
>
> On 17/12/19 09:27, Andrew Morton wrote:
>> On Tue, 19 Dec 2017 09:24:17 +0800 Joseph Qi wrote:
>>
---
a/fs/ocfs2/aops.c~ocfs2-fall-back-to-buffer-io-when-append-dio-is-disabled-with-file-hole-existing-fix
+++
On 2017/12/19 5:54, Andrew Morton wrote:
> On Mon, 18 Dec 2017 12:06:21 +0000 Changwei Ge <ge.chang...@h3c.com> wrote:
>
>> Before ocfs2 supporting allocating clusters while doing append-dio, all
>> append
>> dio will fall back to buffer io to allocate clusters f
Hi Jun,
On 2017/12/19 9:48, piaojun wrote:
> Hi Changwei,
>
> On 2017/12/18 20:06, Changwei Ge wrote:
>> Before ocfs2 supporting allocating clusters while doing append-dio, all
>> append
>> dio will fall back to buffer io to allocate clusters firstly. Also, when
Hi Junxiao,
On 2017/12/19 16:15, Junxiao Bi wrote:
> Hi Changwei,
>
> On 12/19/2017 02:02 PM, Changwei Ge wrote:
>> On 2017/12/19 11:41, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2017/12/19 11:05, Changwei Ge wrote:
>>>> Hi Jun,
>>
; not
>>>> right, since whether append-io is enabled tells the capability whether
>>>> ocfs2
>>>> can
>>>> allocate space while doing dio.
>>>> So introduce file hole check function back into ocfs2.
>>>> Once ocfs2 is doi
On 2017/12/19 17:30, Junxiao Bi wrote:
> On 12/19/2017 05:11 PM, Changwei Ge wrote:
>> Hi Junxiao,
>>
>> On 2017/12/19 16:15, Junxiao Bi wrote:
>>> Hi Changwei,
>>>
>>> On 12/19/2017 02:02 PM, Changwei Ge wrote:
>>>> On 2017/12/19 11:41,
Yes, Jun has cleaned up those declarations.
I will rebase my tree.
Thanks,
Changwei
On 2017/12/13 12:06, Joseph Qi wrote:
> These have already been cleaned up in commit
> 98d6c09ec2899a9a601b16ec7ae31d54e6b100b9.
>
> On 17/12/13 11:19, Changwei Ge wrote:
>> Signed-off-by: Cha
Stack variable fe is no longer used, so trim it to save some CPU cycles
and stack space.
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/suballoc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
On 2017/12/13 12:03, Joseph Qi wrote:
>
>
> On 17/12/13 11:51, Changwei Ge wrote:
>> Stack variable fe is no longer used, so trim it to save some cpu cycles
>> and stack space.
>>
>> -BUG_ON(start_blk != ocfs2_clusters_to_blocks(bitmap_inode->i_sb,
>
When dlm_add_migration_mle returns -EEXIST, previously input mle will
not be initialized. So we can't use its associated dlm object.
And we truly don't need this mle for a migration that is already in
progress, since oldmle has taken this role.
Signed-off-by: Changwei Ge <ge.chang...@h3c.
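The rule in the changelog above (do not touch the new mle when the add call returns -EEXIST, and rely on oldmle instead) can be sketched in plain C. This is only an illustration under assumptions: struct mle, struct mle_table, add_migration_entry() and start_migration() are hypothetical stand-ins and not the real ocfs2/dlm structures or functions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real dlm structures are far richer. */
struct mle {
    int initialized;
    int owner;
};

struct mle_table {
    struct mle *existing; /* entry for a migration already in progress */
};

/*
 * Returns 0 after initializing *mle, or -EEXIST when an entry is already
 * registered. On -EEXIST, *mle is left untouched and *oldmle points at the
 * existing entry.
 */
static int add_migration_entry(struct mle_table *t, struct mle *mle,
                               struct mle **oldmle, int owner)
{
    if (t->existing) {
        *oldmle = t->existing;
        return -EEXIST;
    }
    mle->initialized = 1;
    mle->owner = owner;
    t->existing = mle;
    *oldmle = NULL;
    return 0;
}

/*
 * Caller side: on -EEXIST, never read fields of the uninitialized mle;
 * hand back the existing entry instead.
 */
static struct mle *start_migration(struct mle_table *t, struct mle *mle,
                                   int owner)
{
    struct mle *oldmle = NULL;
    int ret = add_migration_entry(t, mle, &oldmle, owner);

    if (ret == -EEXIST)
        return oldmle;
    return mle;
}
```

The point of the sketch is only the caller's branch: once -EEXIST comes back, every later access goes through the existing entry.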
.
Thanks,
Changwei
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/dlm/dlmmaster.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 3e04279446e8..9c3e0f13ca87 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
++
Hi John,
It's better to paste your patch directly into the message body; that makes
it easier to review.
So I copied your patch below:
> The dw_zero_count tracking was assuming that w_unwritten_list would
> always contain one element. The actual count is now tracked whenever
> the list is extended.
> ---
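The idea in the quoted changelog (count the unwritten items whenever the list grows, rather than assuming the list holds exactly one) can be sketched as follows. This is a minimal userspace illustration, not the actual patch: struct write_ctxt, struct unwritten_item and both helpers are hypothetical names, and the singly linked list merely plays the role of w_unwritten_list.

```c
#include <stddef.h>

/* Hypothetical stand-ins, not the real ocfs2 structures. */
struct unwritten_item {
    struct unwritten_item *next;
};

struct write_ctxt {
    struct unwritten_item *unwritten_head; /* plays the role of w_unwritten_list */
    unsigned int unwritten_count;          /* updated on every insertion */
};

static void write_ctxt_init(struct write_ctxt *wc)
{
    wc->unwritten_head = NULL;
    wc->unwritten_count = 0;
}

/*
 * Link the item in and bump the counter in the same place, so the count
 * never depends on an assumption about the list length.
 */
static void write_ctxt_add_unwritten(struct write_ctxt *wc,
                                     struct unwritten_item *item)
{
    item->next = wc->unwritten_head;
    wc->unwritten_head = item;
    wc->unwritten_count++;
}
```

Keeping the increment next to the insertion means any later consumer can read the count directly instead of inferring it.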
On 2017/11/21 10:45, John Lightsey wrote:
> On Tue, 2017-11-21 at 00:58 +0000, Changwei Ge wrote:
>>> @@ -873,6 +875,7 @@ static int ocfs2_alloc_write_ctxt(struct
>>> ocfs2_write_ctxt **wcp,
>>>
>>> ocfs2_init_dealloc_ctxt(&wc->w_dealloc);
>>>
Hi Yiwen,
On 2017/11/17 11:06, jiangyiwen wrote:
> On 2017/11/16 17:49, Changwei Ge wrote:
>> Hi all,
>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>> Messages might get lost due to a sudden TCP socket connection shutdown.
> Hi Changwei,
>
>
Hi Gang
On 2017/11/17 10:24, Gang He wrote:
>
>
>
>> On 2017/11/16 18:05, Gang He wrote:
>>> Hello Changwei,
>>>
>>> Base on your description, it looks make sense.
>>> Since I uses fs/dlm kernel module, it looks stable.
>>> Do you compare both dlm implementation? maybe can learn from each
Hi Zhonghua,
On 2017/11/15 20:04, Guozhonghua wrote:
> The goto is not useful anymore, removed from the context.
Perhaps we can make this changelog clearer, like:
The bail label declaration is no longer necessary, so trim it. If the code
path falls into the error branch, ocfs2_reserve_cluster_bitmap_bits
Hi Wengang,
Thanks for your comments and inspiration.
On 2017/11/17 7:05, Wengang Wang wrote:
>
>
> On 2017/11/16 1:49, Changwei Ge wrote:
>> Hi all,
>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>> Messages might get lost due to a sudden TCP
On 2017/11/16 18:05, Gang He wrote:
> Hello Changwei,
>
> Base on your description, it looks make sense.
> Since I uses fs/dlm kernel module, it looks stable.
> Do you compare both dlm implementation? maybe can learn from each other.
>
>
> Thanks
> Gang
Hi Gang,
Actually, I have studied some
On 2017/11/17 13:51, jiangyiwen wrote:
> On 2017/11/17 11:53, Changwei Ge wrote:
>> Hi Yiwen,
>>
>> On 2017/11/17 11:06, jiangyiwen wrote:
>>> On 2017/11/16 17:49, Changwei Ge wrote:
>>>> Hi all,
>>>> As far as we know, ocfs2/o2net is not a re
, I suppose.
Please take a review.
Thanks,
Changwei
Subject: [PATCH] ocfs2/dlm: a node can't be involved in recovery if it
is being shutdown
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/dlm/dlmdomain.c | 4
fs/ocfs2/dlm/dlmrecovery.c | 3 +++
2 files chan
figure out a better way to solve this, as your patch
can't clear DLM RECOVERING flag on lock resource. I am not sure if it is
reasonable, I suppose this may violate ocfs2/dlm design philosophy.
Thanks,
Changwei
>
> thanks,
> Jun
>
> On 2017/11/1 16:11, Changwei Ge wrote:
>> H
Hi Alex,
On 2017/11/1 15:05, alex chen wrote:
> Hi Joseph and Changwei,
>
> It's our basic principle that the function in which may sleep can't be called
> within spinlock hold.
I suppose this principle is a suggestion not a restriction.
>
> On 2017/11/1 9:03, Joseph Qi wrote:
>> Hi Alex,
>>
Hi Alex,
On 2017/11/1 17:52, alex chen wrote:
> Hi Changwei,
>
> Thanks for you reply.
>
> On 2017/11/1 16:28, Changwei Ge wrote:
>> Hi Alex,
>>
>> On 2017/11/1 15:05, alex chen wrote:
>>> Hi Joseph and Changwei,
>>>
>>> It's ou
our patch. :-(
So no DLM_FINALIZE_RECO_MSG message will be sent out to other nodes,
thus DLM_LOCK_RES_RECOVERING can't be cleared.
As far as I know, if DLM_LOCK_RES_RECOVERING is set, all lock and unlock
requests will *hang*.
Thanks,
Changwei
>
> thanks,
> Jun
>
> On 2017/11/1 17
On 2017/11/2 11:45, alex chen wrote:
> Hi Changwei,
>
> On 2017/11/1 17:59, Changwei Ge wrote:
>> Hi Alex,
>>
>> On 2017/11/1 17:52, alex chen wrote:
>>> Hi Changwei,
>>>
>>> Thanks for you reply.
>>>
>>> On 2017/11/1 16:28,
Hi Eric,
On 2017/11/7 10:33, Eric Ren wrote:
> Hi,
>
> The testing result against the recent kernel looks good. The attachments are
> overall results. If the detailed logs are needed, please let me know.
>
> Pattern Failed Passed Skipped Total
>
"dlm: allow dlm do recovery during shutdown")
>
> Signed-off-by: Jun Piao <piao...@huawei.com>
> Reviewed-by: Alex Chen <alex.c...@huawei.com>
> Reviewed-by: Yiwen Jiang <jiangyi...@huawei.com>
> Acked-by: Changwei Ge <ge.chang...@h3c.com>
Hi
On 2017/12/7 20:37, piaojun wrote:
> Hi Changwei,
>
> On 2017/12/7 19:59, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2017/12/7 19:30, piaojun wrote:
>>> CPUA CPUB
>>>
>>> ocfs2_dentry_c
Hi Jun,
On 2017/12/7 19:30, piaojun wrote:
> CPUA CPUB
>
> ocfs2_dentry_convert_worker
> get 'l_lock'
This lock belongs to ocfs2_dentry_lock::ocfs2_lock_res::l_lock
>
> get 'dentry_attach_lock'
>
>
On 2017/12/8 18:17, piaojun wrote:
> Hi Changwei,
>
> On 2017/12/8 17:09, Changwei Ge wrote:
>> On 2017/12/7 20:37, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2017/12/7 19:59, Changwei Ge wrote:
>>>> Hi Jun,
>>>
Hi Alex,
On 2017/12/5 11:31, alex chen wrote:
> We need to check the free number of the records in each loop to mark
> extent written, because the last extent block may be changed through
> many times marking extent written and the 'num_free_extents' also be
> changed. In the worst case, the
ccess to extent tree with
> ocfs2_dio_end_io_write(), which may cause BUGON in
> ocfs2_get_clusters_nocache()->BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos))
>
> Signed-off-by: Alex Chen <alex.c...@huawei.com>
> Reviewed-by: Jun Piao <piao...@huawei.com>
Acke
Hi Jun,
Thanks for reporting.
I am very interested in this issue. But, first of all, I want to make
this issue clear, so that I might be able to provide some comments.
On 2017/11/1 9:16, piaojun wrote:
> wait for dlm recovery done when migrating all lockres in case of new
> lockres to be left
On 2017/11/1 9:05, Joseph Qi wrote:
> Hi Alex,
>
> On 17/10/31 20:41, alex chen wrote:
>> In the following situation, the down_write() will be called under
>> the spin_lock(), which may lead a soft lockup:
>> o2hb_region_inc_user
>> spin_lock(&o2hb_live_lock)
>>o2hb_region_pin
>>
ber of bio we need before calling bio_add_page()?
>
> Thanks
> Jun
>
> On 2018/5/14 11:21, Changwei Ge wrote:
>> Hi Jun,
>>
>> Right now, I am afraid that the easiest and fastest way to fix this issue
>> is to revert your patch.
>>
>> From comm
Hi Jun,
On 2018/5/9 16:50, piaojun wrote:
> Hi Changwei,
>
> On 2018/5/8 23:57, Changwei Ge wrote:
>> Hi Jun,
>>
>> Sorry for this so late reply since I was very busy those days.
>>
>>
>> On 04/16/2018 11:44 AM, piaojun wrote:
>>> Hi Chang
at if slot number exceeds 16.
Thanks,
Changwei
>
> thanks,
> Jun
>
> On 2018/5/9 17:06, Changwei Ge wrote:
>> Hi Jun,
>>
>>
>> On 2018/5/9 16:50, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2018/5/8 23:57, Changwei Ge
Hi Zhonghua and Joseph,
I'm afraid that it can't be easy to find the offset 0x20 since
::db_signature only occupies 8 bytes.
Thanks,
Changwei
On 2018/5/9 10:50, Guozhonghua wrote:
> Good Idea, I will send patch v2 for review.
>
> Thanks.
>
> Guozhonghua.
>
> -Original Message-
> From: Joseph Qi
figured out why iocb
> was freed. Though you fix won't bring any side effect, it looks like a
> workaround.
> That means, the freed iocb may still have risk in other place.
>
> Thanks,
> Joseph
>
> On 18/5/8 23:23, Changwei Ge wrote:
>> Hi Gang,
>>
>> I don
Hi Jun,
On 2018/5/9 18:08, piaojun wrote:
> Hi Changwei,
>
> On 2018/4/13 13:51, Changwei Ge wrote:
>> If cluster scale exceeds 16 nodes, bio will be full and bio_add_page()
>> returns 0 when adding pages to bio. Returning -EIO to o2hb_read_slots()
>> from o2h
On 2018/5/10 8:24, piaojun wrote:
>
> On 2018/5/9 20:01, Changwei Ge wrote:
>> Hi Jun,
>>
>>
>> On 2018/5/9 18:08, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2018/4/13 13:51, Changwei Ge wrote:
>>>> If cluster scale exceeds 16
value is zero or not.
Thanks,
Changwei
On 2018/5/10 9:13, Changwei Ge wrote:
>
>
> On 2018/5/10 8:24, piaojun wrote:
>>
>> On 2018/5/9 20:01, Changwei Ge wrote:
>>> Hi Jun,
>>>
>>>
>>> On 2018/5/9 18:08, piaojun wrote:
>>
It looks good to me.
On 2018/5/8 5:46 PM, Guozhonghua wrote:
> Correct the offset comments of the structure ocfs2_dir_block_trailer.
Reviewed-by: Changwei Ge <ge.chang...@h3c.com>
>
> Signed-off-by: guozhonghua <guozhong...@h3c.com>
> ---
> fs/ocfs2/ocfs2_fs.h |4
like a code bug, and 'iocb' should not be freed at this place.
>>> Could this BUG reproduced easily?
>> Actually, it's not easy to be reproduced since IO is much slower than CPU
>> executing instructions. But the logic here is broken, we'd better fix this.
>>
>> Thanks
ks,
Changwei
>
> thanks,
> Jun
>
> On 2018/4/13 13:51, Changwei Ge wrote:
>> If cluster scale exceeds 16 nodes, bio will be full and bio_add_page()
>> returns 0 when adding pages to bio. Returning -EIO to o2hb_read_slots()
>> from o2hb_setup_one_bio() will l
Hi Larry,
It would be better if you pasted the compilation warning and the gcc version,
so we can judge whether it deserves to be fixed.
Thanks,
Changwei
On 2018/5/7 11:06 AM, Larry Chen wrote:
> Hi Jun
>
> Yeah,I know your logic is right.
> It's just a compile warning that made me feel
Friendly ping after one month's silence for this patch.
Andrew had picked this patch into -mm tree.
Can anyone help review my patch or give some comments? :-)
Thanks,
Changwei
On 04/10/2018 07:35 PM, Changwei Ge wrote:
> ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to r
Hi Alex,
Are you able to provide a way to reproduce this issue?
I'm very interested in it.
Thanks,
Changwei.
On 2017/10/20 17:08, alex chen wrote:
> The ip_alloc_sem should be taken in ocfs2_get_block() when reading file
> in DIRECT mode to prevent concurrent access to extent tree with
>
Hi Alex,
Thanks for reporting.
I probably get your point. You mean that for a lock resource(say A), it
is used to protect metadata changing among nodes in cluster.
Unfortunately, it was marked as BLOCKED since it was granted with an EX
lock, and the lock can't be unblocked since it has more or
Hi Alex,
On 2017/10/28 10:35, alex chen wrote:
> Hi Changwei,
>
> Thanks for you reply.
>
> On 2017/10/27 18:21, Changwei Ge wrote:
>> Hi Alex,
>>
>> Thanks for reporting.
>> I probably get your point. You mean that for a lock resource(say A), it
>>
On 2017/12/21 14:36, alex chen wrote:
> Hi Joseph,
>
> On 2017/12/21 9:30, Joseph Qi wrote:
>> Hi Alex,
>>
>> On 17/12/21 08:55, alex chen wrote:
>>> In dlm_reset_mleres_owner(), we will lock
>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>> which breaks the spinlock lock
will
post it.
Thanks,
Changwei
On 2017/12/22 20:25, alex chen wrote:
> Hi Changwei,
>
> On 2017/12/22 14:41, Changwei Ge wrote:
>> A crash issue was reported by John.
>>
>> The call trace follows:
>> ocfs2_split_extent+0x1ad3/0x1b40 [ocfs2]
>> ocfs
For Andrew's convenience, resend this patch
The current code assumes that ::w_unwritten_list always has exactly one item
on it. This is wrong and hard to understand.
So improve how unwritten items are counted.
Reported-by: John Lightsey <j...@nixnuts.net>
Signed-off-by: Changwei Ge <ge.chang..
ace of local slot
at higher priority.
Reported-by: John Lightsey <j...@nixnuts.net>
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/alloc.c | 208 ---
fs/ocfs2/alloc.h | 1 +
fs/ocfs2/aops.c | 6 ++
3 files changed,
Hi John,
Sorry for replying so late; I have been too busy these days.
Thanks for your contribution to this issue.
Thanks to the reproducer you provided, I have reproduced the crash issue you
reported.
The back trace was found.
ocfs2_mark_extent_written
ocfs2_change_extent_flag
The current code assumes that ::w_unwritten_list always has exactly one item
on it. This is wrong and hard to understand.
So improve how unwritten items are counted.
Reported-by: John Lightsey <j...@nixnuts.net>
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/aops.c | 4 +
-by: John Lightsey <j...@nixnuts.net>
Signed-off-by: Changwei Ge <ge.chang...@h3c.com>
---
fs/ocfs2/alloc.c | 208 ---
fs/ocfs2/alloc.h | 1 +
fs/ocfs2/aops.c | 6 ++
3 files changed, 205 insertions(+), 10 deletions(-)
dif
On 2018/1/11 11:33, Gang He wrote:
> Hi Changwei,
>
>
>> On 2018/1/11 10:07, Gang He wrote:
>>> Hi Changwei,
>>>
>>>
>>
On 2018/1/10 18:14, Gang He wrote:
> Hi Changwei,
>
>
>> On 2018/1/10 17:05, Gang He wrote:
>>> Hi Changwei,
>>>
>>>