Re: [Ocfs2-devel] Any other reading about DLM?

2015-10-21 Thread Eric Ren
Hi Tariq and Joseph, Thanks very much for the information! Eric On 10/22/15 04:09, Tariq Saeed wrote: > Documentation/filesystems/dlmfs.txt is a concise description of how to use > dlmfs > without the API, using only fs open/close/remove calls. > Regards > -Tariq > On 10/19/2015
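
As a rough illustration of the open/close usage Tariq mentions, the sketch below maps a lock request onto open(2) flags the way dlmfs.txt describes them (O_RDONLY for a shared read lock, O_RDWR for an exclusive lock, O_NONBLOCK for a trylock). The helper names, the use of O_CREAT, and the example path are assumptions made for illustration, not part of dlmfs itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Map a lock request to open(2) flags:
 *   O_RDONLY   -> shared read (PR) lock
 *   O_RDWR     -> exclusive (EX) lock
 *   O_NONBLOCK -> trylock instead of blocking until granted
 * The helper name is hypothetical.
 */
int dlmfs_open_flags(bool exclusive, bool trylock)
{
	int flags = exclusive ? O_RDWR : O_RDONLY;

	if (trylock)
		flags |= O_NONBLOCK;
	return flags;
}

/*
 * Take a lock by opening its file; returns the fd, or -1 with errno set.
 * ASSUMPTION: lock_path is inside a domain directory of a mounted dlmfs
 * (for example /dlm/mydomain/mylock), and O_CREAT is used so that the
 * lock file is created if it does not exist yet.
 */
int dlmfs_lock(const char *lock_path, bool exclusive, bool trylock)
{
	assert(lock_path != NULL);
	return open(lock_path, dlmfs_open_flags(exclusive, trylock) | O_CREAT,
		    0600);
}

/* Drop the lock by closing the descriptor returned by dlmfs_lock(). */
int dlmfs_unlock(int fd)
{
	return close(fd);
}
```

A caller would pair dlmfs_lock() with dlmfs_unlock(); removing the lock resource afterwards would be a plain unlink(2) on the same path.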

Re: [Ocfs2-devel] ocfs2-test project compile error

2015-11-09 Thread Eric Ren
Hi, I remember it was solved by installing libaio-devel and libaio, which darwin has installed. And according to this man page http://man7.org/linux/man-pages/man2/io_submit.2.html , "io_submit" is provided by libaio. That's all I can think of for now, sorry. Thanks,

Re: [Ocfs2-devel] Long io response time doubt

2015-11-12 Thread Eric Ren
Hi Joseph, On 11/12/15 16:00, Joseph Qi wrote: > On 2015/11/12 15:23, Eric Ren wrote: >> Hi Joseph, >> >> Thanks for your reply! There're more details I'd like to ask about ;-) >> >> On 11/12/15 11:05, Joseph Qi wrote: >>> Hi Eric, >>> You re

Re: [Ocfs2-devel] Long io response time doubt

2015-11-11 Thread Eric Ren
Hi Joseph, Thanks for your reply! There're more details I'd like to ask about ;-) On 11/12/15 11:05, Joseph Qi wrote: > Hi Eric, > You reported an issue about sometime io response time may be long. > > From your test case information, I think it was caused by downconvert. From what I learned

Re: [Ocfs2-devel] Long io response time doubt

2015-11-13 Thread Eric Ren
Hi Joseph, > >> 2. ocfs2cmt does periodically commit. > >> > >> One case can lead to long time downconvert is, it is indeed that it has > >> too much work to do. I am not sure if there are any other cases or code > >> bug. > > OK, not familiar with ocfs2cmt. Could I bother you to explain what

[Ocfs2-devel] [PATCH] dlm: make dlm_posix_lock comply with posix file lock semanteme

2015-10-14 Thread Eric Ren
t_killable with wait_event_interruptible can fix this issue. Signed-off-by: Eric Ren <z...@suse.com> Acked-by: David Teigland <teigl...@redhat.com> --- fs/dlm/plock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 5532f09..88f1036 10064

[Ocfs2-devel] Any other reading about DLM?

2015-10-19 Thread Eric Ren
Hi Joseph and all, In order to learn about DLM, I got [1][2][3] besides source code in fs/dlm and fs/ocfs2/dlm. Among them, [1][2] are for fs/dlm, and [3] is for fs/ocfs2/dlm. But most parts of [3] is not complete yet. [1] http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf

[Ocfs2-devel] OCFS2 test project current status ANNOUNCEMENT

2015-10-08 Thread Eric Ren
Hi all, We have collected 55 patches into Mark's GitHub repo[1] since Mar 16, 2012. [1] https://github.com/markfasheh/ocfs2-test Thanks a lot to the contributors: Junxiao Bi: 23, Jeff Liu: 11, Tiger Yang: 12, Goldwynr: 5, Eric Ren: 3, Gang He: 1. These patches

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix deadlock due to nested lock

2015-12-07 Thread Eric Ren
Hi junxiao, On Tue, Dec 08, 2015 at 10:41:03AM +0800, Junxiao Bi wrote: > Hi Eric, > > On 12/07/2015 05:01 PM, Eric Ren wrote: > > Hi Junxiao, > > > > On Mon, Dec 07, 2015 at 02:44:21PM +0800, Junxiao Bi wrote: > >> Hi Eric, > >> > >> On

Re: [Ocfs2-devel] Buffer read will get starvation in case reading/writing the same file from different nodes concurrently

2015-12-08 Thread Eric Ren
Hi Joseph, On Tue, Dec 08, 2015 at 02:15:17PM +0800, joseph wrote: > On 2015/12/8 12:51, Eric Ren wrote: > > Hi, > > > > On Tue, Dec 08, 2015 at 11:55:18AM +0800, joseph wrote: > >> Hi Gang, > >> Eric and I have discussed this case before. > >>

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix deadlock due to nested lock

2015-12-04 Thread Eric Ren
Hi Junxiao, The patch is likely unfair to the blocked lock on the remote node (node Y in your case). The original code lets the second request go only if it's compatible with the predicted level we would downconvert to for node Y. Considering a more extreme situation, there're more acquiring from node

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix deadlock due to nested lock

2015-12-07 Thread Eric Ren
Hi Junxiao, On Mon, Dec 07, 2015 at 02:44:21PM +0800, Junxiao Bi wrote: > Hi Eric, > > On 12/04/2015 06:07 PM, Eric Ren wrote: > > Hi Junxiao, > > > > The patch is likely unfair to the blocked lock on remote node(node Y in > > your case). The original code

Re: [Ocfs2-devel] Buffer read will get starvation in case reading/writing the same file from different nodes concurrently

2015-12-07 Thread Eric Ren
Hi, On Tue, Dec 08, 2015 at 11:55:18AM +0800, joseph wrote: > Hi Gang, > Eric and I have discussed this case before. > Using NONBLOCK here is because there is a lock inversion between inode > lock and page lock. You can refer to the comments of > ocfs2_inode_lock_with_page for details. >

Re: [Ocfs2-devel] Buffer read will get starvation in case reading/writing the same file from different nodes concurrently

2015-12-09 Thread Eric Ren
Hi Joseph, On Wed, Dec 09, 2015 at 02:07:56PM +0800, joseph wrote: > On 2015/12/8 16:26, Eric Ren wrote: > > Hi Joseph, > > On Tue, Dec 08, 2015 at 02:15:17PM +0800, joseph wrote: > >> On 2015/12/8 12:51, Eric Ren wrote: > >>> Hi, > >>> >

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix recursive locking deadlock

2015-12-14 Thread Eric Ren
Hi, On Mon, Dec 14, 2015 at 05:02:26PM +0800, Junxiao Bi wrote: > On 12/14/2015 04:44 PM, Eric Ren wrote: > > Hi Junxiao, > > > > On Mon, Dec 14, 2015 at 09:57:38AM +0800, Junxiao Bi wrote: > >> The following locking order can cause a deadlock. > >> Proce

Re: [Ocfs2-devel] The root cause analysis about buffer read getting starvation

2015-12-18 Thread Eric Ren
Hi all, On Thu, Dec 17, 2015 at 08:08:42AM -0700, He Gang wrote: > Hello Mark and all, > In the past days, I and Eric were looking at a customer issue, the customer > is complaining that buffer reading sometimes lasts too much time ( 1 - 10 > seconds) in case reading/writing the same file from

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix recursive locking deadlock

2015-12-14 Thread Eric Ren
Hi Junxiao, On Mon, Dec 14, 2015 at 09:57:38AM +0800, Junxiao Bi wrote: > The following locking order can cause a deadlock. > Process A on Node X: Process B on Node Y: > lock_XYZ(PR) > lock_XYZ(EX) > lock_XYZ(PR) >>> blocked forever by

Re: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix recursive locking deadlock

2015-12-14 Thread Eric Ren
Hi, On Mon, Dec 14, 2015 at 02:03:17PM +0800, Junxiao Bi wrote: > On 12/14/2015 01:39 PM, Gang He wrote: > > Hello Junxiao, > > > > From the initial description, the second lock_XYZ(PR) should be blocked, > > since DLM have a fair queue mechanism, otherwise, it looks to bring a > > write
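
To make the fairness argument in this thread concrete, here is a small user-space model of the lock-mode rules being discussed: PR is compatible with PR, EX is compatible with nothing but NL, and a new local request may proceed only if it does not exceed the highest level still compatible with a blocked remote request. The enum and function names are hypothetical; the logic is only a sketch of the kind of check dlmglue performs, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified lock modes, ordered from weakest to strongest. */
enum lock_mode { LOCK_NL = 0, LOCK_PR = 1, LOCK_EX = 2 };

/* Two modes can be held at the same time only if they are compatible. */
bool modes_compatible(enum lock_mode a, enum lock_mode b)
{
	if (a == LOCK_NL || b == LOCK_NL)
		return true;
	return a == LOCK_PR && b == LOCK_PR;
}

/*
 * Highest mode this node may keep while a remote request at level
 * 'blocking' is waiting: EX forces a drop to NL, PR allows PR.
 */
enum lock_mode highest_compat_mode(enum lock_mode blocking)
{
	assert(blocking >= LOCK_NL && blocking <= LOCK_EX);
	if (blocking == LOCK_EX)
		return LOCK_NL;
	if (blocking == LOCK_PR)
		return LOCK_PR;
	return LOCK_EX;
}

/*
 * Fair-queue check: a new local request may go ahead only if it does not
 * exceed what the blocked remote request can tolerate.
 */
bool may_continue_on_blocked_lock(enum lock_mode wanted,
				  enum lock_mode blocking)
{
	return wanted <= highest_compat_mode(blocking);
}
```

In the scenario quoted above, node Y is waiting for EX, so the second lock_XYZ(PR) on node X fails this check and has to wait. Since the first PR is still held by the same process, that wait is what turns into the deadlock.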

Re: [Ocfs2-devel] [PATCH] ocfs2: Do not lock/unlock() inode DLM lock

2016-01-06 Thread Eric Ren
Hi, On Tue, Dec 29, 2015 at 07:31:16PM -0700, He Gang wrote: > Hello Goldwyn, > > When read path can't get the DLM lock immediately (NONBLOCK way), next get > the lock with BLOCK way, this behavior will cost some time (several msecs). > It looks make sense to delete that two line code. > But

Re: [Ocfs2-devel] The root cause analysis about buffer read getting starvation

2015-12-25 Thread Eric Ren
Hi Mark and Gang, > In the past days, I and Eric were looking at a customer issue, the customer > is complaining that buffer reading sometimes lasts too much time ( 1 - 10 > seconds) in case reading/writing the same file from different nodes > concurrently, some day ago I sent a mail to the

Re: [Ocfs2-devel] ocfs2-test for v4.3 done

2015-12-21 Thread Eric Ren
Hi Junxiao, Thanks for your sharing, and very appreciated your efforts. On Mon, Dec 21, 2015 at 11:00:59AM +0800, Junxiao Bi wrote: > Hi, > > I have run a full ocfs2-test(single/multiple/discontig) to v4.3 mainline > kernel. The following three issues are found. The first two are > regression

Re: [Ocfs2-devel] ocfs2-test for v4.3 done

2015-12-21 Thread Eric Ren
Hi, On Tue, Dec 22, 2015 at 10:34:20AM +0800, Junxiao Bi wrote: > On 12/21/2015 09:50 PM, Eric Ren wrote: > > Hi Junxiao, > > > > Thanks for your sharing, and very appreciated your efforts. > > > > On Mon, Dec 21, 2015 at 11:00:59AM +0800, Junxiao Bi wrot

Re: [Ocfs2-devel] The root cause analysis about buffer read getting starvation

2015-12-21 Thread Eric Ren
Hello Mark, ...snip.. > > SLES10 with kernel version about 2.6.16.x, used blocking way, i.e. > > down_read(), which has the > > potential deadlock between page lock / ip_alloc_sem when one node get the > > cluster lock and > > does writing and reading on same file on it. This deadlock was fixed

Re: [Ocfs2-devel] Long io response time doubt

2015-11-25 Thread Eric Ren
nconvert. > The code you paste is calling into fs/dlm which I am not familiar with:( > I think you can list your questions and send to cluster-devel. > > Thanks, > Joseph > > On 2015/11/24 18:05, Eric Ren wrote: >> Sorry, forget to add the pieces of code flow... >>

Re: [Ocfs2-devel] [QUESTION] what's lockres P000000000000000000000000000000?

2015-11-30 Thread Eric Ren
Hi Junxiao, On 11/30/2015 09:13 PM, Junxiao Bi wrote: > Hi Eric, > > It’s orphan scan lock. For more lock type and format, please refer to > fs/ocfs2/ocfs2_lockid.h Great! Thanks~ Eric > Thanks, > Junxiao. >> On Nov 30, 2015, at 7:35 PM, Eric Ren <z...@suse.com> wrote

[Ocfs2-devel] [QUESTION] what's lockres P000000000000000000000000000000?

2015-11-30 Thread Eric Ren
Hi all, I found there is an odd lockres P00 for me, so I'm curious about what this lockres stands for? 1. #dmesg ... [ 7657.554057] (ocfs2dc,4061,1):ocfs2_process_blocked_lock:3904 lockres P00 blocked [ 7657.554059]

Re: [Ocfs2-devel] v4.3 kernel panic when run ocfs2-test single test

2015-11-18 Thread Eric Ren
Hi Junxiao, On 11/18/15 13:51, Junxiao Bi wrote: > Hi, > > The following kernel panic was saw when run ocfs2-test single test for > v4.3 kernel. Anybody ever saw this? I have not tested ocfs2 with new kernel yet. Which test case crashed, mmaptruncate? Could you please use kdump & crash tools to

Re: [Ocfs2-devel] v4.3 kernel panic when run ocfs2-test single test

2015-11-18 Thread Eric Ren
Hi Junxiao, On 11/19/15 12:44, Junxiao Bi wrote: > Hi Eric, > > On 11/18/2015 05:51 PM, Eric Ren wrote: >> Hi Junxiao, >> >> On 11/18/15 13:51, Junxiao Bi wrote: >>> Hi, >>> >>> The following kernel panic was saw when run ocfs2-test single

Re: [Ocfs2-devel] Long io response time doubt

2015-11-24 Thread Eric Ren
1.272 us| lockres_set_flags(); 0) 1.757 us|} 0) + 13.396 us | } 0) + 13.983 us |} 0) |dlm_put_lkb() { 0) 0.136 us| __put_lkb(); 0) 0.641 us|} 0) + 17.224 us | } Thank, Eric On 11/24/15 18:02, Eric Ren wrote: Hi Joseph, I use

Re: [Ocfs2-devel] Long io response time doubt

2015-11-24 Thread Eric Ren
Hi Joseph, I use ftrace's function tracer to record some code flow. One thing confuses me: why is ocfs2_cancel_convert() called here in the ocfs2dc thread? In other words, what do we expect it to do here? ocfs2_unblock_lock(){ ... if(lockres->l_flags &

[Ocfs2-devel] [PATCH] ocfs2: retry on ENOSPC if sufficient space in truncate log

2016-06-22 Thread Eric Ren
g when ENOSPC is returned. And we cannot reuse the deleted blocks before the transaction committed. Fortunately, we already have a function to do this - ocfs2_try_to_free_truncate_log(). Just need to remove the "static" modifier and put it into a right place. Signed-off-by: Eric

Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert

2016-01-29 Thread Eric Ren
Hello jiufei, On Wed, Jan 27, 2016 at 05:52:05PM +0800, xuejiufei wrote: > We have found a bug when two nodes doing umount one after another. > 1) Node 1 migrate a lockres that has 3 locks in grant queue such as > N2(PR)<->N3(NL)<->N4(PR) to N2. After migration, lvb of the lock N3(NL) > and

Re: [Ocfs2-devel] ocfs2-test for v4.3 done

2016-02-23 Thread Eric Ren
Hi Junxiao, On 02/24/2016 09:48 AM, Junxiao Bi wrote: > Hi Eric, > > On 02/19/2016 11:01 AM, Eric Ren wrote: >> Hi Junxiao, >> >> On Wed, Feb 17, 2016 at 10:15:56AM +0800, Junxiao Bi wrote: >>> Hi Eric, >>> >>> I remember i described it befo

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Eric Ren
( at least 4.4), right? I have found this patch in the mailing list and it looks good! I'd like to test it right now and give feedback! Thanks again, Eric > > Thanks, > Junxiao. > On 01/20/2016 12:46 AM, Eric Ren wrote: > > This problem was introduced by commit >

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Eric Ren
Hi all, On Thu, Jan 21, 2016 at 03:05:58PM -0800, Andrew Morton wrote: > On Thu, 21 Jan 2016 16:18:38 +0800 Junxiao Bi <junxiao...@oracle.com> wrote: > > > On 01/21/2016 04:10 PM, Eric Ren wrote: > > > Hi Junxiao, > > > > > > On Thu, Jan 2

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-19 Thread Eric Ren
ng cleared and OCFS2_LOCK_BUSY being set, we should do things like that. But is there any chance that both OCFS2_LOCK_BUSY and OCFS2_LOCK_BLOCKED are set at the same time? If not, I prefer this one. What do you think? Any comment would be appreciated. Thanks, Eric On Wed, Jan 20, 2016 at 12:46:53AM +0800

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-24 Thread Eric Ren
On Wed, Jan 20, 2016 at 11:13:35AM +0800, Junxiao Bi wrote: > This message is sent to master node when non-master nodes's > negotiate timer expired. Master node records these nodes in > a bitmap which is used to do write timeout timer re-queue > decision. > > Signed-off-by: Junxiao Bi

Re: [Ocfs2-devel] [PATCH 4/6] ocfs2: o2hb: add some user/debug log

2016-01-24 Thread Eric Ren
Hi Junxiao, On Wed, Jan 20, 2016 at 11:13:37AM +0800, Junxiao Bi wrote: > Signed-off-by: Junxiao Bi > Reviewed-by: Ryan Ding > --- > fs/ocfs2/cluster/heartbeat.c | 39 --- > 1 file changed, 32 insertions(+), 7

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-24 Thread Eric Ren
On Mon, Jan 25, 2016 at 12:28:08PM +0800, Junxiao Bi wrote: > On 01/25/2016 11:18 AM, Eric Ren wrote: > >> > >> > @@ -2039,13 +2086,30 @@ static struct config_item > >> > *o2hb_heartbeat_group_make_item(struct config_group *g > >> > > >

Re: [Ocfs2-devel] [PATCH 4/6] ocfs2: o2hb: add some user/debug log

2016-01-24 Thread Eric Ren
Hi Junxiao, On Mon, Jan 25, 2016 at 12:29:05PM +0800, Junxiao Bi wrote: > On 01/25/2016 11:28 AM, Eric Ren wrote: > >> @@ -449,7 +470,11 @@ static int o2hb_nego_timeout_handler(struct o2net_msg > >> *msg, u32 len, void *data, > >> > static int o2hb_nego_appr

[Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-19 Thread Eric Ren
dc thread requeue=yes R1(clear OCFS2_LOCK_UPCONVERT_FINISHING,wait) R2(wait) ... dlmglue deadlock until dc thread woken up by others This fix is to clear OCFS2_LOCK_UPCONVERT_FINISHING only after OCFS2_LOCK_BUSY has been cleared and every waiter has been looped. Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/d

Re: [Ocfs2-devel] ocfs2-test for v4.3 done

2016-02-16 Thread Eric Ren
Hi Junxiao, > >> I have setup a test env to build and auto do ocfs2 test. With it, Ocfs2 > >> for mainline and linux-next will be test regularly, the test status and > >> bugs will be reported to ocfs2-devel. Feel free to take any bug if you > >> are interested, it will be a good start point with

Re: [Ocfs2-devel] ocfs2-test for v4.3 done

2016-02-18 Thread Eric Ren
ven if we've built kernel RPM and installed it. Did you have this problem? Any suggestion;-) What I can think of is to try opensuse tumbleweed distribution(a rolling release). > > On 02/16/2016 05:54 PM, Eric Ren wrote: > > Hi Junxiao, > > > >> Four vm are used,

Re: [Ocfs2-devel] [PATCH v4 2/5] ocfs2: sysfile interfaces for online file check

2016-03-08 Thread Eric Ren
On 02/29/2016 01:17 PM, Gang He wrote: > Implement online file check sysfile interfaces, e.g. > how to create the related sysfile according to device name, > how to display/handle file check request from the sysfile. > > Signed-off-by: Gang He <g...@suse.com> Tested-by: Er

Re: [Ocfs2-devel] [PATCH v4 4/5] ocfs2: check/fix inode block for online file check

2016-03-08 Thread Eric Ren
On 02/29/2016 01:18 PM, Gang He wrote: > Implement online check or fix inode block during > reading a inode block to memory. > > Signed-off-by: Gang He <g...@suse.com> Tested-by: Eric Ren <z...@suse.com> > --- >

Re: [Ocfs2-devel] Welcome to join in: an ocfs2 IRC channel is setup for quick communications

2016-04-04 Thread Eric Ren
Hello Goldwyn, On 04/04/2016 07:39 PM, Goldwyn Rodrigues wrote: > Hello Eric, > > On 04/01/2016 08:50 PM, Eric Ren wrote: >> Hello Mark and all, >> >> Last week, I proposed to have an IRC channel for OCFS2 in tools maillist >> [1]. I'm afraid most p

[Ocfs2-devel] Welcome to join in: an ocfs2 IRC channel is setup for quick communications

2016-04-01 Thread Eric Ren
Hello Mark and all, Last week, I proposed to have an IRC channel for OCFS2 in tools maillist [1]. I'm afraid most people probably didn't even notice it. So, I bring up here with a right subject. Really hope it helps: #ocfs2 on freenode, you can login through website[1], or Thunderbird[2].

[Ocfs2-devel] [PATCH] ocfs2: fix a redundant re-initialization

2016-05-22 Thread Eric Ren
Obviously, memset() has zeroed the whole struct locking_max_version. So there's no need to zero its two fields individually. Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/stackglue.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/ocfs2/stackglue.c b/fs/ocfs2/stackglue.c
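
The point of this patch can be shown with a tiny stand-alone example: once memset() has cleared the whole structure, assigning zero to each field again adds nothing. The struct below is a hypothetical stand-in with two one-byte fields, not the definition from fs/ocfs2, and the function name is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a two-field protocol version structure. */
struct proto_version {
	unsigned char pv_major;
	unsigned char pv_minor;
};

/* Reset a version structure to all zeroes. */
void reset_version(struct proto_version *v)
{
	assert(v != NULL);
	memset(v, 0, sizeof(*v));
	/*
	 * Both fields are already zero here, so lines such as
	 *     v->pv_major = 0;
	 *     v->pv_minor = 0;
	 * would be redundant re-initialization.
	 */
}
```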

[Ocfs2-devel] [PATCH] ocfs2: fix improper handling of return errno

2016-05-22 Thread Eric Ren
Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/inode.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c index ba495be..fee5ec6 100644 --- a/fs/ocfs2/inode.c +++ b/fs/ocfs2/inode.c @@ -176,12 +176,7 @@ struct inode *o

Re: [Ocfs2-devel] Dead lock and cluster blocked, any advices will be appreciated.

2016-05-08 Thread Eric Ren
Hello Zhonghua, Thanks for reporting this. On 05/07/2016 07:30 PM, Guozhonghua wrote: > Hi, we had find one dead lock scenario. > > Suddenly, the Node 2 is rebooted(fenced) for IO error accessing storage. So > its slot 2 is remained valid on storage disk. > The node 1 which is in the same

Re: [Ocfs2-devel] Reflink hangs with kernel 4.4

2016-05-09 Thread Eric Ren
Hello: On 05/09/2016 09:20 PM, 서정우 wrote: Hi all. I built up ocfs2 on drbd dual primary. Each node has 12 disks of Raid 10 with mdadm chunk size 4096k. Cluster size of filesystem is 1048576 bytes. Main purpose of use is reflink files on drbd. I reflinked files from 1TB file and exported

[Ocfs2-devel] [PATCH v2] ocfs2: retry on ENOSPC if sufficient space in truncate log

2016-07-06 Thread Eric Ren
code isn't elegant, but looks no better option. v2: 1. Lock allocator inode again if ocfs2_schedule_truncate_log_flush() fails. -- spotted by Joseph Qi <joseph...@huawei.com> Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/alloc.c| 37 ++

[Ocfs2-devel] [PATCH v3] ocfs2: retry on ENOSPC if sufficient space in truncate log

2016-07-08 Thread Eric Ren
e isn't elegant, but looks no better option. v3: 1. Also need to lock allocator inode when "= 0" is returned from ocfs2_schedule_truncate_log_flush(), which means no space really. -- spotted by Joseph Qi v2: 1. Lock allocator inode again if ocfs2_schedule_truncate_log_flush() fails. -- spo

Re: [Ocfs2-devel] [PATCH] ocfs2: retry on ENOSPC if sufficient space in truncate log

2016-07-06 Thread Eric Ren
Hi Joseph, On 07/06/2016 12:21 PM, Joseph Qi wrote: > NAK, if ocfs2_try_to_free_truncate_log fails, it will lead to double > ocfs2_inode_unlock and then BUG. Thanks for pointing out this! Will fix this and resend. Eric > > On 2016/6/22 17:07, Eric Ren wrote: >> The test

Re: [Ocfs2-devel] [PATCH] A bug in the end of DLM recovery

2016-08-07 Thread Eric Ren
Hi, On 08/06/2016 01:58 PM, Gechangwei wrote: > Hi, > > I found an issue in the end of DLM recovery. What's the detailed steps of reproduction? > When DLM recovery comes to the end of recovery procedure, it will remaster > all locks in other nodes. > Right after a request message is sent to a

Re: [Ocfs2-devel] [PATCH] ocfs2: remove obscure BUG_ON in dlmglue

2016-07-03 Thread Eric Ren
Good catch, thanks! Reviewed-by: Eric Ren <z...@suse.com> On 07/01/2016 05:10 PM, Joseph Qi wrote: > These BUG_ON(!inode) are obscure because we have already used inode to > get osb. And actually we can guarantee here inode is valid in the > context. So we can safely remove them.

Re: [Ocfs2-devel] ocfs2: cleanup implemented prototypes

2016-07-03 Thread Eric Ren
Hi Joseph, Please see comments inline;-) On 07/01/2016 05:27 PM, Joseph Qi wrote: > Several prototypes in inode.h are just defined but not actually > implemented and used, so remove them. > > Signed-off-by: Joseph Qi > --- > fs/ocfs2/inode.h | 7 --- >

Re: [Ocfs2-devel] ocfs2: cleanup implemented prototypes

2016-07-03 Thread Eric Ren
> Joseph > > On 2016/7/4 11:36, Eric Ren wrote: >> Hi Joseph, >> >> Please see comments inline;-) >> >> On 07/01/2016 05:27 PM, Joseph Qi wrote: >>> Several prototypes in inode.h are just defined but not actually >>> implemented and used, so r

Re: [Ocfs2-devel] [PATCH v3] ocfs2/dlm: Optimization of code while free dead node locks.

2017-01-18 Thread Eric Ren
<guozhong...@h3c.com> The patch looks good to me, except some formatting issues: 1. The commit message at (1) should be placed at (2); 2. Change log is still missing; I think it's not a big deal, though. The fix is quite simple. Wish your patch has good formatting next time;-) Reviewed-by:

Re: [Ocfs2-devel] [PATCH v2 2/2] ocfs2: fix deadlock issue when taking inode lock at vfs entry points

2017-01-15 Thread Eric Ren
Hi! On 01/16/2017 02:58 PM, Junxiao Bi wrote: > On 01/16/2017 02:42 PM, Eric Ren wrote: >> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") >> results in a deadlock, as the author "Tariq Saeed" realized shortly >> after the patch

Re: [Ocfs2-devel] [PATCH v3 1/2] ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock

2017-01-16 Thread Eric Ren
Hi! On 01/17/2017 03:39 PM, Joseph Qi wrote: > > On 17/1/17 14:30, Eric Ren wrote: >> We are in the situation that we have to avoid recursive cluster locking, >> but there is no way to check if a cluster lock has been taken by a >> process already. >> >> Mo
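
The idea of tracking which process already holds a cluster lock can be sketched in user space as follows. This is only a model under stated assumptions: the struct, the fixed-size holder table, and the function names are hypothetical, and an integer owner id stands in for the pid that the real patch records per lock resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOLDERS 8

/* Hypothetical lock resource that remembers who currently holds it. */
struct tracked_lock {
	int holders[MAX_HOLDERS];	/* owner ids holding the lock */
	size_t nr_holders;
};

bool lock_is_held_by(const struct tracked_lock *lock, int owner)
{
	for (size_t i = 0; i < lock->nr_holders; i++)
		if (lock->holders[i] == owner)
			return true;
	return false;
}

/*
 * Returns 1 if owner already holds the lock (the caller must not take it
 * again), 0 if owner was recorded as a new holder, -1 if the table is full.
 */
int lock_tracker(struct tracked_lock *lock, int owner)
{
	assert(lock != NULL);
	if (lock_is_held_by(lock, owner))
		return 1;
	if (lock->nr_holders == MAX_HOLDERS)
		return -1;
	/* A real implementation would take the cluster lock at this point. */
	lock->holders[lock->nr_holders++] = owner;
	return 0;
}

/*
 * had_lock is the value lock_tracker() returned to this caller. Only the
 * outermost caller (had_lock == 0) removes the holder record.
 */
void unlock_tracker(struct tracked_lock *lock, int owner, int had_lock)
{
	if (had_lock != 0)
		return;
	for (size_t i = 0; i < lock->nr_holders; i++) {
		if (lock->holders[i] == owner) {
			lock->holders[i] = lock->holders[--lock->nr_holders];
			return;
		}
	}
}
```

With this shape, a nested entry point such as an ACL helper called from a permission check would see a return value of 1 and skip both the lock and the unlock, which is how the recursion is avoided.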

Re: [Ocfs2-devel] [PATCH v3 1/2] ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock

2017-01-17 Thread Eric Ren
Hi! On 01/17/2017 04:43 PM, Joseph Qi wrote: > On 17/1/17 15:55, Eric Ren wrote: >> Hi! >> >> On 01/17/2017 03:39 PM, Joseph Qi wrote: >>> >>> On 17/1/17 14:30, Eric Ren wrote: >>>> We are in the situation that we have to avoid recursive

[Ocfs2-devel] [PATCH v3 2/2] ocfs2: fix deadlock issue when taking inode lock at vfs entry points

2017-01-16 Thread Eric Ren
t ocfs2_setattr() and ocfs2_permission() to catch exceptional cases, suggested by: Junxiao Bi. Changes since v2: - Use new wrappers of tracking logic code, suggested by: Junxiao Bi. Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/acl.c | 29 + fs/ocfs2/file.c | 58 +

[Ocfs2-devel] [PATCH v3 0/2] fix deadlock caused by recursive cluster locking

2017-01-16 Thread Eric Ren
ion() to catch exceptional cases, suggested by: Junxiao Bi. - Do not inline functions whose bodies are not in scope, changed by: Stephen Rothwell <s...@canb.auug.org.au>. Changes since v2: - Use new wrappers of tracking logic code, suggested by: Junxiao Bi. Your comments and feedbacks are a

[Ocfs2-devel] [PATCH v3 1/2] ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock

2017-01-16 Thread Eric Ren
ing into functions, ocfs2_inode_lock_tracker() and ocfs2_inode_unlock_tracker(), suggested by: Junxiao Bi. [s...@canb.auug.org.au remove some inlines] Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/dlmglue.c | 105 +++-- fs/ocfs2/dlmglue.h | 18 +

[Ocfs2-devel] [PATCH v4 1/2] ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock

2017-01-17 Thread Eric Ren
ing into functions, ocfs2_inode_lock_tracker() and ocfs2_inode_unlock_tracker(), suggested by: Junxiao Bi. Change since v3: - Fixes redundant space, spotted by: Joseph Qi. [s...@canb.auug.org.au remove some inlines] Signed-off-by: Eric Ren <z...@suse.com> Reviewed-by: Junxiao Bi <junxiao...@oracle.com>

[Ocfs2-devel] [PATCH v4 2/2] ocfs2: fix deadlock issue when taking inode lock at vfs entry points

2017-01-17 Thread Eric Ren
t ocfs2_setattr() and ocfs2_permission() to catch exceptional cases, suggested by: Junxiao Bi. Changes since v2: - Use new wrappers of tracking logic code, suggested by: Junxiao Bi. Signed-off-by: Eric Ren <z...@suse.com> Reviewed-by: Junxiao Bi <junxiao...@oracle.

[Ocfs2-devel] [PATCH v4 0/2] fix deadlock caused by recursive cluster locking

2017-01-17 Thread Eric Ren
anb.auug.org.au>. Changes since v2: - Use new wrappers of tracking logic code, suggested by: Junxiao Bi. Change since v3: - Fixes redundant space, spotted by: Joseph Qi. Your comments and feedbacks are always welcomed. Eric Ren (2): ocfs2/dlmglue: prepare tracking logic to avoid recursive cl

[Ocfs2-devel] OCFS2 test report for linux vanilla kernel V4.7.0

2016-08-21 Thread Eric Ren
Hi, The test report below is against vanilla kernel v4.7.0. Some highlights: 1. As you can see from logs attached, pcmk stack is used with "blocksize=4096, clustersize=32768"; 2. "inline" testcase on multiple nodes failed 3/3 times so far; seems to be a regression issue; 3. Two cases are

Re: [Ocfs2-devel] OCFS2 test report for linux vanilla kernel V4.8.0-rc2

2016-08-22 Thread Eric Ren
Sorry, actually it's already: 4.8.0-rc2-173-g184ca82-1.gacbdb4b-vanilla Eric On 08/22/2016 10:42 AM, Eric Ren wrote: > Hi, > > > The test report below is against vanilla kernel v4.7.0. Some highlights: > > 1. As you can see from logs attached, pcmk stack is used with

Re: [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()

2016-08-30 Thread Eric Ren
cd cd cd cd cd || * 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0933082 s, 11.2 MB/s 0010 > > On 08/30/2016 12:38 AM, Eric Ren wrote: >> Hi, >> >> I'm on 4.8.0-rc3 kernel. Hope someone else can double-confirm this;-) >> >> On 08/30

Re: [Ocfs2-devel] [PATCH] ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

2016-09-11 Thread Eric Ren
> >> Thanks, >> Joseph >> >>> Fix this issue by unlocking the target page after we fail to allocate >>> enough space at the first time. >>> >>> Jan Kara helps me clear out the JBD2 part, and suggest the hint for root >>> cause. >>

Re: [Ocfs2-devel] [PATCH] ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

2016-09-14 Thread Eric Ren
nks, Eric Thanks, Joseph Fix this issue by unlocking the target page after we fail to allocate enough space at the first time. Jan Kara helps me clear out the JBD2 part, and suggest the hint for root cause. Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/aops.c | 7 +++ 1 file ch

Re: [Ocfs2-devel] [PATCH] ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

2016-09-14 Thread Eric Ren
Hi, On 09/12/2016 11:06 AM, Eric Ren wrote: > Hi, >>> IMO, in ocfs2_grab_pages_for_write, mmap_page is mapping to w_pages and >>> w_target_locked is set to true, and then will be unlocked by >>> ocfs2_unlock_pages in ocfs2_free_write_ctxt. >>> So I'm

Re: [Ocfs2-devel] [PATCH] ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

2016-09-14 Thread Eric Ren
, the mmapped page should be unlocked as long as we cannot return VM_FAULT_LOCKED to do_page_mkpage(). Otherwise, the deadlock will happen in do_page_mkpage(). Please see the recent 2 mails;-) Eric > > Thanks, > Joseph > > On 2016/9/14 16:04, Eric Ren wrote: >> Hi Joseph, >>>>

Re: [Ocfs2-devel] [PATCH] ocfs2: free the mle while the res had one, to avoid mle memory leak.

2016-09-13 Thread Eric Ren
Hi, On 09/13/2016 03:52 PM, Guozhonghua wrote: > In the function dlm_migrate_request_handler, while the ret is --EEXIST, the > mle should be freed, otherwise the memory will be leaked. Keep your commit comments within 75 or 78 (I don't remember clearly but git will warn if you don't keep its

Re: [Ocfs2-devel] [PATCH] ocfs2: fix double unlock in case retry after free truncate log

2016-09-17 Thread Eric Ren
Hello Joseph, On 09/14/2016 04:13 PM, Joseph Qi wrote: > Hi Eric, > > On 2016/9/14 15:57, Eric Ren wrote: >> Hello Joseph, >> >> Thanks for fixing up this. >> >> On 09/14/2016 12:15 PM, Joseph Qi wrote: >>> If ocfs2_reserve_cluster_bitmap_bits fai

Re: [Ocfs2-devel] [PATCH v2 RESEND] ocfs2: fix double unlock in case retry after free truncate log

2016-09-17 Thread Eric Ren
id return value overwritten issue. > > Fixes: 2070ad1aebff ("ocfs2: retry on ENOSPC if sufficient space in > truncate log" > Signed-off-by: Joseph Qi <joseph...@huawei.com> > Signed-off-by: Jiufei Xue <xuejiu...@huawei.com> LGTM Reviewed-by: Eric Ren <z.

[Ocfs2-devel] [PATCH v2] ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

2016-09-17 Thread Eric Ren
ked target page. These two errors fail on the same path, so fix them by unlocking the target page manually before ocfs2_free_write_ctxt(). Jan Kara helps me clear out the JBD2 part, and suggest the hint for root cause. Changes since v1: 1. Also put ENOMEM error case into consideration. Signed-off-by

Re: [Ocfs2-devel] [PATCH] ocfs2: Fix double put of recount tree in ocfs2_lock_refcount_tree()

2016-09-17 Thread Eric Ren
0x70 > [] ? aio_kernel_free+0xe/0x10 > [] aio_write_iter+0x2e/0x30 > > Fix this by avoiding the second call to ocfs2_refcount_tree_put() > > Signed-off-by: Ashish Samant <ashish.sam...@oracle.com> LGTM Reviewed-by: Eric Ren <z...@suse.com> > --- > fs/ocfs2/r

Re: [Ocfs2-devel] [PATCH] ocfs2: fix undefined struct variable in inode.h

2016-09-21 Thread Eric Ren
<joseph...@huawei.com> LGTM Reviewed-by: Eric Ren<z...@suse.com> --- fs/ocfs2/inode.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h index 50cc550..5af68fc 100644 --- a/fs/ocfs2/inode.h +++ b/fs/ocfs2/inode.h @@ -123,8 +123,6 @@ static i

[Ocfs2-devel] [Question] deadlock on chmod when running discontigous block group multiple node testing

2016-10-10 Thread Eric Ren
Hi Junxiao, As the subject, the testing hung there on a kernel without your patches: "ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang" and "ocfs2: fix posix_acl_create deadlock" The stack trace is: ``` ocfs2cts1:~ # pstree -pl 24133

Re: [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking

2016-11-08 Thread Eric Ren
Hi all, On 10/19/2016 01:19 PM, Eric Ren wrote: diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c index bed1fcb..7e3544e 100644 --- a/fs/ocfs2/acl.c +++ b/fs/ocfs2/acl.c @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle, int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int

Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-11-08 Thread Eric Ren
Hi all, On 10/19/2016 01:19 PM, Eric Ren wrote: ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock(). The problem is that the call chain of ocfs2_permission() includes *_acl(). Possibly, there are three solutions I can think of. The first one is to implement the inode

Re: [Ocfs2-devel] [PATCH 6/6] ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features

2016-11-10 Thread Eric Ren
On 11/11/2016 02:20 PM, Darrick J. Wong wrote: > On Fri, Nov 11, 2016 at 01:49:48PM +0800, Eric Ren wrote: >> Hi, >> >> A few issues obvious to me: >> >> On 11/10/2016 06:51 AM, Darrick J. Wong wrote: >>> Connect the new VFS clone_range, copy_range, and

Re: [Ocfs2-devel] [PATCH 6/6] ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features

2016-11-10 Thread Eric Ren
Hi, A few issues obvious to me: On 11/10/2016 06:51 AM, Darrick J. Wong wrote: > Connect the new VFS clone_range, copy_range, and dedupe_range features > to the existing reflink capability of ocfs2. Compared to the existing > ocfs2 reflink ioctl We have to do things a little differently to

Re: [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking

2016-11-10 Thread Eric Ren
Hi, On 11/10/2016 06:49 PM, piaojun wrote: > Hi Eric, > > On 2016-11-1 9:45, Eric Ren wrote: >> Hi, >> >> On 10/31/2016 06:55 PM, piaojun wrote: >>> Hi Eric, >>> >>> On 2016-10-19 13:19, Eric Ren wrote: >>>> The deadlock

[Ocfs2-devel] what is g_f_a_w_n() short for? thanks

2016-11-07 Thread Eric Ren
Hello Mark, There is a piece of comment that confused me, please correct me: https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L2274 ``` ocfs2_file_write_iter() { ... /* * deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io * function pointer which

[Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-18 Thread Eric Ren
Hi all! Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") results in another deadlock as we have discussed in the recent thread: https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html Before this one, a similar deadlock has been fixed by Junxiao:

[Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock

2016-10-18 Thread Eric Ren
help debug cluster locking issue. Unfortunately, this may incur some performance loss. Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/dlmglue.c | 60 ++ fs/ocfs2/dlmglue.h | 13 fs/ocfs2/ocfs2.h | 1 + 3 files chang

[Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking

2016-10-18 Thread Eric Ren
EX request comes between two ocfs2_inode_lock() Fix by checking if the cluster lock has been acquired already in the call-chain path. Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <z...@suse.com> --- fs/ocfs2/acl.c | 39 ++
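The idea named in this snippet, checking whether the current call chain already holds the cluster lock before taking it again, can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the patch: `demo_lockres`, `demo_lock()`, `demo_unlock()` and the integer task ids are all hypothetical names, and the "real" cluster lock is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HOLDERS 8

/* Hypothetical stand-in for a cluster lock resource (not an ocfs2 type). */
struct demo_lockres {
    int holders[DEMO_MAX_HOLDERS]; /* ids of tasks currently holding the lock */
    size_t nr_holders;
    int cluster_lock_calls;        /* times the "real" lock was actually taken */
};

/* Return 1 if the given task is already recorded as a holder. */
static int demo_is_holder(const struct demo_lockres *res, int task)
{
    for (size_t i = 0; i < res->nr_holders; i++)
        if (res->holders[i] == task)
            return 1;
    return 0;
}

/*
 * Take the lock unless this task already holds it.
 * Returns 1 if the task already held the lock (nothing was taken),
 * 0 if the lock was taken now, -1 if the holder table is full.
 */
static int demo_lock(struct demo_lockres *res, int task)
{
    if (demo_is_holder(res, task))
        return 1;                  /* recursive request: skip the real lock */
    if (res->nr_holders == DEMO_MAX_HOLDERS)
        return -1;
    res->cluster_lock_calls++;     /* placeholder for the real cluster lock */
    res->holders[res->nr_holders++] = task;
    return 0;
}

/* Undo demo_lock(); had_lock is the value demo_lock() returned. */
static void demo_unlock(struct demo_lockres *res, int task, int had_lock)
{
    if (had_lock)
        return;                    /* the outer caller owns the unlock */
    for (size_t i = 0; i < res->nr_holders; i++) {
        if (res->holders[i] == task) {
            /* Remove by moving the last entry into this slot. */
            res->holders[i] = res->holders[--res->nr_holders];
            return;
        }
    }
    assert(!"unlock by a task that is not a holder");
}
```

In this sketch an outer caller would get 0 from `demo_lock()` and an inner caller in the same chain would get 1, so only the outer `demo_unlock()` releases the lock and the lock is never requested twice by the same task.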

Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-19 Thread Eric Ren
Hi Junxiao, On 10/19/2016 02:57 PM, Junxiao Bi wrote: > I once implemented generic recursive locking support, please check the > patch at > https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html > , >

Re: [Ocfs2-devel] [Question] deadlock on chmod when running discontigous block group multiple node testing

2016-10-14 Thread Eric Ren
nks, Eric On 10/12/2016 09:23 AM, Eric Ren wrote: > Hi Junxiao, > >> Hi Eric, >> >> On 10/11/2016 10:42 AM, Eric Ren wrote: >>> Hi Junxiao, >>> >>> As the subject, the testing hung there on a kernel without your patches: >>> >>>

Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-24 Thread Eric Ren
Hi all, On 10/19/2016 01:19 PM, Eric Ren wrote: > The third one is to revert that problematic commit! It looks like > get/set_acl() > are always called by other VFS callbacks like ocfs2_permission(). I think > we can do this if it's true, right? Anyway, I'll try to work out if

Re: [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking

2016-11-14 Thread Eric Ren
Hi, On 11/14/2016 01:42 PM, piaojun wrote: > Hi Eric, > > > OCFS2_LOCK_BLOCKED flag of this lockres is set in BAST > (ocfs2_generic_handle_bast) when downconvert is needed > on behalf of remote lock request. > > The recursive cluster lock (the second one) will be blocked in >
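The blocking described in this snippet can be modelled with a few predicates. The following is a deliberately simplified, hypothetical model rather than ocfs2 code: `demo_res` and the `demo_*` functions are invented for illustration, lock levels are ignored, and the assumption that a downconvert must wait for all local holders to drop is stated here as part of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of one node's lock resource. */
struct demo_res {
    int holders;  /* local holders that have not unlocked yet */
    bool blocked; /* set when a remote request needs a downconvert */
};

/* A remote request arrives: mark the resource blocked (BAST-like). */
static void demo_bast(struct demo_res *r)
{
    r->blocked = true;
}

/* Model assumption: downconvert proceeds only with no local holders. */
static bool demo_can_downconvert(const struct demo_res *r)
{
    return r->blocked && r->holders == 0;
}

/* Model assumption: new local requests wait while the resource is blocked. */
static bool demo_new_request_must_wait(const struct demo_res *r)
{
    return r->blocked;
}

/*
 * A task that already holds the lock asks for it again. It is stuck if
 * its new request must wait while the downconvert it waits for cannot
 * run because of the task's own outstanding hold.
 */
static bool demo_recursive_request_deadlocks(const struct demo_res *r)
{
    return r->holders > 0 &&
           demo_new_request_must_wait(r) &&
           !demo_can_downconvert(r);
}
```

With one local holder and no remote request, a second request from the same task goes through in this model; once `demo_bast()` has run between the two requests, the predicate reports the circular wait.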

Re: [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking

2016-11-14 Thread Eric Ren
Hi, > Thanks for your attention. Actually, I tried different versions of draft > patch locally. > Either of them can satisfy myself so far. Sorry, I meant "neither of them". Eric > Some rules I'd like to follow: > 1) check and avoid recursive cluster locking, rather than allow it which >

[Ocfs2-devel] [Bug Report] multiple node reflink: kernel BUG at ../fs/ocfs2/suballoc.c:1989!

2016-11-23 Thread Eric Ren
Hi all, FYI, Reflink testcase in multiple nodes mode failed with the backtrace below: --- 2016-11-02T16:43:41.862247+08:00 ocfs2cts2 kernel: [25429.622914] [ cut here ] 2016-11-02T16:43:41.862273+08:00 ocfs2cts2 kernel: [25429.622979] kernel BUG at

Re: [Ocfs2-devel] [PATCH] ocfs2: Optimization of code while free dead locks, changed for reviews.

2016-11-28 Thread Eric Ren
Hi, I am tired of telling you about patch format... I won't respond until you really model your patch after a correct one. Eric On 11/28/2016 05:05 PM, Guozhonghua wrote: > Changed the free order and code styles with reviews. Based on Linux-4.9-rc6. > Thanks. > > Signed-off-by: guozhonghua

Re: [Ocfs2-devel] ocfs2: fix sparse file & data ordering issue in direct io

2016-11-16 Thread Eric Ren
Hi, On 11/16/2016 06:45 PM, Dan Carpenter wrote: > On Wed, Nov 16, 2016 at 10:33:49AM +0800, Eric Ren wrote: > That silences the warning, of course, but I feel like the code is buggy. > How do we know that we don't hit that exit path? Sorry, I missed your point. Do you mean the below?
