Re: [Ocfs2-devel] [RFC] Doubt about dlm_worker

2015-09-10 Thread Sunil Mushran
Sure. It will need to be tested appropriately. On Thu, Sep 10, 2015 at 4:49 AM, Joseph Qi wrote: > Hi Junxiao & Sunil, > Your comments would be appreciated. > > Thanks, > Joseph > > On 2015/9/6 21:11, Joseph Qi wrote: > > Comments for dlm_dispatch_work is described below:

Re: [Ocfs2-devel] [Ocfs2-users] size increase

2015-03-17 Thread Sunil Mushran
This is because you are specifying a 128k cluster size. Refer to man mkfs.ocfs2 for more. On Mar 17, 2015 8:04 PM, Umarzuki Mochlis umarz...@gmail.com wrote: Hi, What I meant by total size is output of 'du -hs' I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical volume of

Re: [Ocfs2-devel] [Ocfs2-users] How to unlock a bloked resource? Thanks

2014-09-10 Thread Sunil Mushran
What is the output of the commands? The protocol is supposed to do the unlocking on its own. See what it is blocked on. It could be that the node that has the lock cannot unlock it because it cannot flush the journal to disk. On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua guozhong...@h3c.com wrote:

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-27 Thread Sunil Mushran
or not. Sunil On Tue, Aug 26, 2014 at 6:57 PM, Xue jiufei xuejiu...@huawei.com wrote: Hi, Sunil On 2014/8/26 1:13, Sunil Mushran wrote: On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi joseph...@huawei.com mailto:joseph...@huawei.com wrote: On 2014/8/25 13:45, Sunil Mushran wrote

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-25 Thread Sunil Mushran
On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi joseph...@huawei.com wrote: On 2014/8/25 13:45, Sunil Mushran wrote: Please could you expand on that. In our scenario, one node can mount multiple volumes across the cluster. For instance, N1 has mounted ocfs2 volumes say volume1, volume2

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-24 Thread Sunil Mushran
Functions in dlmdomain.c are only triggered during mount. So they cannot trigger the deadlock as described above in this thread. I would leave them as is. On Aug 24, 2014 7:06 PM, Xue jiufei xuejiu...@huawei.com wrote: Hi Sunil, On 2014/8/23 1:08, Sunil Mushran wrote: Allocs made via GFP_NOFS

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-24 Thread Sunil Mushran
Please could you expand on that. On Aug 24, 2014 10:42 PM, Joseph Qi joseph...@huawei.com wrote: On 2014/8/25 13:00, Sunil Mushran wrote: Functions in dlmdomain.c are only triggered during mount. So they cannot trigger the deadlock as described above in this thread. I would leave them

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-22 Thread Sunil Mushran
Allocs made via GFP_NOFS, by definition, should not trigger any reclaim from the fs. So this situation should never arise. That's why all allocs in the dlm have NOFS. ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com

Re: [Ocfs2-devel] [PATCH] Remove versioning information

2013-11-26 Thread Sunil Mushran
You may want to do the same for the version file in dlm, dlmfs, etc. On Tue, Nov 26, 2013 at 11:28 AM, Goldwyn Rodrigues rgold...@suse.de wrote: The versioning information is confusing for end-users. The numbers are stuck at 1.5.0 when the tools have moved to 1.8.3. I suggest removing the

Re: [Ocfs2-devel] [PATCH] Remove versioning information

2013-11-26 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com You may want to re-add removed MODULE_DESCRIPTION with a short blurb in some existing files. On Tue, Nov 26, 2013 at 3:37 PM, Goldwyn Rodrigues rgold...@suse.de wrote: The versioning information is confusing for end-users. The numbers are stuck

Re: [Ocfs2-devel] [PATCH] ocfs2: force clean refmap when doing local recovery cleanup

2013-08-01 Thread Sunil Mushran
I see no need for a separate function. Just do } else if (res->owner == DLM_LOCK_RES_OWNER_UNKNOWN) { if (test_bit(node, res->refmap)) dlm_lockres_clear_refmap_bit(dlm, res, node); } On Thu, Aug 1, 2013 at 5:05 AM, Xue jiufei xuejiu...@huawei.com wrote: Function

Re: [Ocfs2-devel] Heart beat source code review and test, founding it may be not correct. Is the changes OK, requesting reviews and advices.

2013-07-31 Thread Sunil Mushran
What's the reasoning behind this patch? On Jul 31, 2013, at 3:51 AM, Guozhonghua guozhong...@h3c.com wrote: Hi, I find some code may be not correct as reviewing the heart beat code and test that. The heart beat writing onto disk. I have another question that why not encapsulate the

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmlock_master should return DLM_NORMAL after adding lock to blocked list

2013-06-28 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Fri, Jun 28, 2013 at 1:47 PM, Andrew Morton a...@linux-foundation.org wrote: On Sun, 23 Jun 2013 18:39:16 +0800 Jeff Liu jeff@oracle.com wrote: Hi Jiufei, On 06/20/2013 07:13 PM, Xue jiufei wrote: Function dlmlock_master

Re: [Ocfs2-devel] [PATCH] ocfs2: should call ocfs2_journal_access_di() before ocfs2_delete_entry() in ocfs2_orphan_del()

2013-06-28 Thread Sunil Mushran
NAK. Current code looks ok. On Fri, Jun 28, 2013 at 1:49 PM, Andrew Morton a...@linux-foundation.org wrote: Folks, 3.10 is nigh. Could we please have some review and test of this patch? From: Younger Liu younger@huawei.com Subject: ocfs2: should call ocfs2_journal_access_di() before

Re: [Ocfs2-devel] [PATCH] ocfs2: llseek requires to ocfs2 inode lock for the file in SEEK_END

2013-06-26 Thread Sunil Mushran
AFAIR, this behavior has been there since day 1 and changing it will impact performance negatively. I would recommend against making this change for one app. On Wed, Jun 26, 2013 at 6:50 PM, shencanquan shencanq...@huawei.com wrote: On 2013/6/27 9:25, Andrew Morton wrote: On Thu, 27 Jun

Re: [Ocfs2-devel] [PATCH] ret should be int instead of enum in dlm_request_all_locks

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Wed, May 22, 2013 at 8:50 AM, Joseph Qi joseph...@huawei.com wrote: In dlm_request_all_locks, ret is type enum. But o2net_send_message returns a type int value. Then it will never run into the following error branch. So we should change
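The type problem described in the quoted patch can be reproduced outside the kernel. The following is a minimal sketch, not the o2dlm code: `fake_status` and `fake_send_message()` are hypothetical stand-ins for the enum type of ret and for o2net_send_message(). The underlying type of an enum is implementation-defined; with compilers that pick an unsigned type for an enum without negative enumerators, a negative errno stored in it no longer compares below zero, so the error branch is skipped. The helper `enum_is_signed()` reports which case applies on the compiler at hand.

```c
#include <errno.h>

/* Hypothetical status enum with no negative enumerators. */
enum fake_status { FAKE_NORMAL = 0, FAKE_REJECTED = 1 };

/* Hypothetical stand-in for a send routine that returns a negative errno. */
static int fake_send_message(void)
{
    return -ENOMEM;
}

/* Returns 1 if this compiler gives the enum a signed underlying type. */
static int enum_is_signed(void)
{
    return (enum fake_status)-1 < 0;
}

/* Returns 1 if the error branch is taken when ret is declared as the enum. */
static int error_seen_with_enum(void)
{
    enum fake_status ret = (enum fake_status)fake_send_message();

    if (ret < 0)
        return 1;   /* unreachable when the enum is unsigned */
    return 0;
}

/* Returns 1 if the error branch is taken when ret is declared as int. */
static int error_seen_with_int(void)
{
    int ret = fake_send_message();

    if (ret < 0)
        return 1;
    return 0;
}
```

Declaring ret as int, as the patch proposes, makes the check behave the same way regardless of how the compiler represents the enum.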

Re: [Ocfs2-devel] [PATCH] clean up duplicate declaration in dlmrecovery.c

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Mon, May 20, 2013 at 2:36 AM, Joseph Qi joseph...@huawei.com wrote: Below 3 functions have already been declared in dlmcommon.h, so we have no need to declare them again in dlmrecovery.c. dlm_complete_recovery_thread

Re: [Ocfs2-devel] [PATCH] ocfs2_prep_new_orphaned_file should return ret

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Tue, May 21, 2013 at 7:44 PM, shencanquan shencanq...@huawei.com wrote: On 2013/5/22 10:38, xiaowei.hu wrote: if there is error happen in , for example EIO in __ocfs2_prepare_orphan_dir, ocfs2_prep_new_orphaned_file will release

Re: [Ocfs2-devel] [PATCH] Remove unecessary ERROR when removing non-empty directory

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Mon, May 20, 2013 at 8:06 AM, Goldwyn Rodrigues rgold...@gmail.com wrote: While removing a non-empty directory, the kernel dumps a message: (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39 Suppress the error message from being printed

Re: [Ocfs2-devel] [PATCH v2] ocfs2: goto out_unlock if ocfs2_get_clusters_nocache failed in ocfs2_fiemap

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Tue, May 14, 2013 at 12:08 AM, Joseph Qi joseph...@huawei.com wrote: Last time we found there is a lock/unlock bug in ocfs2_file_aio_write, and then we did a thoroughly search for all lock resources in ocfs2_inode_info, including rw, inode

Re: [Ocfs2-devel] ocfs2: Question for ocfs2_recovery_thread

2013-05-18 Thread Sunil Mushran
The first node that gets the lock will do the actual recovery. The others will get the lock and see a clean journal and skip the recovery. A thread should never error out if it fails to get the lock. It should try and try again. On May 17, 2013, at 11:27 PM, Joseph Qi joseph...@huawei.com
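The behaviour described in this reply can be sketched in plain C. This is only an illustration under stated assumptions: `fake_lock`, `fake_journal`, `fake_trylock()` and `recover_slot()` are hypothetical names, the lock is a single-process stand-in for the cluster lock, and a real recovery thread would wait between attempts rather than spin.

```c
#include <stdbool.h>

/* Hypothetical journal state: dirty until some node replays it. */
struct fake_journal {
    bool dirty;
    int replay_count;
};

/* Hypothetical lock that fails its first fail_budget attempts. */
struct fake_lock {
    bool held;
    int fail_budget;
};

static bool fake_trylock(struct fake_lock *l)
{
    if (l->fail_budget > 0) {
        l->fail_budget--;
        return false;
    }
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void fake_unlock(struct fake_lock *l)
{
    l->held = false;
}

/*
 * Keep trying until the lock is obtained, then replay the journal only
 * if it is still dirty. Returns the number of lock attempts made.
 */
static int recover_slot(struct fake_lock *l, struct fake_journal *j)
{
    int attempts = 1;

    while (!fake_trylock(l))
        attempts++;         /* never give up on a failed attempt */

    if (j->dirty) {         /* first holder does the actual recovery */
        j->replay_count++;
        j->dirty = false;
    }                       /* later holders see a clean journal and skip */

    fake_unlock(l);
    return attempts;
}
```

Calling `recover_slot()` twice on the same journal replays it once: the second caller takes the lock, finds the journal clean, and returns without doing any work.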

Re: [Ocfs2-devel] Patch request reviews, for node reconnecting with other nodes whose node number is little than local, thanks a lot.

2013-05-09 Thread Sunil Mushran
Resending as my reply bounced. On Thu, May 9, 2013 at 10:01 AM, Sunil Mushran sunil.mush...@gmail.com wrote: A better fix is to _not_ disconnect on o2net timeout once a connection has been cleanly established. Only disconnect on o2hb timeout. The reconnects are a problem as we could lose

Re: [Ocfs2-devel] [PATCH] ocfs2: unlock rw lock if inode lock failed

2013-05-06 Thread Sunil Mushran
Looks good to me. Acked-by: Sunil Mushran sunil.mush...@gmail.com On Mon, May 6, 2013 at 7:43 AM, Joseph Qi joseph...@huawei.com wrote: In ocfs2_file_aio_write, it does ocfs2_rw_lock first and then ocfs2_inode_lock. But if ocfs2_inode_lock failed, it goes to out_sems without unlocking rw
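The lock ordering issue in the quoted description follows a common cleanup pattern, sketched here in user-space C. All names (`fake_inode`, `fake_rw_lock()`, `fake_inode_lock()`, `fake_write()`) are hypothetical and only mirror the order of operations the message describes; this is not the ocfs2_file_aio_write code.

```c
#include <errno.h>

/* Hypothetical inode with two lock flags and a switch to force a failure. */
struct fake_inode {
    int rw_locked;
    int inode_locked;
    int fail_inode_lock;
};

static int fake_rw_lock(struct fake_inode *i)
{
    i->rw_locked = 1;
    return 0;
}

static void fake_rw_unlock(struct fake_inode *i)
{
    i->rw_locked = 0;
}

static int fake_inode_lock(struct fake_inode *i)
{
    if (i->fail_inode_lock)
        return -EIO;
    i->inode_locked = 1;
    return 0;
}

static void fake_inode_unlock(struct fake_inode *i)
{
    i->inode_locked = 0;
}

/* Takes the rw lock first, then the inode lock, and unwinds in reverse. */
static int fake_write(struct fake_inode *i)
{
    int ret;

    ret = fake_rw_lock(i);
    if (ret < 0)
        goto out;

    ret = fake_inode_lock(i);
    if (ret < 0)
        goto out_rw_unlock;     /* must not jump past the rw unlock */

    /* ... the write itself would happen here ... */

    fake_inode_unlock(i);
out_rw_unlock:
    fake_rw_unlock(i);
out:
    return ret;
}
```

Jumping straight to `out` when the second lock fails would leave `rw_locked` set, which is the kind of leak the patch addresses.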

Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix possible memory leak in dlm_process_recovery_data

2013-05-02 Thread Sunil Mushran
Do you know under what conditions does it create a new lock when it should not? This code should only trigger if the lockres is/was mastered on another node. Meaning this node will not know about the newlock. Meaning that code should never trigger. 1949 if (lock->ml.cookie

Re: [Ocfs2-devel] [PATCH] mkfs.ocfs2 null pointer dereference. -- resend

2012-12-04 Thread Sunil Mushran
NAK. hb_task is a local variable that is not even accessed after kthread_stop(). The oops is in kthread_stop(). Points to a problem with get/put in task_struct. Not an ocfs2 issue. On Mon, Dec 3, 2012 at 7:18 PM, xiaowei...@oracle.com wrote: From: Xiaowei.Hu xiaowei...@oracle.com Pid:

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-23 Thread Sunil Mushran
On Wed, Aug 22, 2012 at 9:01 PM, Jie Liu jeff@oracle.com wrote: BTW, Sunil mentioned there already has an IO priority patch set but not yet merged. However, I only searched an old posts back to 2006 at: http://www.digipedia.pl/usenet/thread/11947/7120/ Am I missing something? No, I

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-23 Thread Sunil Mushran
On Wed, Aug 22, 2012 at 8:44 PM, Tao Ma t...@tao.ma wrote: I guess the final solution will be WRITE_FUA, and I see btrfs uses it to write out the superblock. It will be handled differently by the underlying block layer so that it will not be in the elevator queue. It should work but I am not

Re: [Ocfs2-devel] [PATCH 2/4] ocfs2: s/o2hb_hearbeat_xxx/o2hb_heartbeat_xxx/g at heartbeat.c

2012-08-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Wed, Aug 22, 2012 at 2:38 AM, Jeff Liu jeff@oracle.com wrote: Not sure if this patch does make sense or not, but it could make the signature of those routines in a consistent manner with others for heartbeating. CC: Sunil Mushran

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-22 Thread Sunil Mushran
Yes. WRITE_SYNC should be good. Not FUA. Also, you may want to look into using io priorities. The code is all there. Just needs activation. On Wed, Aug 22, 2012 at 10:13 AM, srinivas eeda srinivas.e...@oracle.com wrote: On 8/22/2012 7:17 AM, Jie Liu wrote: Hi All, These days, I am

Re: [Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list

2012-08-15 Thread Sunil Mushran
On Tue, Aug 14, 2012 at 11:28 PM, Xue jiufei xuejiu...@huawei.com wrote: Sorry, I haven't described it clearly. We trigger the BUG() in dlmrecovery.c:1923. Lockres had copied lvb from previous valid locks and then meet with another lock with the EX level. 1907

Re: [Ocfs2-devel] ocfs2/cluster: Clean up messages in o2net

2012-08-15 Thread Sunil Mushran
30, 2011 at 02:14:04PM -0700, Sunil Mushran wrote: Thanks. I'll fix the two. On 08/25/2011 06:01 PM, Dan Carpenter wrote: Hello Sunil Mushran, 1dfecf810e0e: ocfs2/cluster: Clean up messages in o2net Leads to the following Smatch complaint: fs/ocfs2/cluster/tcp.c +1704

Re: [Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list

2012-08-14 Thread Sunil Mushran
On Mon, Aug 13, 2012 at 7:03 PM, Xue jiufei xuejiu...@huawei.com wrote: A parallel umount on 4 nodes triggered a bug in dlm_process_recovery_data(). Here’s the situation: Receiving MIG_LOCKRES message, A node processes the locks in migratable lockres. It copies lvb from migratable lockres

Re: [Ocfs2-devel] [PATCH] ocfs2: delay migration when the lockres is in migration state

2012-08-14 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@gmail.com On Mon, Aug 13, 2012 at 7:06 PM, Xue jiufei xuejiu...@huawei.com wrote: We trigger a bug in __dlm_lockres_reserve_ast() when we parallel umount 4 nodes. The situation is as follows: 1) Node A migrate all lockres it owned(eg. lockres

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-08-03 Thread Sunil Mushran
Thanks for your help. On Fri, Aug 3, 2012 at 12:22 AM, Vincent ETIENNE v...@vetienne.net wrote: Le 02/08/2012 23:08, Sunil Mushran a écrit : On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE v...@vetienne.net wrote: Hi based on current git ( commit

[Ocfs2-devel] [PATCH] ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path

2012-08-03 Thread sunil . mushran
From: Sunil Mushran smush...@yahoo.com Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init. Reported-and-Tested-by: Vincent Etienne vetie...@aprogsys.com Signed-off-by: Sunil Mushran sunil.mush...@gmail.com --- fs/ocfs2/symlink.c |2 +- 1 files changed, 1 insertions(+), 1

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-08-02 Thread Sunil Mushran
On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE v...@vetienne.net wrote: Hi based on current git ( commit 1a9b4993b70fb1884716902774dc9025b457760d ) and reverting commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 Author: Al Viro

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-07-30 Thread Sunil Mushran
The fallocate() oops is probably the same that is fixed by this patch. https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=commit;h=a2118b301104a24381b414bc93371d666fe8d43a Is in the list of patches that are ready to be pushed.

Re: [Ocfs2-devel] [patch] ocfs2/dlm: use GFP_ATOMIC inside a spin_lock

2012-07-30 Thread Sunil Mushran
On Fri, Jul 27, 2012 at 1:32 PM, Mark Fasheh mfas...@suse.de wrote: On Thu, Jul 26, 2012 at 04:05:05PM +0300, Dan Carpenter wrote: My static checker complains that this is called with a spin_lock held in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason we have not

Re: [Ocfs2-devel] [PATCH] ocfs2: break useless while loop

2012-07-19 Thread Sunil Mushran
On Wed, Jul 11, 2012 at 1:51 AM, Joel Becker jl...@evilplan.org wrote: On Wed, Jul 11, 2012 at 02:49:56PM +0800, Junxiao Bi wrote: Signed-off-by: Junxiao Bi junxiao...@oracle.com --- fs/ocfs2/dlm/dlmmaster.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git

Re: [Ocfs2-devel] [GIT PULL] ocfs2 fixes for 3.5-rc5

2012-07-17 Thread Sunil Mushran
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=mw-3.4-mar15 I had prepared some patches sometime ago that could be pushed to mainline. Though some patches may need to be removed as they look to be in this list. On Fri, Jul 6, 2012 at 12:44 AM, Joel Becker jl...@evilplan.org

Re: [Ocfs2-devel] [PATCH] ocfs2: fix dlm lock migration crash

2012-07-17 Thread Sunil Mushran
On Tue, Jul 17, 2012 at 12:10 AM, Junxiao Bi junxiao...@oracle.com wrote: In the target node of the dlm lock migration, the logic to find the local dlm lock is wrong, it shouldn't change the loop variable lock in the list_for_each_entry loop. This will cause a NULL-pointer accessing crash.

Re: [Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery

2012-05-30 Thread Sunil Mushran
On Tue, May 29, 2012 at 5:41 PM, Xiaowei xiaowei...@oracle.com wrote: On 05/30/2012 06:09 AM, Sunil Mushran wrote: I would suggest exploring adding this in dlm hb down event. Checking live map all over the place is hacky. We do it more than we should right now. Let's not add to the mess

Re: [Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery

2012-05-29 Thread Sunil Mushran
On Thu, May 24, 2012 at 10:53 PM, xiaowei...@oracle.com wrote: diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 01ebfd0..62659e8 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dlmrecovery.c @@ -555,6 +555,7 @@ static int dlm_remaster_locks(struct

[Ocfs2-devel] [PATCH 9/9] ocfs2: Fix tiny race in unaligned aio+dio

2012-03-01 Thread Sunil Mushran
with the serialization accounting for writes. This patch separates the handler functions to avoid this issue. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/aops.c | 44 1 files changed, 32 insertions(+), 12 deletions(-) diff --git a/fs/ocfs2

[Ocfs2-devel] [PATCH 5/9] ocfs2/dlm: Fix list traversal in dlm_process_recovery_data()

2012-03-01 Thread Sunil Mushran
averting the case in which lock is set to NULL. Reported-by: Julia Lawall ju...@diku.dk Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/dlm/dlmrecovery.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm

[Ocfs2-devel] [PATCH 3/9] ocfs2: Silence message in ocfs2_global_read_info()

2012-03-01 Thread Sunil Mushran
This patch silences this message. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/quota_global.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index b24eab3..2a3f12c 100644 --- a/fs/ocfs2

[Ocfs2-devel] [PATCH 7/9] ocfs2: Fix oops in fallocate()

2012-03-01 Thread Sunil Mushran
fallocate() was oopsing on ocfs2 because we were passing in a NULL file pointer. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/file.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 061591a..8f30e74 100644

[Ocfs2-devel] [PATCH 8/9] ocfs2: Replace nlink_t with unsigned int

2012-03-01 Thread Sunil Mushran
nlink_t was replaced as per the suggestion in the following link. https://lkml.org/lkml/2012/2/2/577 Reported-by: Al Viro v...@zeniv.linux.org.uk Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/namei.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git

[Ocfs2-devel] [PATCH 4/9] ocfs2/dlm: Use dlm->track_lock when adding resource to the tracking list

2012-03-01 Thread Sunil Mushran
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock to protect operations on dlm->tracking_list. But it was still using the older lock (dlm->spin_lock) to add new resources to the list. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/dlm/dlmmaster.c |4

[Ocfs2-devel] [PATCH 2/9] ocfs2: Add missing copyright in few files

2012-03-01 Thread Sunil Mushran
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/mmap.h | 18 ++ fs/ocfs2/ocfs2_trace.h | 19 +++ fs/ocfs2/quota.h| 16 ++-- fs/ocfs2/quota_global.c | 20 ++-- fs/ocfs2/quota_local.c | 19

[Ocfs2-devel] [PATCH 6/9] ocfs2: Tighten free bit calculation in the global bitmap

2012-03-01 Thread Sunil Mushran
exceeds the total bit count. In each instance the bitmap is correct. Only the free bit count is incorrect. This patch checks the current bit value and increments the free bit count only if the bit was previously set. It also prints information to allow us to debug further. Signed-off-by: Sunil Mushran
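The check described in this patch summary, incrementing the free count only when the bit was actually set, can be sketched as follows. This is a minimal user-space illustration under assumptions: `fake_bitmap` and `fake_release_bit()` are hypothetical names, and the real patch also prints debugging information, which is omitted here.

```c
#include <limits.h>

#define FAKE_BITMAP_BYTES 8

/* Hypothetical bitmap with a separately tracked free bit count. */
struct fake_bitmap {
    unsigned char bits[FAKE_BITMAP_BYTES];
    unsigned int total_bits;
    unsigned int free_bits;
};

static int fake_test_bit(const struct fake_bitmap *bm, unsigned int n)
{
    return (bm->bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1u;
}

/*
 * Clears bit n and credits the free count only if the bit was set.
 * Returns 1 if the bit was released, 0 if it was already clear,
 * and -1 if n is out of range.
 */
static int fake_release_bit(struct fake_bitmap *bm, unsigned int n)
{
    if (n >= bm->total_bits)
        return -1;

    if (!fake_test_bit(bm, n))
        return 0;   /* already free: leave the count untouched */

    bm->bits[n / CHAR_BIT] &= (unsigned char)~(1u << (n % CHAR_BIT));
    bm->free_bits++;
    return 1;
}
```

Releasing the same bit twice leaves `free_bits` unchanged on the second call, so the count cannot drift above `total_bits` through repeated releases.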

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
bast queued and flushed,before the ast was queued Unlikely with o2dlm. dlmthread always sends ASTs before BASTs. Can you recreate the entire lockres? A full dump may yield more information. Sunil On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote: I am trying to fix bug13611997,CT's

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
Moreover what is lockres_clear_pending doing in 1.4. That code is not meant for 1.4. It fixes a problem associated with fsdlm. It was left out of 1.4 for a reason. Meaning this bug was introduced by the patch that introduced this one in 1.4. On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote:

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
hold spin locks, then it start to execute the proxy ast handler , process bast request from nodeB, then dlmthread flushed the bast, after this node A start to queue its ast in ocfs2_dlm_lock() function. Thanks, Xiaowei On 02/22/2012 01:48 AM, Sunil Mushran wrote: bast queued and flushed

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
= 0x8105d2a284a8 }, l_lock_num_prmode = 0, l_lock_num_exmode = 0, l_lock_num_prmode_failed = 0, l_lock_num_exmode_failed = 0, l_lock_total_prmode = 0, l_lock_total_exmode = 0, l_lock_max_prmode = 0, l_lock_max_exmode = 0, l_lock_refresh = 0 } On 02/22/2012 08:45 AM, Sunil Mushran wrote: Both AST

Re: [Ocfs2-devel] [patch] ocfs2: cleanup error handling in o2hb_alloc_hb_set()

2012-02-13 Thread Sunil Mushran
hmm... I would say NAK because config_group_init_type_name() could change in the future. And there is nothing wrong with the current code. On 02/13/2012 05:50 AM, Dan Carpenter wrote: If ret is NULL, then hs is also NULL, so there is no need to free it. config_group_init_type_name() can't fail

Re: [Ocfs2-devel] [patch] ocfs2: cleanup error handling in o2hb_alloc_hb_set()

2012-02-13 Thread Sunil Mushran
On 02/13/2012 12:29 PM, Dan Carpenter wrote: On Mon, Feb 13, 2012 at 12:04:09PM -0800, Joel Becker wrote: On Mon, Feb 13, 2012 at 04:50:47PM +0300, Dan Carpenter wrote: If ret is NULL, then hs is also NULL, so there is no need to free it. config_group_init_type_name() can't fail if the name

Re: [Ocfs2-devel] [PATCH] ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.

2012-02-09 Thread Sunil Mushran
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com On 02/08/2012 10:42 PM, Jeff Liu wrote: Hello, Since ENXIO only means offset beyond EOF for SEEK_DATA/SEEK_HOLE, Hence we should return the internal error unchanged if ocfs2_inode_lock() or ocfs2_get_clusters_nocache() call failed rather

Re: [Ocfs2-devel] [PATCH 1/1] ocfs2: use spinlock irqsave for downconvert lock.patch

2012-01-31 Thread Sunil Mushran
sob On 01/30/2012 09:51 PM, Srinivas Eeda wrote: When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ it deadlock itself trying to get same spinlock in ocfs2_wake_downconvert_thread. Below is the stack snippet. The patch disables interrupts when acquiring dc_task_lock

[Ocfs2-devel] [PATCH 1/1] ocfs2: Fix oops in fallocate()

2012-01-30 Thread Sunil Mushran
fallocate() was oopsing on ocfs2 because we were passing in a NULL file pointer. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/file.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 061591a..8f30e74 100644

Re: [Ocfs2-devel] [PATCH 1/1] ocfs2: use spinlock irqsave for downconvert lock.patch

2012-01-30 Thread Sunil Mushran
Comments inlined. On 01/28/2012 06:13 PM, Srinivas Eeda wrote: When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ for I/O completion it deadlock itself trying to get same spinlock in ocfs2_wake_downconvert_thread The patch disables interrupts when acquiring dc_task_lock

Re: [Ocfs2-devel] Question about incorrect free bits setting

2012-01-18 Thread Sunil Mushran
We've seen this too. The problem happens because of the patch added to delay dropping of the dentry locks (first patch below). The other two are related. It was added to avoid a deadlock in quotas but adds problems of its own. Srini has studied this issue and may be able to expand on this. The

[Ocfs2-devel] pull request

2012-01-13 Thread Sunil Mushran
Joel, Please pull 6 patches (bug fixes) from the following repo. git://oss.oracle.com/git/smushran/linux-2.6.git mw-3.3-jan13 BTW, not sure if I emailed before but we have to rollback 3 patches related to deletes. These patches were added to fix deadlocks with quotas. Well, it has just broken

Re: [Ocfs2-devel] [PATCH][ocfs2/dlm] fix dlm_clean_master_list

2011-12-22 Thread Sunil Mushran
On 12/20/2011 06:49 PM, Wengang Wang wrote: This is a fix on dlm_clean_master_list() During the hash table browsing, we remove mle from hash table then free the memory on the last reference. So we have to use a _safe() version of the browsing function when doing that. This fixes Orabug
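The reason a _safe() traversal is needed when entries are freed during the walk can be shown with a plain singly linked list. This is a sketch under assumptions, not the dlm_clean_master_list() code: `fake_mle`, `fake_push()` and `fake_clean_list()` are hypothetical, and the kernel's own list and hlist _safe iterators are not used here. The key point is that the next pointer is saved before the current entry may be freed.

```c
#include <stdlib.h>

/* Hypothetical list entry; dead entries are removed and freed. */
struct fake_mle {
    int id;
    int dead;
    struct fake_mle *next;
};

/* Pushes a new entry at the front; returns the new head. */
static struct fake_mle *fake_push(struct fake_mle *head, int id, int dead)
{
    struct fake_mle *m = malloc(sizeof(*m));

    if (m == NULL)
        return head;
    m->id = id;
    m->dead = dead;
    m->next = head;
    return m;
}

/* Removes and frees every dead entry; returns how many were freed. */
static int fake_clean_list(struct fake_mle **head)
{
    struct fake_mle **link = head;
    struct fake_mle *cur, *next;
    int freed = 0;

    for (cur = *head; cur != NULL; cur = next) {
        next = cur->next;   /* saved before cur may be freed */
        if (cur->dead) {
            *link = next;   /* unlink, then free */
            free(cur);
            freed++;
        } else {
            link = &cur->next;
        }
    }
    return freed;
}

static int fake_count(const struct fake_mle *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Advancing with `cur = cur->next` in the loop header instead would read from memory that has just been freed whenever a dead entry is removed.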

Re: [Ocfs2-devel] [PATCH] ocfs2: submit disk heartbeat bio using WRITE_SYNC

2011-12-08 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@oracle.com On 12/05/2011 09:18 PM, Tao Ma wrote: On 12/06/2011 12:57 PM, Noboru Iwamatsu wrote: Under heavy I/O load, writing the disk heartbeat can be forced to wait for minutes, and this causes the node to be fenced. This patch tries to use WRITE_SYNC

[Ocfs2-devel] [PATCH 2/6] ocfs2: Add missing copyright in few files

2011-11-17 Thread Sunil Mushran
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/mmap.h | 18 ++ fs/ocfs2/ocfs2_trace.h | 19 +++ fs/ocfs2/quota.h| 16 ++-- fs/ocfs2/quota_global.c | 20 ++-- fs/ocfs2/quota_local.c | 19

[Ocfs2-devel] [PATCH 4/6] ocfs2/dlm: Use track_lock when manipulating tracking_list

2011-11-17 Thread Sunil Mushran
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock to protect operations on dlm->tracking_list. But it was still using the older lock (dlm->spin_lock) to add new resources to the list. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/dlm/dlmmaster.c |4

[Ocfs2-devel] [PATCH 5/6] ocfs2/dlm: Fix list traversal in dlm_process_recovery_data

2011-11-17 Thread Sunil Mushran
averting the case in which lock is set to NULL. Reported-by: Julia Lawall ju...@diku.dk Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/dlm/dlmrecovery.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm

[Ocfs2-devel] [PATCH 1/6] ocfs2/cluster: Fix possible null pointer dereference

2011-11-17 Thread Sunil Mushran
Patch fixes some possible null pointer dereferences that were detected by the static code analyser, smatch. Reported-by: Dan Carpenter erro...@gmail.com Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/cluster/tcp.c | 10 +- 1 files changed, 5 insertions(+), 5

[Ocfs2-devel] [PATCH 3/6] ocfs2: Silence message in ocfs2_global_read_info()

2011-11-17 Thread Sunil Mushran
This patch silences this message. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/quota_global.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index b24eab3..2a3f12c 100644 --- a/fs/ocfs2

[Ocfs2-devel] [PATCH 6/6] ocfs2: Tighten free bit calculation in the global bitmap

2011-11-17 Thread Sunil Mushran
exceeds the total bit count. In each instance the bitmap is correct. Only the free bit count is incorrect. This patch checks the current bit value and increments the free bit count only if the bit was previously set. It also prints information to allow us to debug further. Signed-off-by: Sunil Mushran

Re: [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
fstype is a handy way to format the volume with parameters that are thought to be useful for that use-case. The result of this is printed during format by way of the parameters selected. man mkfs.ocfs2 has a blurb about the features it enabled by default. On 11/16/2011 08:45 AM, Artur Baruchi

Re: [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
Yes. But this is just the features. It also selects the appropriate cluster size, block size, journal size, etc. All the params selected are printed by mkfs. You also have the option of running with the --dry-run option to see the params. On 11/16/2011 09:41 AM, Artur Baruchi wrote: I just found

Re: [Ocfs2-devel] [PATCH 1/2] fs/ocfs2/dlm: Eliminate update of list_for_each_entry loop cursor

2011-11-02 Thread Sunil Mushran
I think it got lost in the shuffle. We had decided to use the list_for_each(). The code is simpler to understand than the other proposed fix. Joel, do you want me to send a patch? On 11/02/2011 12:39 AM, Dan Carpenter wrote: What ever happened with this? The bug is still there in the latest

[Ocfs2-devel] Fwd: [PATCH] ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2

2011-10-19 Thread Sunil Mushran
Joel, Please add this to the linux-next branch. Original Message Subject:[Ocfs2-devel] [PATCH] ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2 Date: Wed, 19 Oct 2011 09:34:19 +0800 From: xiaowei...@oracle.com To: ocfs2-devel@oss.oracle.com CC:

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-14 Thread Sunil Mushran
On 10/14/2011 01:57 AM, Wengang Wang wrote: Problem reproduced(against mainline) with the above patch applied. Also with the hacking patch(attached). testcase is attached. (kworker/u:2,14465,1):dlm_assert_master_handler:1828 ERROR: DIE! Mastery assert from 0, but current owner is 1!

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
The last email you said it reproduced. Now you say it did not. I'm confused. On 10/12/2011 07:13 PM, Wengang Wang wrote: On 11-10-12 19:11, Sunil Mushran wrote: That's what ovm does. Have you reproduced it with ovm3 kernel? No, I have no reproductions. thanks, wengang. On 10/12/2011 07:07

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
which kernel? On 10/13/2011 04:35 PM, Wengang Wang wrote: On 11-10-13 09:09, Sunil Mushran wrote: The last email you said it reproduced. Now you say it did not. I'm confused. Oh? Did I. If I did, I meant it had reproductions in different customers's ENV, I had no reproduction in house

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=commitdiff;h=ff0a522e7db79625aa27a433467eb94c5e255718 Are you sure you have this patch? On 10/13/2011 05:19 PM, Wengang Wang wrote: 2.6.18-128. thanks, wengang. On 11-10-13 16:37, Sunil Mushran wrote: which kernel? On 10/13/2011 04

Re: [Ocfs2-devel] [PATCH] ocfs2: Commit transactions in error cases -v2

2011-10-12 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@oracle.com On 10/12/2011 12:22 AM, Wengang Wang wrote: There are three cases found that in error cases, journal transactions are not committed nor aborted. We should take care of these case by committing the transactions. Otherwise, there would left

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-12 Thread Sunil Mushran
I meant master_request (not query). We set refmap _before_ asserting. So that should not happen. On 10/12/2011 06:02 PM, Wengang Wang wrote: Hi Sunil, On 11-10-12 17:32, Sunil Mushran wrote: So you are saying a lockres can get purged before the node is asserting master to other nodes

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-12 Thread Sunil Mushran
That's what ovm does. Have you reproduced it with ovm3 kernel? On 10/12/2011 07:07 PM, Wengang Wang wrote: On 11-10-13 09:51, Wengang Wang wrote: On 11-10-12 18:47, Sunil Mushran wrote: I meant master_request (not query). We set refmap _before_ asserting. So that should not happen. Why can't

Re: [Ocfs2-devel] [PATCH] ocfs2: Commit transactions in error cases.

2011-10-11 Thread Sunil Mushran
The first two are ok. Have a comment for the last one. On 09/25/2011 02:13 AM, Wengang Wang wrote: Commit transactions in error cases. There are three cases found that in error cases, journal transactions are not committed nor aborted. We should take care of these case by committing the

[Ocfs2-devel] [PATCH 1/1] ocfs2/dlm: Use dlm->track_lock when adding resource to the tracking list

2011-10-07 Thread Sunil Mushran
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock to protect operations on dlm->tracking_list. But it was still using the older lock (dlm->spin_lock) to add new resources to the list. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/dlm/dlmmaster.c |4

Re: [Ocfs2-devel] SEEK_DATA/SEEK_HOLE support for OCFS2

2011-09-29 Thread Sunil Mushran
It's in the q. Check jlbec's tree on oss.oracle.com. On 09/29/2011 06:19 AM, Jeff Liu wrote: Hello, I'd like to know anyone has started to write a patch for adding SEEK_DATA/SEEK_HOLE support to OCFS2 yet? If nobody working on it now, I'd like to implement it, actually, I have already

Re: [Ocfs2-devel] [PATCH] Wakeup down-convert thread just after clearing OCFS2_LOCK_UPCONVERT_FINISHING -v3

2011-09-22 Thread Sunil Mushran
Acked-by: Sunil Mushran sunil.mush...@oracle.com On 09/22/2011 05:49 PM, Wengang Wang wrote: When the lockres state UPCONVERT_FINISHING is cleared, we should wake up the downconvert thread in case that lockres is in the blocked queue. Currently we are not doing so and thus are at the mercy
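A rough userspace sketch of the idea in the patch description above, not the dlmglue code itself: whenever the UPCONVERT_FINISHING flag is cleared, kick the downconvert thread, because the lockres may be sitting in the blocked queue. The struct, the flag value, and both function names are hypothetical, and the wakeup is modelled as a simple counter.

```c
/* Hypothetical flag value; the kernel's OCFS2_LOCK_UPCONVERT_FINISHING may differ. */
#define SKETCH_UPCONVERT_FINISHING 0x1u

/* Hypothetical stand-in for a lock resource; not struct ocfs2_lock_res. */
struct lockres_sketch {
    unsigned int flags;  /* state bits of the lock resource */
    int wakeups;         /* counts how often the downconvert thread was kicked */
};

/* Stand-in for waking the downconvert thread. */
void sketch_wake_downconvert(struct lockres_sketch *l)
{
    l->wakeups++;
}

/*
 * Clear the flag and wake the downconvert thread right away, instead of
 * relying on some unrelated later event to do the wakeup.
 */
void sketch_clear_upconvert_finishing(struct lockres_sketch *l)
{
    l->flags &= ~SKETCH_UPCONVERT_FINISHING;
    sketch_wake_downconvert(l);
}
```

In this sketch the wakeup is unconditional; a spurious wakeup is assumed to be harmless, since the woken thread simply finds nothing to do.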

Re: [Ocfs2-devel] Some fsck perf numbers

2011-09-21 Thread Sunil Mushran
/cache: 0MB / 0MB, write: 0MB, rate: 0.00MB/s Times real: 33.382s, user: 33.374s, sys: 0.001s On 09/16/2011 02:25 PM, Sunil Mushran wrote: I have been playing with fsck.ocfs2. Performance-wise. Have some interesting

[Ocfs2-devel] Some fsck perf numbers

2011-09-16 Thread Sunil Mushran
I have been playing with fsck.ocfs2. Performance-wise. Have some interesting numbers to share. This volume is 2T in size with 1.5 million files. Many exploded kernels trees + some large files. The particulars are listed below. I did 3 runs. The first set of numbers are vanilla fsck. In the

Re: [Ocfs2-devel] [PATCH] Wakeup down-convert thread just after clearing OCFS2_LOCK_UPCONVERT_FINISHING -v3

2011-09-15 Thread Sunil Mushran
I am fine with the kick in recover from dlm error. Not so in cluster lock. We have to be very very sure before meddling with that function. It is a state machine with many hidden gotchas. So is this patch for a bug encountered or just code audit? Also, what kind of testing has been done? On

Re: [Ocfs2-devel] [PATCH] Wakeup down-convert thread just after clearing OCFS2_LOCK_UPCONVERT_FINISHING -v3

2011-09-15 Thread Sunil Mushran
http://people.redhat.com/~teigland/make_panic This test has been useful in exposing dlmglue issues. On 09/15/2011 10:15 AM, Sunil Mushran wrote: I am fine with the kick in recover from dlm error. Not so in cluster lock. We have to be very very sure before meddling with that function

Re: [Ocfs2-devel] [PATCH] Wakeup down-convert thread just after clearing OCFS2_LOCK_UPCONVERT_FINISHING -v3

2011-09-15 Thread Sunil Mushran
On 09/15/2011 05:42 PM, Wengang Wang wrote: Hi Sunil, On 11-09-15 10:21, Sunil Mushran wrote: http://people.redhat.com/~teigland/make_panic This test has been useful in exposing dlmglue issues. On 09/15/2011 10:15 AM, Sunil Mushran wrote: I am fine with the kick in recover from dlm error

Re: [Ocfs2-devel] [PATCH] Wakeup down-convert thread just after clearing OCFS2_LOCK_UPCONVERT_FINISHING -v2

2011-09-14 Thread Sunil Mushran
A better description would be: = When the lockres state UPCONVERT_FINISHING is cleared, we should wake up the downconvert thread in case that lockres is in the blocked queue. Currently we are not doing so and thus are at the mercy of another event waking up the

Re: [Ocfs2-devel] Can recovery be done in process context (as opposed to kthread)?

2011-09-10 Thread Sunil Mushran
On 09/09/2011 03:22 PM, Goldwyn Rodrigues wrote: Hi, I finally got back to improve the recovery procedure by offloading work to work queues. However, I would like to know if we can completely do away with ocfs2rec kthread. The process would just mark the nodes which need recovery and offload

Re: [Ocfs2-devel] block64 failure

2011-09-07 Thread Sunil Mushran
On 09/07/2011 10:57 AM, Joel Becker wrote: On Wed, Sep 07, 2011 at 10:10:43AM -0700, Sunil Mushran wrote: All, So the patches added to allow mounting volumes 16TB has a problem. The feature check of the jbd2 superblock is being done before the jbd2 superblock is actually read. It is being
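The ordering problem described in this message can be illustrated with a small sketch: a feature check is only meaningful once the superblock has been read. This is an assumption-laden userspace model, not the jbd2 or ocfs2 code; the struct, the flag value, and the function names are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit; not the real jbd2 incompat flag value. */
#define SKETCH_FEATURE_64BIT 0x2u

/* Hypothetical stand-in for a journal handle; not the kernel's journal_t. */
struct journal_sketch {
    bool sb_loaded;     /* true once the superblock has been read */
    uint32_t incompat;  /* feature bits; only valid when sb_loaded is true */
};

/* Stand-in for reading the on-disk journal superblock. */
void sketch_journal_load_sb(struct journal_sketch *j, uint32_t on_disk_incompat)
{
    j->incompat = on_disk_incompat;
    j->sb_loaded = true;
}

/*
 * Returns 1 if the feature is set, 0 if it is not, and -1 if the
 * superblock has not been read yet (the check would be meaningless).
 */
int sketch_journal_has_64bit(const struct journal_sketch *j)
{
    if (!j->sb_loaded)
        return -1;
    return (j->incompat & SKETCH_FEATURE_64BIT) ? 1 : 0;
}
```

A caller that checks before loading gets -1 rather than a misleading "feature absent" answer, which is the failure mode the message describes.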

Re: [Ocfs2-devel] ocfs2/cluster: Clean up messages in o2net

2011-08-30 Thread Sunil Mushran
Thanks. I'll fix the two. On 08/25/2011 06:01 PM, Dan Carpenter wrote: Hello Sunil Mushran, 1dfecf810e0e: ocfs2/cluster: Clean up messages in o2net Leads to the following Smatch complaint: fs/ocfs2/cluster/tcp.c +1704 o2net_start_connect(101) error: we previously assumed 'sc' could

Re: [Ocfs2-devel] [PATCH] ocfs2: unlock open_lock immediately

2011-08-30 Thread Sunil Mushran
Comments inlined. BTW, how common place is this race in your testing? If you can answer that, I would like to also know how you arrived at it. On 08/25/2011 07:50 PM, Wengang Wang wrote: There is a race between 2(+) nodes that calls iput_final() on same inode. time sequence is like the

Re: [Ocfs2-devel] ocfs2 cleanups, bug fixes for Linux kernel 3.1 - final drop

2011-08-21 Thread Sunil Mushran
On 08/21/2011 09:23 PM, Joel Becker wrote: On Thu, Jul 14, 2011 at 03:50:11PM -0700, Sunil Mushran wrote: Joel, Hopefully this is the final drop before the 3.1 merge window. I had posted most of the patches before. In fact the first 12 are exactly the same. The 13th one is modified

Re: [Ocfs2-devel] mount.ocfs2: Invalid argument while mounting /dev/mapper/xenconfig_part1 on /etc/xen/vm/. Check 'dmesg' for more information on this error.

2011-07-14 Thread Sunil Mushran
On 07/14/2011 02:04 AM, Tao Ma wrote: On 07/14/2011 03:47 PM, r.giord...@libero.it wrote: Hello, this is my scenario: 1)I've created a Pacemaker cluster with the following ocfs package on opensuse 11.3 64bit ocfs2console-1.8.0-2.1.x86_64 ocfs2-tools-o2cb-1.8.0-2.1.x86_64

[Ocfs2-devel] [PATCH 11/16] ocfs2/cluster: Fix output in file elapsed_time_in_ms

2011-07-14 Thread Sunil Mushran
The o2hb debugfs file, elapsed_time_in_ms, should return values only after the timer is armed at least once. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/cluster/heartbeat.c |9 ++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/cluster
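A minimal sketch of the behaviour this patch description asks for, assuming the file should report 0 until the timer has been armed for the first time (the snippet does not say what is reported before that, so the 0 is an assumption). The struct and function names are hypothetical and do not come from heartbeat.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the heartbeat timer state; not the o2hb structures. */
struct hb_timer_sketch {
    bool armed_once;       /* set the first time the timer is armed */
    uint64_t armed_at_ms;  /* timestamp of the most recent arm, in ms */
};

/* Record that the timer was armed at now_ms. */
void sketch_hb_timer_arm(struct hb_timer_sketch *t, uint64_t now_ms)
{
    t->armed_once = true;
    t->armed_at_ms = now_ms;
}

/*
 * Elapsed time since the last arm. Before the first arm there is no
 * meaningful start time, so report 0 (assumed) instead of a bogus delta.
 */
uint64_t sketch_hb_elapsed_ms(const struct hb_timer_sketch *t, uint64_t now_ms)
{
    if (!t->armed_once)
        return 0;
    return now_ms - t->armed_at_ms;
}
```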

[Ocfs2-devel] [PATCH 14/16] ocfs2: Clean up messages in the fs

2011-07-14 Thread Sunil Mushran
Convert useful messages from ML_NOTICE to KERN_NOTICE to improve readability. Signed-off-by: Sunil Mushran sunil.mush...@oracle.com --- fs/ocfs2/journal.c |9 ++--- fs/ocfs2/quota_local.c | 13 + fs/ocfs2/slot_map.c|4 ++-- fs/ocfs2/super.c | 10
