Sure. It will need to be tested appropriately.
On Thu, Sep 10, 2015 at 4:49 AM, Joseph Qi wrote:
> Hi Junxiao & Sunil,
> Your comments would be appreciated.
>
> Thanks,
> Joseph
>
> On 2015/9/6 21:11, Joseph Qi wrote:
> > Comments for dlm_dispatch_work is described below:
This is because you are specifying a 128k cluster size. Refer to man
mkfs.ocfs2 for more.
On Mar 17, 2015 8:04 PM, Umarzuki Mochlis umarz...@gmail.com wrote:
Hi,
What I meant by total size is output of 'du -hs'
I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical
volume of
What is the output of the commands? The protocol is supposed to do the
unlocking on its own. See what it is blocked on. It could be that the node
that has the lock cannot unlock it because it cannot flush the journal to
disk.
On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua guozhong...@h3c.com wrote:
or not.
Sunil
On Tue, Aug 26, 2014 at 6:57 PM, Xue jiufei xuejiu...@huawei.com wrote:
Hi, Sunil
On 2014/8/26 1:13, Sunil Mushran wrote:
On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi joseph...@huawei.com
mailto:joseph...@huawei.com wrote:
On 2014/8/25 13:45, Sunil Mushran wrote
On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi joseph...@huawei.com wrote:
On 2014/8/25 13:45, Sunil Mushran wrote:
Please could you expand on that.
In our scenario, one node can mount multiple volumes across the
cluster.
For instance, N1 has mounted ocfs2 volumes say volume1, volume2
Functions in dlmdomain.c are only triggered during mount. So they cannot
trigger the deadlock as described above in this thread. I would leave them
as is.
On Aug 24, 2014 7:06 PM, Xue jiufei xuejiu...@huawei.com wrote:
Hi Sunil,
On 2014/8/23 1:08, Sunil Mushran wrote:
Allocs made via GFP_NOFS
Please could you expand on that.
On Aug 24, 2014 10:42 PM, Joseph Qi joseph...@huawei.com wrote:
On 2014/8/25 13:00, Sunil Mushran wrote:
Functions in dlmdomain.c are only triggered during mount. So they cannot
trigger the deadlock as described above in this thread. I would leave them
Allocs made via GFP_NOFS, by definition, should not trigger any reclaim
from the fs.
So this situation should never arise. That's why all allocs in the dlm have
NOFS.
___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
You may want to do the same for the version file in dlm, dlmfs, etc.
On Tue, Nov 26, 2013 at 11:28 AM, Goldwyn Rodrigues rgold...@suse.de wrote:
The versioning information is confusing for end-users. The numbers
are stuck at 1.5.0 when the tools have moved to 1.8.3.
I suggest removing the
Acked-by: Sunil Mushran sunil.mush...@gmail.com
You may want to re-add removed MODULE_DESCRIPTION with a short blurb in
some existing files.
On Tue, Nov 26, 2013 at 3:37 PM, Goldwyn Rodrigues rgold...@suse.de wrote:
The versioning information is confusing for end-users. The numbers
are stuck
I see no need for a separate function. Just do
} else if (res->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
        if (test_bit(node, res->refmap))
                dlm_lockres_clear_refmap_bit(dlm, res, node);
}
On Thu, Aug 1, 2013 at 5:05 AM, Xue jiufei xuejiu...@huawei.com wrote:
Function
What's the reasoning behind this patch?
On Jul 31, 2013, at 3:51 AM, Guozhonghua guozhong...@h3c.com wrote:
Hi,
While reviewing and testing the heartbeat code, I found some code that may
not be correct.
The heart beat writing onto disk.
I have another question that why not encapsulate the
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Fri, Jun 28, 2013 at 1:47 PM, Andrew Morton a...@linux-foundation.org wrote:
On Sun, 23 Jun 2013 18:39:16 +0800 Jeff Liu jeff@oracle.com wrote:
Hi Jiufei,
On 06/20/2013 07:13 PM, Xue jiufei wrote:
Function dlmlock_master
NAK. Current code looks ok.
On Fri, Jun 28, 2013 at 1:49 PM, Andrew Morton a...@linux-foundation.org wrote:
Folks, 3.10 is nigh. Could we please have some review and test of this
patch?
From: Younger Liu younger@huawei.com
Subject: ocfs2: should call ocfs2_journal_access_di() before
AFAIR, this behavior has been there since day 1 and changing it will impact
performance negatively. I would recommend against making this change for
one app.
On Wed, Jun 26, 2013 at 6:50 PM, shencanquan shencanq...@huawei.com wrote:
On 2013/6/27 9:25, Andrew Morton wrote:
On Thu, 27 Jun
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Wed, May 22, 2013 at 8:50 AM, Joseph Qi joseph...@huawei.com wrote:
In dlm_request_all_locks, ret is type enum. But o2net_send_message
returns a type int value. Then it will never run into the following
error branch. So we should change
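A minimal userspace sketch of the type problem described in this snippet. The names send_message_sketch and request_locks_sketch are hypothetical stand-ins, not the kernel functions; the only assumption taken from the text is that the send routine returns a negative int on failure.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a send routine that, like the one described
 * above, reports failure as a negative int.
 */
int send_message_sketch(int fail)
{
    return fail ? -ENOTCONN : 0;
}

/*
 * If the result were stored in an enum whose enumerators are all
 * non-negative, a compiler may give that enum an unsigned underlying
 * type, and a later "ret < 0" test would then never be true.
 * Keeping the result in a plain int preserves the sign.
 */
int request_locks_sketch(int fail)
{
    int ret = send_message_sketch(fail);

    if (ret < 0)
        return ret; /* the error branch is reachable */
    return 0;
}
```

Calling request_locks_sketch(1) returns the negative error unchanged, which is the behavior the patch description is after.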
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Mon, May 20, 2013 at 2:36 AM, Joseph Qi joseph...@huawei.com wrote:
Below 3 functions have already been declared in dlmcommon.h, so we have
no need to declare them again in dlmrecovery.c.
dlm_complete_recovery_thread
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Tue, May 21, 2013 at 7:44 PM, shencanquan shencanq...@huawei.com wrote:
On 2013/5/22 10:38, xiaowei.hu wrote:
if there is error happen in , for example EIO in
__ocfs2_prepare_orphan_dir, ocfs2_prep_new_orphaned_file will release
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Mon, May 20, 2013 at 8:06 AM, Goldwyn Rodrigues rgold...@gmail.com wrote:
While removing a non-empty directory, the kernel dumps a message:
(rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39
Suppress the error message from being printed
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Tue, May 14, 2013 at 12:08 AM, Joseph Qi joseph...@huawei.com wrote:
Last time we found there is a lock/unlock bug in ocfs2_file_aio_write,
and then we did a thorough search for all lock resources in
ocfs2_inode_info, including rw, inode
The first node that gets the lock will do the actual recovery. The others will
get the lock and see a clean journal and skip the recovery. A thread should
never error out if it fails to get the lock. It should try and try again.
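The rule above (retry until the lock is obtained, then replay only if the journal is still dirty) can be sketched in single-threaded userspace C. Everything here is hypothetical: recovery_lock_sketch, journal_sketch, and the transient_failures counter only simulate failed lock attempts and are not ocfs2 structures.

```c
#include <stdbool.h>

struct recovery_lock_sketch {
    bool held;
    int transient_failures; /* simulated failed attempts before success */
};

struct journal_sketch {
    bool dirty;
    int replays; /* how many times the journal was replayed */
};

bool sketch_try_lock(struct recovery_lock_sketch *lk)
{
    if (lk->transient_failures > 0) {
        lk->transient_failures--;
        return false;
    }
    if (lk->held)
        return false;
    lk->held = true;
    return true;
}

void sketch_unlock(struct recovery_lock_sketch *lk)
{
    lk->held = false;
}

/* Returns true if this caller did the actual recovery. */
bool sketch_recover(struct recovery_lock_sketch *lk, struct journal_sketch *jn)
{
    bool replayed = false;

    /* Never error out on a failed attempt: try and try again. */
    while (!sketch_try_lock(lk))
        continue;

    if (jn->dirty) {
        /* First holder of the lock replays the journal. */
        jn->replays++;
        jn->dirty = false;
        replayed = true;
    }
    /* Later holders see a clean journal and skip the replay. */
    sketch_unlock(lk);
    return replayed;
}
```

A real implementation would sleep or wait between attempts rather than spin; the loop is kept bare to show only the retry-then-check ordering.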
On May 17, 2013, at 11:27 PM, Joseph Qi joseph...@huawei.com
Resending as my reply bounced.
On Thu, May 9, 2013 at 10:01 AM, Sunil Mushran sunil.mush...@gmail.com wrote:
A better fix is to _not_ disconnect on o2net timeout once a connection has
been
cleanly established. Only disconnect on o2hb timeout.
The reconnects are a problem as we could lose
Looks good to me.
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Mon, May 6, 2013 at 7:43 AM, Joseph Qi joseph...@huawei.com wrote:
In ocfs2_file_aio_write, it does ocfs2_rw_lock first and then
ocfs2_inode_lock. But if ocfs2_inode_lock failed, it goes to out_sems
without unlocking rw
Do you know under what conditions it creates a new lock when it should
not?
This code should only trigger if the lockres is/was mastered on another
node.
Meaning this node will not know about the newlock. Meaning that code should
never trigger.
1949 if (lock->ml.cookie
NAK.
hb_task is a local variable that is not even accessed after kthread_stop().
The oops is in kthread_stop(). Points to a problem with get/put in
task_struct.
Not an ocfs2 issue.
On Mon, Dec 3, 2012 at 7:18 PM, xiaowei...@oracle.com wrote:
From: Xiaowei.Hu xiaowei...@oracle.com
Pid:
On Wed, Aug 22, 2012 at 9:01 PM, Jie Liu jeff@oracle.com wrote:
BTW, Sunil mentioned there is already an IO priority patch set that has not
yet been merged. However, I only found
an old post from 2006 at:
http://www.digipedia.pl/usenet/thread/11947/7120/
Am I missing something?
No, I
On Wed, Aug 22, 2012 at 8:44 PM, Tao Ma t...@tao.ma wrote:
I guess the final solution will be WRITE_FUA, and I see btrfs uses it to
write out the superblock. It will be handled differently by the
underlying block layer so that it will not be in the elevator queue. It
should work but I am not
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Wed, Aug 22, 2012 at 2:38 AM, Jeff Liu jeff@oracle.com wrote:
Not sure if this patch does make sense or not, but it could make the
signature of those routines
in a consistent manner with others for heartbeating.
CC: Sunil Mushran
Yes. WRITE_SYNC should be good. Not FUA.
Also, you may want to look into using io priorities. The code is all there.
Just needs activation.
On Wed, Aug 22, 2012 at 10:13 AM, srinivas eeda srinivas.e...@oracle.com wrote:
On 8/22/2012 7:17 AM, Jie Liu wrote:
Hi All,
These days, I am
On Tue, Aug 14, 2012 at 11:28 PM, Xue jiufei xuejiu...@huawei.com wrote:
Sorry, I haven't described it clearly.
We trigger the BUG() in dlmrecovery.c:1923.
Lockres had copied lvb from previous valid locks and then met with
another lock with the EX level.
1907
30, 2011 at 02:14:04PM -0700, Sunil Mushran wrote:
Thanks. I'll fix the two.
On 08/25/2011 06:01 PM, Dan Carpenter wrote:
Hello Sunil Mushran,
1dfecf810e0e: ocfs2/cluster: Clean up messages in o2net
Leads to the following Smatch complaint:
fs/ocfs2/cluster/tcp.c +1704
On Mon, Aug 13, 2012 at 7:03 PM, Xue jiufei xuejiu...@huawei.com wrote:
A parallel umount on 4 nodes triggered a bug in
dlm_process_recovery_data(). Here’s the situation:
On receiving a MIG_LOCKRES message, a node processes the locks in the migratable
lockres. It copies lvb from migratable lockres
Acked-by: Sunil Mushran sunil.mush...@gmail.com
On Mon, Aug 13, 2012 at 7:06 PM, Xue jiufei xuejiu...@huawei.com wrote:
We trigger a bug in __dlm_lockres_reserve_ast() when we umount 4 nodes
in parallel. The situation is as follows:
1) Node A migrates all lockres it owned (e.g. lockres
Thanks for your help.
On Fri, Aug 3, 2012 at 12:22 AM, Vincent ETIENNE v...@vetienne.net wrote:
Le 02/08/2012 23:08, Sunil Mushran a écrit :
On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE v...@vetienne.net wrote:
Hi
based on current git ( commit
From: Sunil Mushran smush...@yahoo.com
Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init.
Reported-and-Tested-by: Vincent Etienne vetie...@aprogsys.com
Signed-off-by: Sunil Mushran sunil.mush...@gmail.com
---
fs/ocfs2/symlink.c |2 +-
1 files changed, 1 insertions(+), 1
On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE v...@vetienne.net wrote:
Hi
based on current git ( commit 1a9b4993b70fb1884716902774dc9025b457760d )
and reverting commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1
commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1
Author: Al Viro
The fallocate() oops is probably the same that is fixed by this patch.
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=commit;h=a2118b301104a24381b414bc93371d666fe8d43a
Is in the list of patches that are ready to be pushed.
On Fri, Jul 27, 2012 at 1:32 PM, Mark Fasheh mfas...@suse.de wrote:
On Thu, Jul 26, 2012 at 04:05:05PM +0300, Dan Carpenter wrote:
My static checker complains that this is called with a spin_lock held
in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason
we have not
On Wed, Jul 11, 2012 at 1:51 AM, Joel Becker jl...@evilplan.org wrote:
On Wed, Jul 11, 2012 at 02:49:56PM +0800, Junxiao Bi wrote:
Signed-off-by: Junxiao Bi junxiao...@oracle.com
---
fs/ocfs2/dlm/dlmmaster.c |4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=mw-3.4-mar15
I had prepared some patches sometime ago that could be pushed to mainline.
Though some patches may need to be removed as they look to be in this list.
On Fri, Jul 6, 2012 at 12:44 AM, Joel Becker jl...@evilplan.org
On Tue, Jul 17, 2012 at 12:10 AM, Junxiao Bi junxiao...@oracle.com wrote:
In the target node of the dlm lock migration, the logic to find
the local dlm lock is wrong, it shouldn't change the loop variable
lock in the list_for_each_entry loop. This will cause a NULL-pointer
accessing crash.
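The loop-variable problem described above can be illustrated with a small userspace sketch. It uses a plain singly linked list rather than the kernel's list_for_each_entry, and mig_lock_sketch and find_lock_sketch are hypothetical names; the point is only that the search result lives in a separate variable so the cursor is never overwritten inside the loop.

```c
#include <stddef.h>

struct mig_lock_sketch {
    unsigned long cookie;
    struct mig_lock_sketch *next;
};

/*
 * Search for a lock by cookie. The cursor "iter" is only advanced by
 * the loop itself; the match is recorded in "found". Assigning NULL
 * to the cursor inside the body would make the loop's advance step
 * dereference a NULL pointer.
 */
struct mig_lock_sketch *find_lock_sketch(struct mig_lock_sketch *head,
                                         unsigned long cookie)
{
    struct mig_lock_sketch *iter;
    struct mig_lock_sketch *found = NULL;

    for (iter = head; iter != NULL; iter = iter->next) {
        if (iter->cookie == cookie) {
            found = iter;
            break;
        }
    }
    return found;
}
```

The caller checks the returned pointer for NULL instead of inspecting the loop cursor after the loop ends.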
On Tue, May 29, 2012 at 5:41 PM, Xiaowei xiaowei...@oracle.com wrote:
On 05/30/2012 06:09 AM, Sunil Mushran wrote:
I would suggest exploring adding this in dlm hb down event. Checking live
map all
over the place is hacky. We do it more than we should right now. Let's not
add to the
mess
On Thu, May 24, 2012 at 10:53 PM, xiaowei...@oracle.com wrote:
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 01ebfd0..62659e8 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -555,6 +555,7 @@ static int dlm_remaster_locks(struct
with the serialization accounting for writes. This patch
separates the handler functions to avoid this issue.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/aops.c | 44
1 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/fs/ocfs2
averting the case in which lock is set to NULL.
Reported-by: Julia Lawall ju...@diku.dk
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/dlm/dlmrecovery.c | 10 +-
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm
This patch silences this message.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/quota_global.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index b24eab3..2a3f12c 100644
--- a/fs/ocfs2
fallocate() was oopsing on ocfs2 because we were passing in a
NULL file pointer.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/file.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 061591a..8f30e74 100644
nlink_t was replaced as per the suggestion in the following link.
https://lkml.org/lkml/2012/2/2/577
Reported-by: Al Viro v...@zeniv.linux.org.uk
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/namei.c |6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock
to protect operations on dlm->tracking_list. But it was still using the
older lock (dlm->spin_lock) to add new resources to the list.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/dlm/dlmmaster.c |4
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/mmap.h | 18 ++
fs/ocfs2/ocfs2_trace.h | 19 +++
fs/ocfs2/quota.h| 16 ++--
fs/ocfs2/quota_global.c | 20 ++--
fs/ocfs2/quota_local.c | 19
exceeds the total bit count. In each instance
the bitmap is correct. Only the free bit count is incorrect.
This patch checks the current bit value and increments the free bit count
only if the bit was previously set. It also prints information to allow
us to debug further.
Signed-off-by: Sunil Mushran
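A minimal sketch of the check this patch description mentions: the free count is incremented only when the bit was actually set before being cleared. The structure and the sketch_* helpers are hypothetical userspace stand-ins, not the ocfs2 bitmap code.

```c
#include <limits.h>
#include <stdio.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

struct bitmap_sketch {
    unsigned long bits[4];
    unsigned int total;      /* number of valid bits */
    unsigned int free_count; /* number of clear bits */
};

void sketch_set_bit(struct bitmap_sketch *bm, unsigned int nr)
{
    bm->bits[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Clears the bit and returns its previous value (0 or 1). */
int sketch_test_and_clear_bit(struct bitmap_sketch *bm, unsigned int nr)
{
    unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
    unsigned long *word = &bm->bits[nr / SKETCH_BITS_PER_LONG];
    int was_set = (*word & mask) != 0;

    *word &= ~mask;
    return was_set;
}

/* Returns 1 if the bit was released, 0 if it was out of range or already clear. */
int sketch_release_bit(struct bitmap_sketch *bm, unsigned int nr)
{
    if (nr >= bm->total)
        return 0;
    if (!sketch_test_and_clear_bit(bm, nr)) {
        /* Log instead of counting, so free_count cannot pass total. */
        fprintf(stderr, "bit %u already clear (free %u of %u)\n",
                nr, bm->free_count, bm->total);
        return 0;
    }
    bm->free_count++;
    return 1;
}
```

Releasing the same bit twice leaves free_count unchanged on the second call, which keeps the count from exceeding the total.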
bast queued and flushed, before the ast was queued
Unlikely with o2dlm. dlmthread always sends ASTs before BASTs.
Can you recreate the entire lockres? A full dump may yield more
information.
Sunil
On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote:
I am trying to fix bug13611997,CT's
Moreover what is lockres_clear_pending doing in 1.4. That code
is not meant for 1.4. It fixes a problem associated with fsdlm.
It was left out of 1.4 for a reason.
Meaning this bug was introduced by the patch that introduced this
one in 1.4.
On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote:
hold spin locks,
then it starts to execute the proxy ast handler and processes the bast request
from nodeB,
then dlmthread flushes the bast; after this, node A starts to queue its
ast in the ocfs2_dlm_lock() function.
Thanks,
Xiaowei
On 02/22/2012 01:48 AM, Sunil Mushran wrote:
bast queued and flushed
= 0x8105d2a284a8
},
l_lock_num_prmode = 0,
l_lock_num_exmode = 0,
l_lock_num_prmode_failed = 0,
l_lock_num_exmode_failed = 0,
l_lock_total_prmode = 0,
l_lock_total_exmode = 0,
l_lock_max_prmode = 0,
l_lock_max_exmode = 0,
l_lock_refresh = 0
}
On 02/22/2012 08:45 AM, Sunil Mushran wrote:
Both AST
hmm... I would say NAK because config_group_item_type_name() could
change in the future. And there is nothing wrong with the current
code.
On 02/13/2012 05:50 AM, Dan Carpenter wrote:
If ret is NULL, then hs is also NULL, so there is no need to free
it. config_group_init_type_name() can't fail
On 02/13/2012 12:29 PM, Dan Carpenter wrote:
On Mon, Feb 13, 2012 at 12:04:09PM -0800, Joel Becker wrote:
On Mon, Feb 13, 2012 at 04:50:47PM +0300, Dan Carpenter wrote:
If ret is NULL, then hs is also NULL, so there is no need to free
it. config_group_init_type_name() can't fail if the name
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
On 02/08/2012 10:42 PM, Jeff Liu wrote:
Hello,
Since ENXIO only means the offset is beyond EOF for SEEK_DATA/SEEK_HOLE,
we should return the internal error unchanged if ocfs2_inode_lock() or
ocfs2_get_clusters_nocache() call failed rather
sob
On 01/30/2012 09:51 PM, Srinivas Eeda wrote:
When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ, it
deadlocks itself trying to get the same spinlock in ocfs2_wake_downconvert_thread.
Below is the stack snippet.
The patch disables interrupts when acquiring dc_task_lock
fallocate() was oopsing on ocfs2 because we were passing in a
NULL file pointer.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/file.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 061591a..8f30e74 100644
Comments inlined.
On 01/28/2012 06:13 PM, Srinivas Eeda wrote:
When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ for
I/O completion, it deadlocks itself trying to get the same spinlock in
ocfs2_wake_downconvert_thread
The patch disables interrupts when acquiring dc_task_lock
We've seen this too. The problem happens because of the patch added to delay
dropping of the dentry locks (first patch below). The other two are related.
It was added to avoid a deadlock in quotas but adds problems of its own.
Srini has studied this issue and may be able to expand on this. The
Joel,
Please pull 6 patches (bug fixes) from the following repo.
git://oss.oracle.com/git/smushran/linux-2.6.git mw-3.3-jan13
BTW, not sure if I emailed before but we have to rollback 3 patches
related to deletes. These patches were added to fix deadlocks with
quotas. Well, it has just broken
On 12/20/2011 06:49 PM, Wengang Wang wrote:
This is a fix on dlm_clean_master_list()
During the hash table browsing, we remove mle from hash table then free
the memory on the last reference. So we have to use a _safe() version
of the browsing function when doing that.
This fixes Orabug
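The need for a _safe() style traversal can be shown with a small userspace sketch. It uses a singly linked list and hypothetical names (mle_sketch, push_entry_sketch, clean_list_sketch) in place of the kernel hash-list helpers; the pattern is the same: save the next pointer before the current entry may be freed.

```c
#include <stdlib.h>

struct mle_sketch {
    int refs;
    struct mle_sketch *next;
};

/* Pushes a new entry at the head; returns it, or NULL if allocation fails. */
struct mle_sketch *push_entry_sketch(struct mle_sketch **head, int refs)
{
    struct mle_sketch *e = malloc(sizeof(*e));

    if (e == NULL)
        return NULL;
    e->refs = refs;
    e->next = *head;
    *head = e;
    return e;
}

/*
 * Drops one reference from every entry and frees those that reach zero.
 * "next" is read before the entry can be freed, so the walk never
 * touches freed memory. Returns the number of entries freed.
 */
int clean_list_sketch(struct mle_sketch **head)
{
    struct mle_sketch **link = head;
    struct mle_sketch *cur, *next;
    int freed = 0;

    for (cur = *head; cur != NULL; cur = next) {
        next = cur->next; /* saved before cur may go away */
        if (--cur->refs == 0) {
            *link = next; /* unlink, then free */
            free(cur);
            freed++;
        } else {
            link = &cur->next;
        }
    }
    return freed;
}
```

A non-safe walk would read cur->next after free(cur), which is the use-after-free the description is guarding against.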
Acked-by: Sunil Mushran sunil.mush...@oracle.com
On 12/05/2011 09:18 PM, Tao Ma wrote:
On 12/06/2011 12:57 PM, Noboru Iwamatsu wrote:
Under heavy I/O load, writing the disk heartbeat can be forced
to wait for minutes, and this causes the node to be fenced.
This patch tries to use WRITE_SYNC
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/mmap.h | 18 ++
fs/ocfs2/ocfs2_trace.h | 19 +++
fs/ocfs2/quota.h| 16 ++--
fs/ocfs2/quota_global.c | 20 ++--
fs/ocfs2/quota_local.c | 19
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock
to protect operations on dlm->tracking_list. But it was still using the
older lock (dlm->spin_lock) to add new resources to the list.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/dlm/dlmmaster.c |4
averting the case in which lock is set to NULL.
Reported-by: Julia Lawall ju...@diku.dk
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/dlm/dlmrecovery.c | 10 +-
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm
Patch fixes some possible null pointer dereferences that were detected by the
static code analyser, smatch.
Reported-by: Dan Carpenter erro...@gmail.com
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/cluster/tcp.c | 10 +-
1 files changed, 5 insertions(+), 5
This patch silences this message.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/quota_global.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index b24eab3..2a3f12c 100644
--- a/fs/ocfs2
exceeds the total bit count. In each instance
the bitmap is correct. Only the free bit count is incorrect.
This patch checks the current bit value and increments the free bit count
only if the bit was previously set. It also prints information to allow
us to debug further.
Signed-off-by: Sunil Mushran
fstype is a handy way to format the volume with parameters that are thought
to be useful for that use-case. The result of this is printed during format by
way of the parameters selected. man mkfs.ocfs2 has a blurb about the features
it enables by default.
On 11/16/2011 08:45 AM, Artur Baruchi
Yes. But this is just the features. It also selects the appropriate cluster
size, block size,
journal size, etc. All the params selected are printed by mkfs. You also have
the option of
running with the --dry-run option to see the params.
On 11/16/2011 09:41 AM, Artur Baruchi wrote:
I just found
I think it got lost in the shuffle. We had decided to use the list_for_each().
The code is simpler to understand than the other proposed fix.
Joel, do you want me to send a patch?
On 11/02/2011 12:39 AM, Dan Carpenter wrote:
What ever happened with this? The bug is still there in the latest
Joel, Please add this to the linux-next branch.
Original Message
Subject:[Ocfs2-devel] [PATCH] ocfs2: Add a missing journal credit in
ocfs2_link_credits() -v2
Date: Wed, 19 Oct 2011 09:34:19 +0800
From: xiaowei...@oracle.com
To: ocfs2-devel@oss.oracle.com
CC:
On 10/14/2011 01:57 AM, Wengang Wang wrote:
Problem reproduced(against mainline) with the above patch applied. Also with
the hacking
patch(attached).
testcase is attached.
(kworker/u:2,14465,1):dlm_assert_master_handler:1828 ERROR: DIE! Mastery
assert from 0, but current owner is 1!
The last email you said it reproduced. Now you say it did not.
I'm confused.
On 10/12/2011 07:13 PM, Wengang Wang wrote:
On 11-10-12 19:11, Sunil Mushran wrote:
That's what ovm does. Have you reproduced it with ovm3 kernel?
No, I have no reproductions.
thanks,
wengang.
On 10/12/2011 07:07
which kernel?
On 10/13/2011 04:35 PM, Wengang Wang wrote:
On 11-10-13 09:09, Sunil Mushran wrote:
The last email you said it reproduced. Now you say it did not.
I'm confused.
Oh? Did I? If I did, I meant it had reproductions in different customers'
environments,
I had no reproduction in house
http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=commitdiff;h=ff0a522e7db79625aa27a433467eb94c5e255718
Are you sure you have this patch?
On 10/13/2011 05:19 PM, Wengang Wang wrote:
2.6.18-128.
thanks,
wengang.
On 11-10-13 16:37, Sunil Mushran wrote:
which kernel?
On 10/13/2011 04
Acked-by: Sunil Mushran sunil.mush...@oracle.com
On 10/12/2011 12:22 AM, Wengang Wang wrote:
Three cases were found where, on error, journal transactions are neither
committed nor aborted. We should take care of these cases by committing the
transactions. Otherwise, there would left
I meant master_request (not query). We set refmap _before_
asserting. So that should not happen.
On 10/12/2011 06:02 PM, Wengang Wang wrote:
Hi Sunil,
On 11-10-12 17:32, Sunil Mushran wrote:
So you are saying a lockres can get purged before the node is asserting
master to other nodes
That's what ovm does. Have you reproduced it with ovm3 kernel?
On 10/12/2011 07:07 PM, Wengang Wang wrote:
On 11-10-13 09:51, Wengang Wang wrote:
On 11-10-12 18:47, Sunil Mushran wrote:
I meant master_request (not query). We set refmap _before_
asserting. So that should not happen.
Why can't
The first two are ok. Have a comment for the last one.
On 09/25/2011 02:13 AM, Wengang Wang wrote:
Commit transactions in error cases.
Three cases were found where, on error, journal transactions are neither
committed nor aborted. We should take care of these cases by committing the
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock
to protect operations on dlm->tracking_list. But it was still using the
older lock (dlm->spin_lock) to add new resources to the list.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/dlm/dlmmaster.c |4
It's in the q. Check jlbec's tree on oss.oracle.com.
On 09/29/2011 06:19 AM, Jeff Liu wrote:
Hello,
I'd like to know whether anyone has started to write a patch for adding
SEEK_DATA/SEEK_HOLE support to OCFS2 yet?
If nobody is working on it now, I'd like to implement it, actually, I have
already
Acked-by: Sunil Mushran sunil.mush...@oracle.com
On 09/22/2011 05:49 PM, Wengang Wang wrote:
When the lockres state UPCONVERT_FINISHING is cleared,
we should wake up the downconvert thread in case that lockres
is in the blocked queue. Currently we are not doing so and thus
are at the mercy
/cache: 0MB / 0MB, write: 0MB, rate: 0.00MB/s
Times real: 33.382s, user: 33.374s, sys: 0.001s
On 09/16/2011 02:25 PM, Sunil Mushran wrote:
I have been playing with fsck.ocfs2. Performance-wise. Have some
interesting
I have been playing with fsck.ocfs2. Performance-wise. Have some
interesting numbers to share.
This volume is 2T in size with 1.5 million files. Many exploded
kernels trees + some large files. The particulars are listed below.
I did 3 runs.
The first set of numbers are vanilla fsck.
In the
I am fine with the kick in recover from dlm error. Not so in cluster lock.
We have to be very very sure before meddling with that function. It is
a state machine with many hidden gotchas.
So is this patch for a bug encountered or just a code audit? Also, what kind
of testing has been done?
On
http://people.redhat.com/~teigland/make_panic
This test has been useful in exposing dlmglue issues.
On 09/15/2011 10:15 AM, Sunil Mushran wrote:
I am fine with the kick in recover from dlm error. Not so in cluster lock.
We have to be very very sure before meddling with that function
On 09/15/2011 05:42 PM, Wengang Wang wrote:
Hi Sunil,
On 11-09-15 10:21, Sunil Mushran wrote:
http://people.redhat.com/~teigland/make_panic
This test has been useful in exposing dlmglue issues.
On 09/15/2011 10:15 AM, Sunil Mushran wrote:
I am fine with the kick in recover from dlm error
A better description would be:
=
When the lockres state UPCONVERT_FINISHING is cleared,
we should wake up the downconvert thread in case that lockres
is in the blocked queue. Currently we are not doing so and thus
are at the mercy of another event waking up the
On 09/09/2011 03:22 PM, Goldwyn Rodrigues wrote:
Hi,
I finally got back to improve the recovery procedure by offloading
work to work queues. However, I would like to know if we can
completely do away with ocfs2rec kthread. The process would just mark
the nodes which need recovery and offload
On 09/07/2011 10:57 AM, Joel Becker wrote:
On Wed, Sep 07, 2011 at 10:10:43AM -0700, Sunil Mushran wrote:
All,
So the patches added to allow mounting volumes >16TB have a problem.
The feature check of the jbd2 superblock is being done before the
jbd2 superblock is actually read.
It is being
Thanks. I'll fix the two.
On 08/25/2011 06:01 PM, Dan Carpenter wrote:
Hello Sunil Mushran,
1dfecf810e0e: ocfs2/cluster: Clean up messages in o2net
Leads to the following Smatch complaint:
fs/ocfs2/cluster/tcp.c +1704 o2net_start_connect(101)
error: we previously assumed 'sc' could
Comments inlined.
BTW, how commonplace is this race in your testing? If you can
answer that, I would like to also know how you arrived at it.
On 08/25/2011 07:50 PM, Wengang Wang wrote:
There is a race between 2(+) nodes that call iput_final() on the same inode.
time sequence is like the
On 08/21/2011 09:23 PM, Joel Becker wrote:
On Thu, Jul 14, 2011 at 03:50:11PM -0700, Sunil Mushran wrote:
Joel,
Hopefully this is the final drop before the 3.1 merge window. I had posted
most
of the patches before. In fact the first 12 are exactly the same. The 13th
one
is modified
On 07/14/2011 02:04 AM, Tao Ma wrote:
On 07/14/2011 03:47 PM, r.giord...@libero.it wrote:
Hello,
this is my scenario:
1)I've created a Pacemaker cluster with the following ocfs package on
opensuse
11.3 64bit
ocfs2console-1.8.0-2.1.x86_64
ocfs2-tools-o2cb-1.8.0-2.1.x86_64
The o2hb debugfs file, elapsed_time_in_ms, should return values only after the
timer is armed at least once.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/cluster/heartbeat.c |9 ++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/cluster
Convert useful messages from ML_NOTICE to KERN_NOTICE to improve readability.
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
fs/ocfs2/journal.c |9 ++---
fs/ocfs2/quota_local.c | 13 +
fs/ocfs2/slot_map.c|4 ++--
fs/ocfs2/super.c | 10