Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote: One way to skip a lockres in the purgelist is to list_del_init() and list_add_tail(). That simplifies the patch a lot. I have attached a quick dirty patch. See if that satisfies all the requirements. As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? Joel -- All alone at the end of the evening When the bright lights have faded to blue. I was thinking about a woman who had loved me And I never knew Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127 ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On 6/17/2010 1:32 AM, Joel Becker wrote: On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote: One way to skip a lockres in the purgelist is to list_del_init() and list_add_tail(). That simplifies the patch a lot. I have attached a quick dirty patch. See if that satisfies all the requirements. As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? right. that's why I didn't want to move used lockres to tail :) Joel
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On 10-06-17 01:53, Srinivas Eeda wrote: On 6/15/2010 11:06 PM, Wengang Wang wrote: still the question. If you have sent a DEREF request to the master, and the lockres became in-use again, then the lockres remains in the hash table and also in the purge list. So Yes, that's a possibility. But there is not much we could do to cover that window other than making the non-master nodes avoid such races. Patch 2/2 fixes one such race. Yes, we should make non-master nodes avoid such races. But you only fixed one of those races with patch 2/2 :). And we probably can't be sure how many such races there are. A known case is dlm_mig_lockres_handler(). We drop dlm->spinlock and res->spinlock after dlm_lookup_lockres(). Here the DROPPING_REF state can be set in dlm_thread(). dlm_thread() then drops the spinlocks and sends the deref message. Before dlm_thread() takes the spinlocks back, dlm_mig_lockres_handler() continues, and a lock (or locks) is added to the lockres. The lockres becomes in use then. dlm_thread() takes back the spinlocks, finds the lockres in use, and keeps it in the hash table and in the purge list. Your patch 2/2 fixes that problem well. So far I have no good idea how to fix all such potential races. regards, wengang. 1) If this node is the last ref, there is a possibility that the master purged the lockres after receiving the DEREF request from this node. In this case, when this node does dlmlock_remote(), the lockres won't be found on the master. How to deal with it? patch 2/2 fixes this race. dlm_get_lock_resource will either wait for the lockres to get purged and start everything fresh, or mark the lockres in use so dlm_thread won't purge it. 2) The lockres on this node is going to be purged again, which means it will send secondary DEREFs to the master. This is not good I think. right, not a good idea to send deref again. We have to fix those cases. A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after sending a DEREF request against this lockres.
Also redo master request before locking on it. if you are referring to the hole in dlmlock_remote, patch 2/2 fixes it. Please review that patch and let me know :) Regards, wengang.
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On 06/17/2010 01:35 AM, Srinivas Eeda wrote: On 6/17/2010 1:32 AM, Joel Becker wrote: On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote: One way to skip a lockres in the purgelist is to list_del_init() and list_add_tail(). That simplifies the patch a lot. I have attached a quick dirty patch. See if that satisfies all the requirements. As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? right. that's why I didn't want to move used lockres to tail :) No. There is no need for time ordering. Or, am I missing something? We delay the purge in case the file system changes its mind and wants to reuse it. By delaying, we hold onto the mastery information. That's it.
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On 6/17/2010 7:48 AM, Sunil Mushran wrote: On 06/17/2010 01:35 AM, Srinivas Eeda wrote: On 6/17/2010 1:32 AM, Joel Becker wrote: On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote: One way to skip a lockres in the purgelist is to list_del_init() and list_add_tail(). That simplifies the patch a lot. I have attached a quick dirty patch. See if that satisfies all the requirements. As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? right. that's why I didn't want to move used lockres to tail :) No. There is no need for time ordering. Or, am I missing something? We delay the purge in case the file system changes its mind and wants to reuse it. By delaying, we hold onto the mastery information. That's it. So, should we preserve this, or purge the lockres when it's on the purge list? Or we could just update the last_used and move it to the end of the list if it can be purged.
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
Sunil, as of now, there is still a window in dlm_get_lock_resource, where it finds the lockres but doesn't protect it from getting purged. The second patch fixes this by marking it in_use; can you please review that one as well. Thanks, --Srini On 6/17/2010 8:06 AM, Sunil Mushran wrote: On 06/15/2010 11:06 PM, Wengang Wang wrote: still the question. If you have sent a DEREF request to the master, and the lockres became in-use again, then the lockres remains in the hash table and also in the purge list. So 1) If this node is the last ref, there is a possibility that the master purged the lockres after receiving the DEREF request from this node. In this case, when this node does dlmlock_remote(), the lockres won't be found on the master. How to deal with it? 2) The lockres on this node is going to be purged again, which means it will send secondary DEREFs to the master. This is not good I think. A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after sending a DEREF request against this lockres. Also redo master request before locking on it. The fix we are working towards is to ensure that we set DLM_LOCK_RES_DROPPING_REF once we are determined to purge the lockres. As in, we should not let go of the spinlock before we have either set the flag or decided against purging that resource. Once the flag is set, new users looking up the resource via dlm_get_lock_resource() will notice the flag and will then wait for that flag to be cleared before looking up the lockres hash again. If all goes well, the lockres will not be found (because it has since been unhashed) and it will be forced to go thru the full mastery process.
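The set-the-flag-before-dropping-the-spinlock scheme described here can be sketched as a toy model. Everything below is illustrative: the struct is a heavy reduction of dlm_lock_resource, and purge_lockres()/lookup_lockres() are hypothetical stand-ins for the kernel paths, with the spinlocks reduced to comments.

```c
#include <assert.h>

#define DLM_LOCK_RES_DROPPING_REF 0x0040

struct lockres {
    int state;
    int hashed;   /* still visible in the lockres hash? */
};

/* Purge decision made atomically: either DROPPING_REF is set before the
 * (conceptual) spinlock is dropped, or we decide against purging at all. */
static void purge_lockres(struct lockres *res, int still_unused)
{
    /* spin_lock(&res->spinlock); */
    if (!still_unused)
        return;                          /* decided against purging */
    res->state |= DLM_LOCK_RES_DROPPING_REF;
    /* spin_unlock(&res->spinlock);  ...send DEREF to the master... */
    res->hashed = 0;                     /* unhash once the deref is done */
    res->state &= ~DLM_LOCK_RES_DROPPING_REF;
}

/* Lookup path: a caller that sees DROPPING_REF must back off, wait for
 * the flag to clear, and then redo the hash lookup. */
static int lookup_lockres(const struct lockres *res)
{
    if (res->state & DLM_LOCK_RES_DROPPING_REF)
        return -1;                       /* wait and retry the lookup */
    return res->hashed ? 0 : -2;         /* -2: gone, redo full mastery */
}
```

The point of the model is the ordering: a lookup can never observe "purge in progress" without also being told to retry, so it either gets the lockres before the purge decision or re-masters from scratch.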
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On Thu, Jun 17, 2010 at 07:48:38AM -0700, Sunil Mushran wrote: On 06/17/2010 01:35 AM, Srinivas Eeda wrote: On 6/17/2010 1:32 AM, Joel Becker wrote: As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? right. that's why I didn't want to move used lockres to tail :) No. There is no need for time ordering. Or, am I missing something? We delay the purge in case the file system changes its mind and wants to reuse it. By delaying, we hold onto the mastery information. That's it. The comment is right there:

/* Since resources are added to the purge list
 * in tail order, we can stop at the first
 * unpurgable resource -- anyone added after
 * him will have a greater last_used value */
break;

So we're going to break out of this loop as soon as we see a later time. Anything you've moved to the back will be ignored until enough time passes that the things in front of it are candidates for purge. You could, of course, just change the 'break' to a 'continue' and find those, but I kind of like the short-circuit behavior. I don't know how long this list is, but walking a whole bunch of not-yet-purgable lockreses just to hopefully find one at the end seems silly. I think it is better to leave the busy ones at the front of the list and just walk past them if they can't be dealt with now. Joel -- Life's Little Instruction Book #157 Take time to smell the roses. Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127
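The short-circuit Joel is defending can be shown in a few lines. This is a hypothetical reduction, not dlm_run_purge_list(): a plain array stands in for the kernel list_head, and PURGE_INTERVAL is an arbitrary constant.

```c
#include <assert.h>

#define PURGE_INTERVAL 8

struct entry {
    unsigned long last_used;
    int purged;
};

/* Purge every entry whose last_used + PURGE_INTERVAL has passed.
 * Because entries are appended in time order, the first "too new"
 * entry lets us break instead of scanning the whole list -- exactly
 * why moving a busy lockres to the tail would hide it until everything
 * ahead of it ages out. */
static int run_purge_list(struct entry *list, int n, unsigned long now)
{
    int purged = 0;
    for (int i = 0; i < n; i++) {
        if (now < list[i].last_used + PURGE_INTERVAL)
            break;                /* everyone after this is newer still */
        list[i].purged = 1;
        purged++;
    }
    return purged;
}
```

Changing that break to a continue would find entries moved to the tail early, at the cost of walking the whole list every pass.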
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
We don't have to update the last_used to move it to the tail. If it is now in use, it means it is about to be taken out of the purgelist. We just need to move it out of our way for the time being. On Jun 17, 2010, at 9:55 AM, Srinivas Eeda srinivas.e...@oracle.com wrote: On 6/17/2010 7:48 AM, Sunil Mushran wrote: On 06/17/2010 01:35 AM, Srinivas Eeda wrote: On 6/17/2010 1:32 AM, Joel Becker wrote: On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote: One way to skip a lockres in the purgelist is to list_del_init() and list_add_tail(). That simplifies the patch a lot. I have attached a quick dirty patch. See if that satisfies all the requirements. As far as I can tell from reading the code, the time_after() check is because they are time ordered. Wouldn't moving it to the end violate that? right. that's why I didn't want to move used lockres to tail :) No. There is no need for time ordering. Or, am I missing something? We delay the purge in case the file system changes its mind and wants to reuse it. By delaying, we hold onto the mastery information. That's it. So, should we preserve this, or purge the lockres when it's on the purge list? Or we could just update the last_used and move it to the end of the list if it can be purged.
Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: check dlm_state under spinlock
On 06/15/2010 09:08 PM, Wengang Wang wrote: We should check dlm->dlm_state under dlm->spinlock though maybe in this case it doesn't hurt. NAK. dlm->dlm_state is protected by dlm_domain_lock, which is held at that time.

/* NOTE: Next three are protected by dlm_domain_lock */
struct kref dlm_refs;
enum dlm_ctxt_state dlm_state;
unsigned int num_joins;
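The locking rule being quoted ("this field is covered by that lock, not the other one") can be made checkable in a tiny model. The struct and accessor below are hypothetical, with the two spinlocks reduced to held-flags plus assertions.

```c
#include <assert.h>

enum dlm_ctxt_state { DLM_CTXT_NEW, DLM_CTXT_JOINED, DLM_CTXT_LEAVING };

struct dlm_ctxt {
    int domain_lock_held;           /* stands in for dlm_domain_lock */
    int spinlock_held;              /* stands in for dlm->spinlock */
    enum dlm_ctxt_state dlm_state;  /* protected by dlm_domain_lock only */
};

/* Correct accessor: asserts the protecting lock is held.  Holding
 * dlm->spinlock instead would not make this read safe. */
static enum dlm_ctxt_state get_dlm_state(const struct dlm_ctxt *dlm)
{
    assert(dlm->domain_lock_held);
    return dlm->dlm_state;
}
```

Kernel code expresses the same discipline with lockdep annotations (assert_spin_locked and friends) rather than hand-rolled flags.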
Re: [Ocfs2-devel] [PATCH 2/2] ocfs2: o2dlm fix race in purge lockres and newlock (orabug 9094491)
This patch looks ok on the surface. Should be usable. But checking it into the repo is another matter. My problem is that this flag is very much like inflight_locks but is not the same. inflight_locks are taken only on lockres' that are mastered by the node. This new flag does the same but for non-mastered locks too. A better solution will be to merge the two. And that means cleaning up inflight_locks. The reason for the NAK for check-in is that the code is adding more messiness to an area that is already very messy. Sunil On 06/15/2010 09:43 PM, Srinivas Eeda wrote: This patch fixes the following hole. dlmlock tries to create a new lock on a lockres that is on the purge list. It calls dlm_get_lock_resource and later adds a lock to the blocked list. But in this window, dlm_thread can purge the lockres and unhash it. This will cause a BUG, as when the AST comes back from the master the lockres is not found. This patch marks the lockres with a new state, DLM_LOCK_RES_IN_USE, which protects the lockres from dlm_thread purging it.
Signed-off-by: Srinivas Eeda <srinivas.e...@oracle.com>
---
 fs/ocfs2/dlm/dlmcommon.h |    1 +
 fs/ocfs2/dlm/dlmlock.c   |    4
 fs/ocfs2/dlm/dlmmaster.c |    5 -
 fs/ocfs2/dlm/dlmthread.c |    1 +
 4 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
index 0102be3..6cf3c08 100644
--- a/fs/ocfs2/dlm/dlmcommon.h
+++ b/fs/ocfs2/dlm/dlmcommon.h
@@ -279,6 +279,7 @@ static inline void __dlm_set_joining_node(struct dlm_ctxt *dlm,
 #define DLM_LOCK_RES_DIRTY                0x0008
 #define DLM_LOCK_RES_IN_PROGRESS          0x0010
 #define DLM_LOCK_RES_MIGRATING            0x0020
+#define DLM_LOCK_RES_IN_USE               0x0100
 #define DLM_LOCK_RES_DROPPING_REF         0x0040
 #define DLM_LOCK_RES_BLOCK_DIRTY          0x1000
 #define DLM_LOCK_RES_SETREF_INPROG        0x2000
diff --git a/fs/ocfs2/dlm/dlmlock.c b/fs/ocfs2/dlm/dlmlock.c
index 777..501ac40 100644
--- a/fs/ocfs2/dlm/dlmlock.c
+++ b/fs/ocfs2/dlm/dlmlock.c
@@ -134,6 +134,8 @@ static enum dlm_status dlmlock_master(struct dlm_ctxt *dlm,
 	if (status != DLM_NORMAL &&
 	    lock->ml.node != dlm->node_num) {
 		/* erf. state changed after lock was dropped. */
+		/* DLM_LOCK_RES_IN_USE is set in dlm_get_lock_resource */
+		res->state &= ~DLM_LOCK_RES_IN_USE;
 		spin_unlock(&res->spinlock);
 		dlm_error(status);
 		return status;
@@ -180,6 +182,7 @@ static enum dlm_status dlmlock_master(struct dlm_ctxt *dlm,
 			kick_thread = 1;
 		}
 	}
+	res->state &= ~DLM_LOCK_RES_IN_USE;
 	/* reduce the inflight count, this may result in the lockres
 	 * being purged below during calc_usage */
 	if (lock->ml.node == dlm->node_num)
@@ -246,6 +249,7 @@ static enum dlm_status dlmlock_remote(struct dlm_ctxt *dlm,
 
 	spin_lock(&res->spinlock);
 	res->state &= ~DLM_LOCK_RES_IN_PROGRESS;
+	res->state &= ~DLM_LOCK_RES_IN_USE;
 	lock->lock_pending = 0;
 	if (status != DLM_NORMAL) {
 		if (status == DLM_RECOVERING &&
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 9289b43..f0f2d97 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -719,6 +719,7 @@ lookup:
 	if (tmpres) {
 		int dropping_ref = 0;
 
+		tmpres->state |= DLM_LOCK_RES_IN_USE;
 		spin_unlock(&dlm->spinlock);
 
 		spin_lock(&tmpres->spinlock);
@@ -731,8 +732,10 @@ lookup:
 		if (tmpres->owner == dlm->node_num) {
 			BUG_ON(tmpres->state & DLM_LOCK_RES_DROPPING_REF);
 			dlm_lockres_grab_inflight_ref(dlm, tmpres);
-		} else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF)
+		} else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF) {
+			tmpres->state &= ~DLM_LOCK_RES_IN_USE;
 			dropping_ref = 1;
+		}
 		spin_unlock(&tmpres->spinlock);
 
 		/* wait until done messaging the master, drop our ref to allow
diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
index fb0be6c..2b82e0b 100644
--- a/fs/ocfs2/dlm/dlmthread.c
+++ b/fs/ocfs2/dlm/dlmthread.c
@@ -93,6 +93,7 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
 int __dlm_lockres_unused(struct dlm_lock_resource *res)
 {
 	if (!__dlm_lockres_has_locks(res) &&
+	    !(res->state & DLM_LOCK_RES_IN_USE) &&
 	    (list_empty(&res->dirty) && !(res->state & DLM_LOCK_RES_DIRTY))) {
 		/* try not to scan the bitmap unless the first two
 		 * conditions are already true */
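Pulling the dlmthread.c change out into a standalone predicate makes the new rule easy to see. The flag values below are the ones from the patch; the struct is a hypothetical reduction of dlm_lock_resource (a lock count and a boolean stand in for the granted/converting/blocked and dirty lists).

```c
#include <assert.h>

#define DLM_LOCK_RES_DIRTY   0x0008
#define DLM_LOCK_RES_IN_USE  0x0100

struct lockres {
    int state;
    int nr_locks;       /* stands in for granted/converting/blocked lists */
    int on_dirty_list;  /* stands in for list_empty(&res->dirty) == 0 */
};

/* A lockres is purgeable only if it holds no locks, is not marked
 * IN_USE (the new condition the patch adds), and is not dirty. */
static int lockres_unused(const struct lockres *res)
{
    return res->nr_locks == 0 &&
           !(res->state & DLM_LOCK_RES_IN_USE) &&
           !res->on_dirty_list &&
           !(res->state & DLM_LOCK_RES_DIRTY);
}
```

With this shape it is clear why the IN_USE flag closes the window: between dlm_get_lock_resource() setting the flag and dlmlock_master()/dlmlock_remote() clearing it, the lockres can have zero locks and still be non-purgeable.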
Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist
On 10-06-17 08:06, Sunil Mushran wrote: On 06/15/2010 11:06 PM, Wengang Wang wrote: still the question. If you have sent a DEREF request to the master, and the lockres became in-use again, then the lockres remains in the hash table and also in the purge list. So 1) If this node is the last ref, there is a possibility that the master purged the lockres after receiving the DEREF request from this node. In this case, when this node does dlmlock_remote(), the lockres won't be found on the master. How to deal with it? 2) The lockres on this node is going to be purged again, which means it will send secondary DEREFs to the master. This is not good I think. A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after sending a DEREF request against this lockres. Also redo master request before locking on it. The fix we are working towards is to ensure that we set DLM_LOCK_RES_DROPPING_REF once we are determined to purge the lockres. As in, we should not let go of the spinlock before we have either set the flag or decided against purging that resource. Once the flag is set, new users looking up the resource via dlm_get_lock_resource() will notice the flag and will then wait for that flag to be cleared before looking up the lockres hash again. If all goes well, the lockres will not be found (because it has since been unhashed) and it will be forced to go thru the full mastery process. That is ideal. But in many cases the lockres is not obtained via dlm_get_lock_resource(), which would set the new IN_USE state, but directly via dlm_lookup_lockres()/__dlm_lookup_lockres(). dlm_lookup_lockres() takes and drops dlm->spinlock, and some callers of __dlm_lookup_lockres() drop the spinlock as soon as they get the lockres. Such paths access the lockres later, after dropping dlm->spinlock and res->spinlock, so there is a window in which dlm_thread() gets a chance to take dlm->spinlock and res->spinlock and set the DROPPING_REF state.
So whether new users can get the lockres safely depends on the timing. If a user finds the lockres after the DROPPING_REF state is set, sure, it works well. But if it finds it before DROPPING_REF is set, that doesn't protect the lockres from purging: even though it got the lockres, the lockres can still be in the unused state. regards, wengang.
Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: check dlm_state under spinlock
On 10-06-17 18:35, Sunil Mushran wrote: On 06/15/2010 09:08 PM, Wengang Wang wrote: We should check dlm->dlm_state under dlm->spinlock though maybe in this case it doesn't hurt. NAK. dlm->dlm_state is protected by dlm_domain_lock, which is held at that time. /* NOTE: Next three are protected by dlm_domain_lock */ struct kref dlm_refs; enum dlm_ctxt_state dlm_state; unsigned int num_joins; Yes, that's true. thanks.
[Ocfs2-devel] [PATCH 0/2] ocfs2: make xattr work with new local alloc reservation.
Hi all, The old xattr code assumes that we can always allocate enough contiguous clusters from the allocator, which held with the old local alloc allocator. But now that Mark has added reservation code to ocfs2, that is no longer a good assumption. So these 2 patches try to let xattr work with it. Regards, Tao
[Ocfs2-devel] [PATCH] ocfs2: Make xattr reflink work with new local alloc reservation.
The new reservation code in local alloc has added the limitation that the caller should handle the case where the local alloc doesn't give us enough contiguous clusters. It makes the old xattr reflink code broken. So this patch updates the xattr reflink code so that it can handle the case where local alloc gives us one cluster at a time.

Signed-off-by: Tao Ma <tao...@oralce.com>
---
 fs/ocfs2/xattr.c |  125 ++---
 1 files changed, 89 insertions(+), 36 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 894734b..bc52840 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -6805,16 +6805,15 @@ out:
 	return ret;
 }
 
-static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+static int ocfs2_reflink_xattr_bucket(handle_t *handle,
 				u64 blkno, u64 new_blkno, u32 clusters,
+				u32 *cpos, int num_buckets,
 				struct ocfs2_alloc_context *meta_ac,
 				struct ocfs2_alloc_context *data_ac,
 				struct ocfs2_reflink_xattr_tree_args *args)
 {
 	int i, j, ret = 0;
 	struct super_block *sb = args->reflink->old_inode->i_sb;
-	u32 bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(sb));
-	u32 num_buckets = clusters * bpc;
 	int bpb = args->old_bucket->bu_blocks;
 	struct ocfs2_xattr_value_buf vb = {
 		.vb_access = ocfs2_journal_access,
@@ -6833,14 +6832,6 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			break;
 		}
 
-		/*
-		 * The real bucket num in this series of blocks is stored
-		 * in the 1st bucket.
-		 */
-		if (i == 0)
-			num_buckets = le16_to_cpu(
-				bucket_xh(args->old_bucket)->xh_num_buckets);
-
 		ret = ocfs2_xattr_bucket_journal_access(handle,
 						args->new_bucket,
 						OCFS2_JOURNAL_ACCESS_CREATE);
@@ -6854,6 +6845,18 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			       bucket_block(args->old_bucket, j),
 			       sb->s_blocksize);
 
+		/*
+		 * Record the start cpos so that we can use it to initialize
+		 * our xattr tree we also set the xh_num_bucket for the new
+		 * bucket.
+		 */
+		if (i == 0) {
+			*cpos = le32_to_cpu(bucket_xh(args->new_bucket)->
+					    xh_entries[0].xe_name_hash);
+			bucket_xh(args->new_bucket)->xh_num_buckets =
+				cpu_to_le16(num_buckets);
+		}
+
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
 
 		ret = ocfs2_reflink_xattr_header(handle, args->reflink,
@@ -6883,6 +6886,7 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 		}
 
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
+		ocfs2_xattr_bucket_relse(args->old_bucket);
 		ocfs2_xattr_bucket_relse(args->new_bucket);
 	}
 
@@ -6891,6 +6895,74 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 	ocfs2_xattr_bucket_relse(args->new_bucket);
 	return ret;
 }
+
+static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+				struct inode *inode,
+				struct ocfs2_reflink_xattr_tree_args *args,
+				struct ocfs2_extent_tree *et,
+				struct ocfs2_alloc_context *meta_ac,
+				struct ocfs2_alloc_context *data_ac,
+				u64 blkno, u32 cpos, u32 len)
+{
+	int ret, num_buckets, reflink_buckets, first_inserted = 0;
+	u32 p_cluster, num_clusters, reflink_cpos = 0;
+	u64 new_blkno;
+	int bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb));
+
+	ret = ocfs2_read_xattr_bucket(args->old_bucket, blkno);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+	num_buckets = le16_to_cpu(bucket_xh(args->old_bucket)->xh_num_buckets);
+	ocfs2_xattr_bucket_relse(args->old_bucket);
+
+	while (len && num_buckets) {
+		ret = ocfs2_claim_clusters(handle, data_ac,
+					   1, &p_cluster, &num_clusters);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+
+		new_blkno = ocfs2_clusters_to_blocks(inode->i_sb, p_cluster);
+		reflink_buckets = num_buckets < bpc * num_clusters ?
+			num_buckets : bpc * num_clusters;
+
+		ret = ocfs2_reflink_xattr_bucket(handle,
[Ocfs2-devel] [PATCH v2] ocfs2: Make xattr reflink work with new local alloc reservation.
On 06/18/2010 11:02 AM, Tao Ma wrote: The new reservation code in local alloc has added the limitation that the caller should handle the case where the local alloc doesn't give us enough contiguous clusters. It makes the old xattr reflink code broken. So this patch updates the xattr reflink code so that it can handle the case where local alloc gives us one cluster at a time. Signed-off-by: Tao Ma <tao...@oralce.com> oh, I used the wrong sob here. ;) Here is the updated one. Regards, Tao

From 5b714e5a16c6ff912fcd720b916d4324f6151d16 Mon Sep 17 00:00:00 2001
From: Tao Ma <tao...@oracle.com>
Date: Fri, 18 Jun 2010 10:38:48 +0800
Subject: [PATCH v2] ocfs2: Make xattr reflink work with new local alloc reservation.

The new reservation code in local alloc has added the limitation that the caller should handle the case where the local alloc doesn't give us enough contiguous clusters. It makes the old xattr reflink code broken. So this patch updates the xattr reflink code so that it can handle the case where local alloc gives us one cluster at a time.

Signed-off-by: Tao Ma <tao...@oracle.com>
---
 fs/ocfs2/xattr.c |  125 ++---
 1 files changed, 89 insertions(+), 36 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 894734b..bc52840 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -6805,16 +6805,15 @@ out:
 	return ret;
 }
 
-static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+static int ocfs2_reflink_xattr_bucket(handle_t *handle,
 				u64 blkno, u64 new_blkno, u32 clusters,
+				u32 *cpos, int num_buckets,
 				struct ocfs2_alloc_context *meta_ac,
 				struct ocfs2_alloc_context *data_ac,
 				struct ocfs2_reflink_xattr_tree_args *args)
 {
 	int i, j, ret = 0;
 	struct super_block *sb = args->reflink->old_inode->i_sb;
-	u32 bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(sb));
-	u32 num_buckets = clusters * bpc;
 	int bpb = args->old_bucket->bu_blocks;
 	struct ocfs2_xattr_value_buf vb = {
 		.vb_access = ocfs2_journal_access,
@@ -6833,14 +6832,6 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			break;
 		}
 
-		/*
-		 * The real bucket num in this series of blocks is stored
-		 * in the 1st bucket.
-		 */
-		if (i == 0)
-			num_buckets = le16_to_cpu(
-				bucket_xh(args->old_bucket)->xh_num_buckets);
-
 		ret = ocfs2_xattr_bucket_journal_access(handle,
 						args->new_bucket,
 						OCFS2_JOURNAL_ACCESS_CREATE);
@@ -6854,6 +6845,18 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			       bucket_block(args->old_bucket, j),
 			       sb->s_blocksize);
 
+		/*
+		 * Record the start cpos so that we can use it to initialize
+		 * our xattr tree we also set the xh_num_bucket for the new
+		 * bucket.
+		 */
+		if (i == 0) {
+			*cpos = le32_to_cpu(bucket_xh(args->new_bucket)->
+					    xh_entries[0].xe_name_hash);
+			bucket_xh(args->new_bucket)->xh_num_buckets =
+				cpu_to_le16(num_buckets);
+		}
+
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
 
 		ret = ocfs2_reflink_xattr_header(handle, args->reflink,
@@ -6883,6 +6886,7 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 		}
 
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
+		ocfs2_xattr_bucket_relse(args->old_bucket);
 		ocfs2_xattr_bucket_relse(args->new_bucket);
 	}
 
@@ -6891,6 +6895,74 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 	ocfs2_xattr_bucket_relse(args->new_bucket);
 	return ret;
 }
+
+static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+				struct inode *inode,
+				struct ocfs2_reflink_xattr_tree_args *args,
+				struct ocfs2_extent_tree *et,
+				struct ocfs2_alloc_context *meta_ac,
+				struct ocfs2_alloc_context *data_ac,
+				u64 blkno, u32 cpos, u32 len)
+{
+	int ret, num_buckets, reflink_buckets, first_inserted = 0;
+	u32 p_cluster, num_clusters, reflink_cpos = 0;
+	u64 new_blkno;
+	int bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb));
+
+	ret = ocfs2_read_xattr_bucket(args->old_bucket, blkno);
+	if (ret) {
+		mlog_errno(ret);
+
[Ocfs2-devel] [PATCH] ocfs2: make xattr extension work with new local alloc reservation.
New local alloc reservation can give us fewer clusters than we want and ask us to restart the transaction, so let ocfs2_xattr_extend_allocation restart the transaction in this case.

Signed-off-by: Tao Ma <tao...@oracle.com>
---
 fs/ocfs2/xattr.c |   33 +
 1 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index e97b348..894734b 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -709,7 +709,7 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 					 struct ocfs2_xattr_value_buf *vb,
 					 struct ocfs2_xattr_set_ctxt *ctxt)
 {
-	int status = 0;
+	int status = 0, credits;
 	handle_t *handle = ctxt->handle;
 	enum ocfs2_alloc_restarted why;
 	u32 prev_clusters, logical_start = le32_to_cpu(vb->vb_xv->xr_clusters);
@@ -719,6 +719,7 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 
 	ocfs2_init_xattr_value_extent_tree(&et, INODE_CACHE(inode), vb);
 
+restarted_transaction:
 	status = vb->vb_access(handle, INODE_CACHE(inode), vb->vb_bh,
 			       OCFS2_JOURNAL_ACCESS_WRITE);
 	if (status < 0) {
@@ -735,8 +736,9 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 					     ctxt->data_ac,
 					     ctxt->meta_ac,
 					     &why);
-	if (status < 0) {
-		mlog_errno(status);
+	if ((status < 0) && (status != -EAGAIN)) {
+		if (status != -ENOSPC)
+			mlog_errno(status);
 		goto leave;
 	}
 
@@ -744,11 +746,26 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 	clusters_to_add -= le32_to_cpu(vb->vb_xv->xr_clusters) - prev_clusters;
 
-	/*
-	 * We should have already allocated enough space before the transaction,
-	 * so no need to restart.
-	 */
-	BUG_ON(why != RESTART_NONE || clusters_to_add);
+	if (why != RESTART_NONE && clusters_to_add) {
+		/*
+		 * We can only fail in case the alloc file doesn't give
+		 * up enough clusters.
+		 */
+		BUG_ON(why == RESTART_META);
+
+		mlog(0, "restarting xattr value extension for %u clusters,.\n",
+		     clusters_to_add);
+		credits = ocfs2_calc_extend_credits(inode->i_sb,
+						    &vb->vb_xv->xr_list,
+						    clusters_to_add);
+		status = ocfs2_extend_trans(handle, credits);
+		if (status < 0) {
+			status = -ENOMEM;
+			mlog_errno(status);
+			goto leave;
+		}
+		goto restarted_transaction;
+	}
 
 leave:
-- 
1.5.5
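The control flow the patch introduces (allocate, get fewer clusters than asked, extend the transaction, retry) can be simulated outside the kernel. Everything here is an illustrative stand-in: claim_clusters() models a local alloc that hands out one cluster per call, and extend_allocation() plays the role of the restarted_transaction loop; neither is the real ocfs2 API.

```c
#include <assert.h>

/* Fake allocator: never gives more than one cluster per call,
 * mimicking the worst case the new reservation code allows. */
static int claim_clusters(unsigned int wanted, unsigned int *given)
{
    *given = wanted ? 1 : 0;
    return 0;
}

/* Keep claiming until clusters_to_add reaches zero; each shortfall
 * counts as one transaction restart (where the real code would call
 * ocfs2_extend_trans() with freshly computed credits). */
static int extend_allocation(unsigned int clusters_to_add,
                             unsigned int *restarts)
{
    *restarts = 0;
    while (clusters_to_add) {
        unsigned int got = 0;
        if (claim_clusters(clusters_to_add, &got))
            return -1;                 /* allocator error */
        clusters_to_add -= got;
        if (clusters_to_add)
            (*restarts)++;             /* extend credits, goto restart */
    }
    return 0;
}
```

Asking for N clusters from this one-at-a-time allocator costs N-1 restarts, which is exactly why the old BUG_ON(why != RESTART_NONE || clusters_to_add) had to go.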