Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Joel Becker
On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote:
 One way to skip a lockres in the purgelist is to list_del_init() and
 list_add_tail(). That simplifies the patch a lot.
 
 I have attached a quick & dirty patch. See if that satisfies all the
 requirements.
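
(Illustrative sketch of that idea only, not the attached patch; it assumes
the purge-list walker holds dlm->spinlock and has just found the lockres
busy:)

	if (!__dlm_lockres_unused(lockres)) {
		/* busy: rotate it to the tail so the walk can move on
		 * to the next candidate instead of stopping here */
		list_del_init(&lockres->purge);
		list_add_tail(&lockres->purge, &dlm->purge_list);
		continue;
	}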

As far as I can tell from reading the code, the time_after()
check is because they are time ordered.  Wouldn't moving it to the end
violate that?

Joel

-- 

All alone at the end of the evening
 When the bright lights have faded to blue.
 I was thinking about a woman who had loved me
 And I never knew

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127



Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Srinivas Eeda

On 6/17/2010 1:32 AM, Joel Becker wrote:
 On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote:
  One way to skip a lockres in the purgelist is to list_del_init() and
  list_add_tail(). That simplifies the patch a lot.

  I have attached a quick & dirty patch. See if that satisfies all the
  requirements.

 As far as I can tell from reading the code, the time_after()
 check is because they are time ordered.  Wouldn't moving it to the end
 violate that?

right. that's why I didn't want to move used lockres to tail :)

 Joel

Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Wengang Wang
On 10-06-17 01:53, Srinivas Eeda wrote:
 On 6/15/2010 11:06 PM, Wengang Wang wrote:
 still the question.
 If you have sent DEREF request to the master, and the lockres became in-use
 again, then the lockres remains in the hash table and also in the purge list.
 So
 Yes, that's a possibility. But there is not much we could do to
 cover that window other than making the non-master nodes avoid
 such races. Patch 2/2 fixes one such race.

Yes, we should make non-master nodes avoid such races. But patch 2/2 only
fixes one such race :).
And we probably can't be sure how many such races exist. A known case is
dlm_mig_lockres_handler(): we drop dlm->spinlock and res->spinlock after
dlm_lookup_lockres(). In that window, dlm_thread() can set the DROPPING_REF
state, drop the spinlocks, and send the deref message. Before dlm_thread()
retakes the spinlocks, dlm_mig_lockres_handler() continues and adds a lock
(or locks) to the lockres, so the lockres becomes in use again. When
dlm_thread() retakes the spinlocks, it finds the lockres in use and keeps
it in the hash table and on the purge list.
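
(Spelled out as an interleaving, a simplified reading of the race above:)

	/*
	 * dlm_mig_lockres_handler()         dlm_thread()
	 * -------------------------         ------------
	 * res = dlm_lookup_lockres(...);
	 * drops dlm->spinlock and
	 * res->spinlock
	 *                                   takes both locks, sees res
	 *                                   unused, sets DROPPING_REF,
	 *                                   drops locks, sends DEREF
	 * adds a lock to res
	 * (res is in use again)
	 *                                   retakes the locks, finds res
	 *                                   in use, keeps it hashed and
	 *                                   on the purge list
	 */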

Your patch 2/2 fixes that problem well.
So far I have no good idea how to fix all such potential races.

regards,
wengang.

 1) If this node is the last ref, there is a possibility that the master
 purged the lockres after receiving DEREF request from this node. In this
 case, when this node does dlmlock_remote(), the lockres won't be found on
 the master. How to deal with it?
 patch 2/2 fixes this race. dlm_get_lock_resource will either wait
 for the lockres to get purged and start everything fresh, or mark
 the lockres in use so dlm_thread won't purge it.
 2) The lockres on this node is going to be purged again, it means it will
 send secondary DEREFs to the master. This is not good I think.
 right, not a good idea to send deref again. We have to fix those cases.
 A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after
 sending a DEREF request against this lockres. Also redo master request
 before locking on it.
 if you are referring to the hole in dlmlock_remote, patch 2/2 fixes
 it. Please review that patch and let me know :)
 Regards,
 wengang.



Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Sunil Mushran

On 06/17/2010 01:35 AM, Srinivas Eeda wrote:
 On 6/17/2010 1:32 AM, Joel Becker wrote:
  On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote:
   One way to skip a lockres in the purgelist is to list_del_init() and
   list_add_tail(). That simplifies the patch a lot.

   I have attached a quick & dirty patch. See if that satisfies all the
   requirements.

  As far as I can tell from reading the code, the time_after()
  check is because they are time ordered.  Wouldn't moving it to the end
  violate that?

 right. that's why I didn't want to move used lockres to tail :)


No. There is no need for time ordering. Or, am I missing something?

We delay the purge in case the file system changes its mind and wants
to reuse it. By delaying, we hold onto the mastery information. That's it.


Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Srinivas Eeda

On 6/17/2010 7:48 AM, Sunil Mushran wrote:
 On 06/17/2010 01:35 AM, Srinivas Eeda wrote:
  On 6/17/2010 1:32 AM, Joel Becker wrote:
   On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote:
    One way to skip a lockres in the purgelist is to list_del_init() and
    list_add_tail(). That simplifies the patch a lot.

    I have attached a quick & dirty patch. See if that satisfies all the
    requirements.

   As far as I can tell from reading the code, the time_after()
   check is because they are time ordered.  Wouldn't moving it to the end
   violate that?

  right. that's why I didn't want to move used lockres to tail :)

 No. There is no need for time ordering. Or, am I missing something?

 We delay the purge in case the file system changes its mind and wants
 to reuse it. By delaying, we hold onto the mastery information. That's it.

So, should we preserve this? Or purge the lockres when it's on the purge
list? Or could we just update the last_used and move it to the end of the
list if it can be purged?

Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Srinivas Eeda
Sunil,

as of now, there is still a window in dlm_get_lock_resource where it
finds the lockres but doesn't protect it from getting purged. The second
patch fixes this by marking it in use; can you please review that one as
well?

Thanks,
--Srini

On 6/17/2010 8:06 AM, Sunil Mushran wrote:
 On 06/15/2010 11:06 PM, Wengang Wang wrote:
 still the question.
 If you have sent DEREF request to the master, and the lockres became
 in-use again, then the lockres remains in the hash table and also in
 the purge list. So
 1) If this node is the last ref, there is a possibility that the master
 purged the lockres after receiving DEREF request from this node. In this
 case, when this node does dlmlock_remote(), the lockres won't be found
 on the master. How to deal with it?

 2) The lockres on this node is going to be purged again, it means it
 will send secondary DEREFs to the master. This is not good I think.

 A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after
 sending a DEREF request against this lockres. Also redo master request
 before locking on it.


 The fix we are working towards is to ensure that we set
 DLM_LOCK_RES_DROPPING_REF once we are determined
 to purge the lockres. As in, we should not let go of the spinlock
 before we have either set the flag or decided against purging
 that resource.

 Once the flag is set, new users looking up the resource via
 dlm_get_lock_resource() will notice the flag and will then wait
 for that flag to be cleared before looking up the lockres hash
 again. If all goes well, the lockres will not be found (because it
 has since been unhashed) and it will be forced to go thru the
 full mastery process.
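
(A minimal sketch of that ordering -- hypothetical, not the final patch;
it assumes the purge path holds both dlm->spinlock and res->spinlock at
the decision point:)

	int purge = 0;

	spin_lock(&dlm->spinlock);
	spin_lock(&res->spinlock);
	if (__dlm_lockres_unused(res)) {
		/* commit to purging while both locks are held so no
		 * lookup can slip in between the check and the flag */
		res->state |= DLM_LOCK_RES_DROPPING_REF;
		purge = 1;
	}
	spin_unlock(&res->spinlock);
	/* only after the flag is set (or purging abandoned) is it
	 * safe to drop dlm->spinlock and send the DEREF */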



Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Joel Becker
On Thu, Jun 17, 2010 at 07:48:38AM -0700, Sunil Mushran wrote:
 On 06/17/2010 01:35 AM, Srinivas Eeda wrote:
 On 6/17/2010 1:32 AM, Joel Becker wrote:
 As far as I can tell from reading the code, the time_after()
 check is because they are time ordered.  Wouldn't moving it to the end
 violate that?
 right. that's why I didn't want to move used lockres to tail :)
 
 
 No. There is no need for time ordering. Or, am I missing something?
 
 We delay the purge in case the file system changes its mind and wants
 to reuse it. By delaying, we hold onto the mastery information. That's it.

The comment is right there:

/* Since resources are added to the purge list
 * in tail order, we can stop at the first
 * unpurgable resource -- anyone added after
 * him will have a greater last_used value */
break;

So we're going to break out of this loop as soon as we see a later time.
Anything you've moved to the back will be ignored until enough time
passes that the things in front of it are candidates for purge.
You could, of course, just change the 'break' to a 'continue'
and find those, but I kind of like the short-circuit behavior.  I don't
know how long this list is, but walking a whole bunch of not-yet-purgable
lockreses just to hopefully find one at the end seems silly.
I think it is better to leave the busy ones at the front of the
list and just walk past them if they can't be dealt with now.
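
(For reference, the loop shape under discussion, condensed from
dlm_run_purge_list() in dlmthread.c; locking of the lockres itself and
error handling are elided:)

	spin_lock(&dlm->spinlock);
	while (run_max && !list_empty(&dlm->purge_list)) {
		run_max--;
		lockres = list_entry(dlm->purge_list.next,
				     struct dlm_lock_resource, purge);

		purge_jiffies = lockres->last_used +
			msecs_to_jiffies(DLM_PURGE_INTERVAL_MS);

		/* tail-ordered list: everything behind this lockres is
		 * even younger, so the walk can stop at the first entry
		 * that is too young to purge */
		if (!purge_now && time_after(purge_jiffies, jiffies))
			break;

		dlm_purge_lockres(dlm, lockres);
	}
	spin_unlock(&dlm->spinlock);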

Joel

-- 

Life's Little Instruction Book #157 

Take time to smell the roses.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127



Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Sunil Mushran
We don't have to update the last_used to move it to the tail.  If it is
now in use, that means it is about to be taken out of the purgelist.  We
just need to move it out of our way for the time being.

On Jun 17, 2010, at 9:55 AM, Srinivas Eeda srinivas.e...@oracle.com  
wrote:

 On 6/17/2010 7:48 AM, Sunil Mushran wrote:
  On 06/17/2010 01:35 AM, Srinivas Eeda wrote:
   On 6/17/2010 1:32 AM, Joel Becker wrote:
    On Wed, Jun 16, 2010 at 06:44:43PM -0700, Sunil Mushran wrote:
     One way to skip a lockres in the purgelist is to list_del_init() and
     list_add_tail(). That simplifies the patch a lot.

     I have attached a quick & dirty patch. See if that satisfies all the
     requirements.

    As far as I can tell from reading the code, the time_after()
    check is because they are time ordered.  Wouldn't moving it to the end
    violate that?

   right. that's why I didn't want to move used lockres to tail :)

  No. There is no need for time ordering. Or, am I missing something?

  We delay the purge in case the file system changes its mind and wants
  to reuse it. By delaying, we hold onto the mastery information. That's it.

 So, should we preserve this? Or purge the lockres when it's on the purge
 list? Or could we just update the last_used and move it to the end of the
 list if it can be purged?



Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: check dlm_state under spinlock

2010-06-17 Thread Sunil Mushran
On 06/15/2010 09:08 PM, Wengang Wang wrote:
 We should check dlm->dlm_state under dlm->spinlock though maybe in this
 case it doesn't hurt.


NAK.

dlm->dlm_state is protected by dlm_domain_lock which is held at that time.

 /* NOTE: Next three are protected by dlm_domain_lock */
 struct kref dlm_refs;
 enum dlm_ctxt_state dlm_state;
 unsigned int num_joins;
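
(In code terms, the pattern being defended is roughly this -- a sketch,
assuming the usual DLM_CTXT_JOINED check:)

	int joined;

	/* dlm->dlm_state is read under dlm_domain_lock,
	 * not dlm->spinlock */
	spin_lock(&dlm_domain_lock);
	joined = (dlm->dlm_state == DLM_CTXT_JOINED);
	spin_unlock(&dlm_domain_lock);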




Re: [Ocfs2-devel] [PATCH 2/2] ocfs2: o2dlm fix race in purge lockres and newlock (orabug 9094491)

2010-06-17 Thread Sunil Mushran
This patch looks ok on the surface. Should be usable. But checking into
the repo is another matter.

My problem is that this flag is very much like inflight_locks but
is not the same. inflight_locks are taken only on lockreses that are
mastered by the node. This new flag does the same but for non-mastered
locks too. A better solution would be to merge the two. And that means
cleaning up inflight_locks.

The reason for the NAK for check-in is that the code is adding
more messiness to an area that is already very messy.

Sunil

On 06/15/2010 09:43 PM, Srinivas Eeda wrote:
 This patch fixes the following hole.
 dlmlock tries to create a new lock on a lockres that is on the purge
 list. It calls dlm_get_lock_resource and later adds a lock to the
 blocked list. But in this window, dlm_thread can purge the lockres and
 unhash it. This will cause a BUG, as when the AST comes back from the
 master the lockres is not found.

 This patch marks the lockres with a new state DLM_LOCK_RES_IN_USE which
 would protect the lockres from dlm_thread purging it.

 Signed-off-by: Srinivas Eeda srinivas.e...@oracle.com
 ---
  fs/ocfs2/dlm/dlmcommon.h |    1 +
  fs/ocfs2/dlm/dlmlock.c   |    4 ++++
  fs/ocfs2/dlm/dlmmaster.c |    5 ++++-
  fs/ocfs2/dlm/dlmthread.c |    1 +
  4 files changed, 10 insertions(+), 1 deletions(-)

 diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
 index 0102be3..6cf3c08 100644
 --- a/fs/ocfs2/dlm/dlmcommon.h
 +++ b/fs/ocfs2/dlm/dlmcommon.h
 @@ -279,6 +279,7 @@ static inline void __dlm_set_joining_node(struct dlm_ctxt *dlm,
  #define DLM_LOCK_RES_DIRTY                0x0008
  #define DLM_LOCK_RES_IN_PROGRESS          0x0010
  #define DLM_LOCK_RES_MIGRATING            0x0020
 +#define DLM_LOCK_RES_IN_USE               0x0100
  #define DLM_LOCK_RES_DROPPING_REF         0x0040
  #define DLM_LOCK_RES_BLOCK_DIRTY          0x1000
  #define DLM_LOCK_RES_SETREF_INPROG        0x2000
 diff --git a/fs/ocfs2/dlm/dlmlock.c b/fs/ocfs2/dlm/dlmlock.c
 index 777..501ac40 100644
 --- a/fs/ocfs2/dlm/dlmlock.c
 +++ b/fs/ocfs2/dlm/dlmlock.c
 @@ -134,6 +134,8 @@ static enum dlm_status dlmlock_master(struct dlm_ctxt *dlm,
  	if (status != DLM_NORMAL &&
  	    lock->ml.node != dlm->node_num) {
  		/* erf.  state changed after lock was dropped. */
 +		/* DLM_LOCK_RES_IN_USE is set in dlm_get_lock_resource */
 +		res->state &= ~DLM_LOCK_RES_IN_USE;
  		spin_unlock(&res->spinlock);
  		dlm_error(status);
  		return status;
 @@ -180,6 +182,7 @@ static enum dlm_status dlmlock_master(struct dlm_ctxt *dlm,
  			kick_thread = 1;
  		}
  	}
 +	res->state &= ~DLM_LOCK_RES_IN_USE;
  	/* reduce the inflight count, this may result in the lockres
  	 * being purged below during calc_usage */
  	if (lock->ml.node == dlm->node_num)
 @@ -246,6 +249,7 @@ static enum dlm_status dlmlock_remote(struct dlm_ctxt *dlm,

  	spin_lock(&res->spinlock);
  	res->state &= ~DLM_LOCK_RES_IN_PROGRESS;
 +	res->state &= ~DLM_LOCK_RES_IN_USE;
  	lock->lock_pending = 0;
  	if (status != DLM_NORMAL) {
  		if (status == DLM_RECOVERING &&
 diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
 index 9289b43..f0f2d97 100644
 --- a/fs/ocfs2/dlm/dlmmaster.c
 +++ b/fs/ocfs2/dlm/dlmmaster.c
 @@ -719,6 +719,7 @@ lookup:
  	if (tmpres) {
  		int dropping_ref = 0;

 +		tmpres->state |= DLM_LOCK_RES_IN_USE;
  		spin_unlock(&dlm->spinlock);

  		spin_lock(&tmpres->spinlock);
 @@ -731,8 +732,10 @@ lookup:
  		if (tmpres->owner == dlm->node_num) {
  			BUG_ON(tmpres->state & DLM_LOCK_RES_DROPPING_REF);
  			dlm_lockres_grab_inflight_ref(dlm, tmpres);
 -		} else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF)
 +		} else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF) {
 +			tmpres->state &= ~DLM_LOCK_RES_IN_USE;
  			dropping_ref = 1;
 +		}
  		spin_unlock(&tmpres->spinlock);

  		/* wait until done messaging the master, drop our ref to allow
 diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
 index fb0be6c..2b82e0b 100644
 --- a/fs/ocfs2/dlm/dlmthread.c
 +++ b/fs/ocfs2/dlm/dlmthread.c
 @@ -93,6 +93,7 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
  int __dlm_lockres_unused(struct dlm_lock_resource *res)
  {
  	if (!__dlm_lockres_has_locks(res) &&
 +	    !(res->state & DLM_LOCK_RES_IN_USE) &&
  	    (list_empty(&res->dirty) && !(res->state & DLM_LOCK_RES_DIRTY))) {
  		/* try not to scan the bitmap unless the first two
  		 * conditions are already true */





Re: [Ocfs2-devel] [PATCH 1/2] ocfs2 fix o2dlm dlm run purgelist

2010-06-17 Thread Wengang Wang
On 10-06-17 08:06, Sunil Mushran wrote:
 On 06/15/2010 11:06 PM, Wengang Wang wrote:
 still the question.
 If you have sent DEREF request to the master, and the lockres became in-use
 again, then the lockres remains in the hash table and also in the purge list.
 So
 1) If this node is the last ref, there is a possibility that the master
 purged the lockres after receiving DEREF request from this node. In this
 case, when this node does dlmlock_remote(), the lockres won't be found on
 the master. How to deal with it?

 2) The lockres on this node is going to be purged again, it means it will
 send secondary DEREFs to the master. This is not good I think.

 A thought is setting lockres->owner to DLM_LOCK_RES_OWNER_UNKNOWN after
 sending a DEREF request against this lockres. Also redo master request
 before locking on it.
 
 The fix we are working towards is to ensure that we set
 DLM_LOCK_RES_DROPPING_REF once we are determined
 to purge the lockres. As in, we should not let go of the spinlock
 before we have either set the flag or decided against purging
 that resource.
 
 Once the flag is set, new users looking up the resource via
 dlm_get_lock_resource() will notice the flag and will then wait
 for that flag to be cleared before looking up the lockres hash
 again. If all goes well, the lockres will not be found (because it
 has since been unhashed) and it will be forced to go thru the
 full mastery process.

That is ideal.
But in many cases the lockres is not obtained via dlm_get_lock_resource();
it is obtained directly via dlm_lookup_lockres()/__dlm_lookup_lockres(),
which don't set the new IN_USE state. dlm_lookup_lockres() takes and drops
dlm->spinlock, and some callers of __dlm_lookup_lockres() drop the spinlock
as soon as they get the lockres. Such paths access the lockres later, after
dropping dlm->spinlock and res->spinlock.
So there is a window in which dlm_thread() gets a chance to take
dlm->spinlock and res->spinlock and set the DROPPING_REF state.
So whether a new user is protected depends on timing. If it finds the
lockres after the DROPPING_REF state is set, it works well. But if it finds
it before DROPPING_REF is set, that does not protect the lockres from
purging: even though the user got the lockres, the lockres can still be in
an unused state.

regards,
wengang.



Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: check dlm_state under spinlock

2010-06-17 Thread Wengang Wang
On 10-06-17 18:35, Sunil Mushran wrote:
 On 06/15/2010 09:08 PM, Wengang Wang wrote:
 We should check dlm->dlm_state under dlm->spinlock though maybe in this
 case it doesn't hurt.
 
 NAK.
 
 dlm->dlm_state is protected by dlm_domain_lock which is held at that time.
 
 /* NOTE: Next three are protected by dlm_domain_lock */
 struct kref dlm_refs;
 enum dlm_ctxt_state dlm_state;
 unsigned int num_joins;

Yes, that's true. Thanks.



[Ocfs2-devel] [PATCH 0/2] ocfs2: make xattr work with new local alloc reservation.

2010-06-17 Thread Tao Ma
Hi all,
The old xattr code assumes that we can allocate enough contiguous
clusters from the allocator; that held with the old local alloc
allocator. But now that Mark has added reservation code to ocfs2, that
is no longer a good assumption. So these 2 patches try to make xattr
work with it.

Regards,
Tao



[Ocfs2-devel] [PATCH] ocfs2: Make xattr reflink work with new local alloc reservation.

2010-06-17 Thread Tao Ma
The new reservation code in local alloc has added the limitation
that the caller should handle the case that the local alloc
doesn't give us enough contiguous clusters. That breaks the old
xattr reflink code.

So this patch updates the xattr reflink code so that it can
handle the case where local alloc gives us one cluster at a time.

Signed-off-by: Tao Ma tao...@oralce.com
---
 fs/ocfs2/xattr.c |  125 ++---
 1 files changed, 89 insertions(+), 36 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 894734b..bc52840 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -6805,16 +6805,15 @@ out:
 	return ret;
 }
 
-static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+static int ocfs2_reflink_xattr_bucket(handle_t *handle,
 				u64 blkno, u64 new_blkno, u32 clusters,
+				u32 *cpos, int num_buckets,
 				struct ocfs2_alloc_context *meta_ac,
 				struct ocfs2_alloc_context *data_ac,
 				struct ocfs2_reflink_xattr_tree_args *args)
 {
 	int i, j, ret = 0;
 	struct super_block *sb = args->reflink->old_inode->i_sb;
-	u32 bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(sb));
-	u32 num_buckets = clusters * bpc;
 	int bpb = args->old_bucket->bu_blocks;
 	struct ocfs2_xattr_value_buf vb = {
 		.vb_access = ocfs2_journal_access,
@@ -6833,14 +6832,6 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			break;
 		}
 
-		/*
-		 * The real bucket num in this series of blocks is stored
-		 * in the 1st bucket.
-		 */
-		if (i == 0)
-			num_buckets = le16_to_cpu(
-				bucket_xh(args->old_bucket)->xh_num_buckets);
-
 		ret = ocfs2_xattr_bucket_journal_access(handle,
 						args->new_bucket,
 						OCFS2_JOURNAL_ACCESS_CREATE);
@@ -6854,6 +6845,18 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			       bucket_block(args->old_bucket, j),
 			       sb->s_blocksize);
 
+		/*
+		 * Record the start cpos so that we can use it to initialize
+		 * our xattr tree we also set the xh_num_buckets for the new
+		 * bucket.
+		 */
+		if (i == 0) {
+			*cpos = le32_to_cpu(bucket_xh(args->new_bucket)->
+					    xh_entries[0].xe_name_hash);
+			bucket_xh(args->new_bucket)->xh_num_buckets =
+				cpu_to_le16(num_buckets);
+		}
+
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
 
 		ret = ocfs2_reflink_xattr_header(handle, args->reflink,
@@ -6883,6 +6886,7 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 		}
 
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
+
 		ocfs2_xattr_bucket_relse(args->old_bucket);
 		ocfs2_xattr_bucket_relse(args->new_bucket);
 	}
@@ -6891,6 +6895,74 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 	ocfs2_xattr_bucket_relse(args->new_bucket);
 	return ret;
 }
+
+static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+				struct inode *inode,
+				struct ocfs2_reflink_xattr_tree_args *args,
+				struct ocfs2_extent_tree *et,
+				struct ocfs2_alloc_context *meta_ac,
+				struct ocfs2_alloc_context *data_ac,
+				u64 blkno, u32 cpos, u32 len)
+{
+	int ret, num_buckets, reflink_buckets, first_inserted = 0;
+	u32 p_cluster, num_clusters, reflink_cpos = 0;
+	u64 new_blkno;
+	int bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb));
+
+	ret = ocfs2_read_xattr_bucket(args->old_bucket, blkno);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+	num_buckets = le16_to_cpu(bucket_xh(args->old_bucket)->xh_num_buckets);
+	ocfs2_xattr_bucket_relse(args->old_bucket);
+
+	while (len && num_buckets) {
+		ret = ocfs2_claim_clusters(handle, data_ac,
+					   1, &p_cluster, &num_clusters);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+
+		new_blkno = ocfs2_clusters_to_blocks(inode->i_sb, p_cluster);
+		reflink_buckets = num_buckets < bpc * num_clusters ?
+				  num_buckets : bpc * num_clusters;
+
+		ret = ocfs2_reflink_xattr_bucket(handle,

[Ocfs2-devel] [PATCH v2] ocfs2: Make xattr reflink work with new local alloc reservation.

2010-06-17 Thread Tao Ma
On 06/18/2010 11:02 AM, Tao Ma wrote:
 The new reservation code in local alloc has added the limitation
 that the caller should handle the case that the local alloc
 doesn't give us enough contiguous clusters. That breaks the old
 xattr reflink code.

 So this patch updates the xattr reflink code so that it can
 handle the case where local alloc gives us one cluster at a time.

 Signed-off-by: Tao Ma tao...@oralce.com
oh, I used the wrong sob here. ;) Here is the updated one.

Regards,
Tao

From 5b714e5a16c6ff912fcd720b916d4324f6151d16 Mon Sep 17 00:00:00 2001
From: Tao Ma tao...@oracle.com
Date: Fri, 18 Jun 2010 10:38:48 +0800
Subject: [PATCH v2] ocfs2: Make xattr reflink work with new local alloc reservation.

The new reservation code in local alloc has added the limitation
that the caller should handle the case that the local alloc
doesn't give us enough contiguous clusters. That breaks the old
xattr reflink code.

So this patch updates the xattr reflink code so that it can
handle the case where local alloc gives us one cluster at a time.

Signed-off-by: Tao Ma tao...@oracle.com
---
 fs/ocfs2/xattr.c |  125 ++---
 1 files changed, 89 insertions(+), 36 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 894734b..bc52840 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -6805,16 +6805,15 @@ out:
 	return ret;
 }
 
-static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+static int ocfs2_reflink_xattr_bucket(handle_t *handle,
 				u64 blkno, u64 new_blkno, u32 clusters,
+				u32 *cpos, int num_buckets,
 				struct ocfs2_alloc_context *meta_ac,
 				struct ocfs2_alloc_context *data_ac,
 				struct ocfs2_reflink_xattr_tree_args *args)
 {
 	int i, j, ret = 0;
 	struct super_block *sb = args->reflink->old_inode->i_sb;
-	u32 bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(sb));
-	u32 num_buckets = clusters * bpc;
 	int bpb = args->old_bucket->bu_blocks;
 	struct ocfs2_xattr_value_buf vb = {
 		.vb_access = ocfs2_journal_access,
@@ -6833,14 +6832,6 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			break;
 		}
 
-		/*
-		 * The real bucket num in this series of blocks is stored
-		 * in the 1st bucket.
-		 */
-		if (i == 0)
-			num_buckets = le16_to_cpu(
-				bucket_xh(args->old_bucket)->xh_num_buckets);
-
 		ret = ocfs2_xattr_bucket_journal_access(handle,
 						args->new_bucket,
 						OCFS2_JOURNAL_ACCESS_CREATE);
@@ -6854,6 +6845,18 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 			       bucket_block(args->old_bucket, j),
 			       sb->s_blocksize);
 
+		/*
+		 * Record the start cpos so that we can use it to initialize
+		 * our xattr tree we also set the xh_num_buckets for the new
+		 * bucket.
+		 */
+		if (i == 0) {
+			*cpos = le32_to_cpu(bucket_xh(args->new_bucket)->
+					    xh_entries[0].xe_name_hash);
+			bucket_xh(args->new_bucket)->xh_num_buckets =
+				cpu_to_le16(num_buckets);
+		}
+
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
 
 		ret = ocfs2_reflink_xattr_header(handle, args->reflink,
@@ -6883,6 +6886,7 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 		}
 
 		ocfs2_xattr_bucket_journal_dirty(handle, args->new_bucket);
+
 		ocfs2_xattr_bucket_relse(args->old_bucket);
 		ocfs2_xattr_bucket_relse(args->new_bucket);
 	}
@@ -6891,6 +6895,74 @@ static int ocfs2_reflink_xattr_buckets(handle_t *handle,
 	ocfs2_xattr_bucket_relse(args->new_bucket);
 	return ret;
 }
+
+static int ocfs2_reflink_xattr_buckets(handle_t *handle,
+				struct inode *inode,
+				struct ocfs2_reflink_xattr_tree_args *args,
+				struct ocfs2_extent_tree *et,
+				struct ocfs2_alloc_context *meta_ac,
+				struct ocfs2_alloc_context *data_ac,
+				u64 blkno, u32 cpos, u32 len)
+{
+	int ret, num_buckets, reflink_buckets, first_inserted = 0;
+	u32 p_cluster, num_clusters, reflink_cpos = 0;
+	u64 new_blkno;
+	int bpc = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb));
+
+	ret = ocfs2_read_xattr_bucket(args->old_bucket, blkno);
+	if (ret) {
+		mlog_errno(ret);
+

[Ocfs2-devel] [PATCH] ocfs2: make xattr extension work with new local alloc reservation.

2010-06-17 Thread Tao Ma
New local alloc reservation can give us fewer clusters than
we want and ask us to restart the transaction, so let
ocfs2_xattr_extend_allocation restart the transaction in
this case.

Signed-off-by: Tao Ma tao...@oracle.com
---
 fs/ocfs2/xattr.c |   33 +
 1 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index e97b348..894734b 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -709,7 +709,7 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 					 struct ocfs2_xattr_value_buf *vb,
 					 struct ocfs2_xattr_set_ctxt *ctxt)
 {
-	int status = 0;
+	int status = 0, credits;
 	handle_t *handle = ctxt->handle;
 	enum ocfs2_alloc_restarted why;
 	u32 prev_clusters, logical_start = le32_to_cpu(vb->vb_xv->xr_clusters);
@@ -719,6 +719,7 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 
 	ocfs2_init_xattr_value_extent_tree(&et, INODE_CACHE(inode), vb);
 
+restarted_transaction:
 	status = vb->vb_access(handle, INODE_CACHE(inode), vb->vb_bh,
 			      OCFS2_JOURNAL_ACCESS_WRITE);
 	if (status < 0) {
@@ -735,8 +736,9 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 					     ctxt->data_ac,
 					     ctxt->meta_ac,
 					     &why);
-	if (status < 0) {
-		mlog_errno(status);
+	if ((status < 0) && (status != -EAGAIN)) {
+		if (status != -ENOSPC)
+			mlog_errno(status);
 		goto leave;
 	}
 
@@ -744,11 +746,26 @@ static int ocfs2_xattr_extend_allocation(struct inode *inode,
 
 	clusters_to_add -= le32_to_cpu(vb->vb_xv->xr_clusters) - prev_clusters;
 
-	/*
-	 * We should have already allocated enough space before the transaction,
-	 * so no need to restart.
-	 */
-	BUG_ON(why != RESTART_NONE || clusters_to_add);
+	if (why != RESTART_NONE && clusters_to_add) {
+		/*
+		 * We can only fail in case the alloc file doesn't give up
+		 * enough clusters.
+		 */
+		BUG_ON(why == RESTART_META);
+
+		mlog(0, "restarting xattr value extension for %u clusters,.\n",
+		     clusters_to_add);
+		credits = ocfs2_calc_extend_credits(inode->i_sb,
+						    &vb->vb_xv->xr_list,
+						    clusters_to_add);
+		status = ocfs2_extend_trans(handle, credits);
+		if (status < 0) {
+			status = -ENOMEM;
+			mlog_errno(status);
+			goto leave;
+		}
+		goto restarted_transaction;
+	}
 
 leave:
 
-- 
1.5.5

