Re: [Ocfs2-devel] [PATCH 0/3] ocfs2: fix slow deleting

2011-07-07 Thread Srinivas Eeda
On 7/5/2011 11:17 PM, Sunil Mushran wrote:
 2. All nodes have to scan all slots. Even live slots. I remember we
 did it for a reason. And that reason should be in the comment in the patch
 written by Srini.
When a node unlinks a file it inserts an entry into its own orphan
slot. If another node is the last one to close the file and the dentry has
been flushed, that node will not do the cleanup because it doesn't know the
file was orphaned. The file will then remain in the orphan slot until the
node that owns the slot unmounts and the slot is reused. To overcome this
problem a node has to rescan all slots (including live slots) and try to do
the cleanup.
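
As a rough illustration of what that rescan means (a sketch only, not the
literal code in journal.c; it reuses the ocfs2_queue_orphans() signature
shown in these patches and the osb->max_slots field):

	/*
	 * Every node periodically walks every slot, its own included, so an
	 * orphan inserted by node A but last closed on node B still gets
	 * reclaimed by whichever node scans slot A next.
	 */
	for (slot = 0; slot < osb->max_slots; slot++) {
		struct inode *head = NULL;

		ocfs2_queue_orphans(osb, slot, &head);
		/* ... then iput()/wipe each queued inode ... */
	}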





Re: [Ocfs2-devel] [PATCH 0/3] ocfs2: fix slow deleting

2011-07-07 Thread Sunil Mushran
On 07/06/2011 11:19 PM, Srinivas Eeda wrote:
 On 7/5/2011 11:17 PM, Sunil Mushran wrote:
 2. All nodes have to scan all slots. Even live slots. I remember we
 did it for a reason. And that reason should be in the comment in the patch
 written by Srini.
 When a node unlinks a file it inserts an entry into its own orphan
 slot. If another node is the last one to close the file and the dentry has
 been flushed, that node will not do the cleanup because it doesn't know the
 file was orphaned. The file will then remain in the orphan slot until the
 node that owns the slot unmounts and the slot is reused. To overcome this
 problem a node has to rescan all slots (including live slots) and try to do
 the cleanup.

The question is not why we are scanning all live slots on all nodes. As in,
why not just recover only the local slot. There was a reason for that.
Yes, we have to recover unused slots for the reason listed previously.



Re: [Ocfs2-devel] [PATCH 3/3] ocfs2: drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans

2011-07-07 Thread Sunil Mushran
This is too hacky for mainline. May work as a temp fix only.

This assumes that we only need to recover orphans in our own slot.
If that is the case, then if nothing else, we can remove the orphan
scan lock. We added the lock so as to ensure all nodes get a chance
to replay it. If all nodes are doing their own, then why do we need
the lock?

On 07/05/2011 10:11 PM, Wengang Wang wrote:
 drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans.

 Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
 ---
   fs/ocfs2/journal.c |   40 ++--
   1 files changed, 34 insertions(+), 6 deletions(-)

 diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
 index 7994823..e3b8340 100644
 --- a/fs/ocfs2/journal.c
 +++ b/fs/ocfs2/journal.c
 @@ -56,6 +56,7 @@
   DEFINE_SPINLOCK(trans_inc_lock);

   #define ORPHAN_SCAN_SCHEDULE_TIMEOUT 30
 +#define ORPHAN_SCAN_ITEMS_PER_LOOP 200

   static int ocfs2_force_read_journal(struct inode *inode);
   static int ocfs2_recover_node(struct ocfs2_super *osb,
 @@ -1949,6 +1950,8 @@ void ocfs2_orphan_scan_start(struct ocfs2_super *osb)
   struct ocfs2_orphan_filldir_priv {
   struct inode*head;
   struct ocfs2_super  *osb;
 + int count;
 + int for_live:1;
   };

   static int ocfs2_orphan_filldir(void *priv, const char *name, int name_len,
 @@ -1957,6 +1960,9 @@ static int ocfs2_orphan_filldir(void *priv, const char 
 *name, int name_len,
   struct ocfs2_orphan_filldir_priv *p = priv;
   struct inode *iter;

 + if (p->for_live)
 + BUG_ON(p->count > ORPHAN_SCAN_ITEMS_PER_LOOP);
 +
   if (name_len == 1 && !strncmp(".", name, 1))
   return 0;
   if (name_len == 2 && !strncmp("..", name, 2))
 @@ -1974,20 +1980,38 @@ static int ocfs2_orphan_filldir(void *priv, const 
 char *name, int name_len,
   OCFS2_I(iter)->ip_next_orphan = p->head;
   p->head = iter;

 + /*
 +  * we fill at most ORPHAN_SCAN_ITEMS_PER_LOOP direntry in a loop with
 +  * the orphan_dir.i_mutex locked.
 +  */
 + if (p->for_live) {
 + if (++p->count == ORPHAN_SCAN_ITEMS_PER_LOOP)
 + return -EAGAIN;
 + }
   return 0;
   }

   static int ocfs2_queue_orphans(struct ocfs2_super *osb,
  int slot,
 -struct inode **head)
 +struct inode **head,
 +loff_t *pos)
   {
   int status;
   struct inode *orphan_dir_inode = NULL;
   struct ocfs2_orphan_filldir_priv priv;
 - loff_t pos = 0;
 + unsigned int node_num;

   priv.osb = osb;
   priv.head = *head;
 + priv.count = 0;
 +
 + spin_lock(&osb->osb_lock);
 + status = ocfs2_slot_to_node_num_locked(osb, slot, &node_num);
 + spin_unlock(&osb->osb_lock);
 + if (!status && node_num == osb->node_num)
 + priv.for_live = true;
 + else
 + priv.for_live = false;

You cannot return (save) pos and yet lookup the slotnum again. That's
not the right order. Too hackish. All these functions will need to be
re-prototyped if your approach is what we use.
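
(Purely as an illustration of such a re-prototype, with the extra parameter
names made up: the resume position and the live/dead decision could be
passed down explicitly,

	static int ocfs2_queue_orphans(struct ocfs2_super *osb, int slot,
				       struct inode **head, loff_t *pos,
				       int for_live);

so the slot-to-node lookup is done once by the caller, before any position
is saved.)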


   orphan_dir_inode = ocfs2_get_system_file_inode(osb,
  ORPHAN_DIR_SYSTEM_INODE,
 @@ -2005,9 +2029,9 @@ static int ocfs2_queue_orphans(struct ocfs2_super *osb,
   goto out;
   }

 - status = ocfs2_dir_foreach(orphan_dir_inode, &pos, &priv,
 + status = ocfs2_dir_foreach(orphan_dir_inode, pos, &priv,
  ocfs2_orphan_filldir);
 - if (status) {
 + if (status && status != -EAGAIN) {
   mlog_errno(status);
   goto out_cluster;
   }
 @@ -2083,16 +2107,18 @@ static int ocfs2_recover_orphans(struct ocfs2_super 
 *osb,
   struct inode *inode = NULL;
   struct inode *iter;
   struct ocfs2_inode_info *oi;
 + loff_t pos = 0;

   trace_ocfs2_recover_orphans(slot);

 +cnt_scan:
   ocfs2_mark_recovering_orphan_dir(osb, slot);
 - ret = ocfs2_queue_orphans(osb, slot, &inode);
 + ret = ocfs2_queue_orphans(osb, slot, &inode, &pos);
   ocfs2_clear_recovering_orphan_dir(osb, slot);

The slot is being marked and cleared multiple times. I am unsure what the
ramifications are, but it does not look right.

   /* Error here should be noted, but we want to continue with as
* many queued inodes as we've got. */
 - if (ret)
 + if (ret && ret != -EAGAIN)
   mlog_errno(ret);

   while (inode) {
 @@ -2119,6 +2145,8 @@ static int ocfs2_recover_orphans(struct ocfs2_super 
 *osb,
   inode = iter;
   }

 + if (ret == -EAGAIN)
 + goto cnt_scan;
   return ret;
   }





Re: [Ocfs2-devel] [PATCH 2/3] ocfs2: let ocfs2_dir_foreach return the error returned by filldir

2011-07-07 Thread Wengang Wang
On 11-07-07 13:09, Sunil Mushran wrote:
 On 07/05/2011 09:40 PM, Wengang Wang wrote:
 Let ocfs2_dir_foreach return the error returned by filldir.
 
 Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
 ---
   fs/ocfs2/dir.c |2 ++
   1 files changed, 2 insertions(+), 0 deletions(-)
 
 diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
 index 8582e3f..6d7560a 100644
 --- a/fs/ocfs2/dir.c
 +++ b/fs/ocfs2/dir.c
 @@ -2005,6 +2005,8 @@ int ocfs2_dir_foreach(struct inode *inode, loff_t 
 *f_pos, void *priv,
 
  if (ret > 0)
  ret = -EIO;
 +if (!ret)
 +ret = filldir_err;
 
  return 0;
   }
 
 That's not enough. Shouldn't it be return ret too?

If necessary, we can return both ret and filldir_err.
 
 Have you triggered this error manually? I have never seen reports of it
 before.

No error here. This change is only needed for the following patch:
[PATCH 3/3] ocfs2: drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans.
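
For illustration, folding both errors into the return value might look
roughly like this at the tail of ocfs2_dir_foreach() (a sketch, not the
posted patch):

	if (ret > 0)
		ret = -EIO;
	if (!ret)
		ret = filldir_err;	/* e.g. -EAGAIN from ocfs2_orphan_filldir */
	return ret;			/* rather than the current "return 0" */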

thanks,
wengang.



Re: [Ocfs2-devel] [PATCH 3/3] ocfs2: drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans

2011-07-07 Thread Wengang Wang
On 11-07-07 14:48, Sunil Mushran wrote:
 This is too hacky for mainline. May work as a temp fix only.

:D

 
 This assumes that we only need to recover orphans in our own slot.
 If that is the case, then if nothing else, we can remove the orphan
 scan lock. We added the lock so as to ensure all nodes get a chance
 to replay it. If all nodes are doing their own, then why do we need
 the lock?

Actually, for live nodes, yes we only recover our own slot. For the
dead slots, we need the orphan scan lock.

 
 On 07/05/2011 10:11 PM, Wengang Wang wrote:
 drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans.
 
 Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
 ---
   fs/ocfs2/journal.c |   40 ++--
   1 files changed, 34 insertions(+), 6 deletions(-)
 
 diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
 index 7994823..e3b8340 100644
 --- a/fs/ocfs2/journal.c
 +++ b/fs/ocfs2/journal.c
 @@ -56,6 +56,7 @@
   DEFINE_SPINLOCK(trans_inc_lock);
 
   #define ORPHAN_SCAN_SCHEDULE_TIMEOUT 30
 +#define ORPHAN_SCAN_ITEMS_PER_LOOP 200
 
   static int ocfs2_force_read_journal(struct inode *inode);
   static int ocfs2_recover_node(struct ocfs2_super *osb,
 @@ -1949,6 +1950,8 @@ void ocfs2_orphan_scan_start(struct ocfs2_super *osb)
   struct ocfs2_orphan_filldir_priv {
  struct inode*head;
  struct ocfs2_super  *osb;
 +int count;
 +int for_live:1;
   };
 
   static int ocfs2_orphan_filldir(void *priv, const char *name, int name_len,
 @@ -1957,6 +1960,9 @@ static int ocfs2_orphan_filldir(void *priv, const char 
 *name, int name_len,
  struct ocfs2_orphan_filldir_priv *p = priv;
  struct inode *iter;
 
 +if (p->for_live)
 +BUG_ON(p->count > ORPHAN_SCAN_ITEMS_PER_LOOP);
 +
  if (name_len == 1 && !strncmp(".", name, 1))
  return 0;
  if (name_len == 2 && !strncmp("..", name, 2))
 @@ -1974,20 +1980,38 @@ static int ocfs2_orphan_filldir(void *priv, const 
 char *name, int name_len,
  OCFS2_I(iter)->ip_next_orphan = p->head;
  p->head = iter;
 
 +/*
 + * we fill at most ORPHAN_SCAN_ITEMS_PER_LOOP direntry in a loop with
 + * the orphan_dir.i_mutex locked.
 + */
 +if (p->for_live) {
 +if (++p->count == ORPHAN_SCAN_ITEMS_PER_LOOP)
 +return -EAGAIN;
 +}
  return 0;
   }
 
   static int ocfs2_queue_orphans(struct ocfs2_super *osb,
 int slot,
 -   struct inode **head)
 +   struct inode **head,
 +   loff_t *pos)
   {
  int status;
  struct inode *orphan_dir_inode = NULL;
  struct ocfs2_orphan_filldir_priv priv;
 -loff_t pos = 0;
 +unsigned int node_num;
 
  priv.osb = osb;
  priv.head = *head;
 +priv.count = 0;
 +
 +spin_lock(&osb->osb_lock);
 +status = ocfs2_slot_to_node_num_locked(osb, slot, &node_num);
 +spin_unlock(&osb->osb_lock);
 +if (!status && node_num == osb->node_num)
 +priv.for_live = true;
 +else
 +priv.for_live = false;
 
 You cannot return (save) pos and yet lookup the slotnum again. That's
 not the right order. Too hackish. All these functions will need to be
 re-prototyped if your approach is what we use.

Yes, I also think it's hackish.
But I don't have a better fix so far. Though it's hackish, it works.
It is just like ocfs2_readdir(). If a new direntry is added before the
saved position, we just leave it for the next run; if one is added after
the position, we pick it up; if a direntry is removed, meaning the
file/inode has been freed, that is fine. It's safe.
If we restart from offset 0 and all the inodes in the first 200 (or
whatever the limit is) are not freed and won't be freed (say they are
still open), we end up rescanning the same entries in a bad loop.
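
To make the resume behaviour concrete, a simplified sketch of the rescan
loop the patch builds (names taken from the patch; error handling and the
inode wipe are omitted):

	loff_t pos = 0;
	struct inode *inode = NULL;
	int ret;

	do {
		/* for the live slot this queues at most
		 * ORPHAN_SCAN_ITEMS_PER_LOOP entries per pass, resuming
		 * from the saved directory offset */
		ret = ocfs2_queue_orphans(osb, slot, &inode, &pos);
		/* ... iput()/wipe the queued inodes here ... */
	} while (ret == -EAGAIN);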

 
 
  orphan_dir_inode = ocfs2_get_system_file_inode(osb,
 ORPHAN_DIR_SYSTEM_INODE,
 @@ -2005,9 +2029,9 @@ static int ocfs2_queue_orphans(struct ocfs2_super *osb,
  goto out;
  }
 
 -status = ocfs2_dir_foreach(orphan_dir_inode, &pos, &priv,
 +status = ocfs2_dir_foreach(orphan_dir_inode, pos, &priv,
 ocfs2_orphan_filldir);
 -if (status) {
 +if (status && status != -EAGAIN) {
  mlog_errno(status);
  goto out_cluster;
  }
 @@ -2083,16 +2107,18 @@ static int ocfs2_recover_orphans(struct ocfs2_super 
 *osb,
  struct inode *inode = NULL;
  struct inode *iter;
  struct ocfs2_inode_info *oi;
 +loff_t pos = 0;
 
  trace_ocfs2_recover_orphans(slot);
 
 +cnt_scan:
  ocfs2_mark_recovering_orphan_dir(osb, slot);
 -ret = ocfs2_queue_orphans(osb, slot, &inode);
 +ret = ocfs2_queue_orphans(osb, slot, &inode, &pos);
  ocfs2_clear_recovering_orphan_dir(osb, slot);
 
 The slot is being marked and cleared multiple times. I am unsure what the
 ramifications are, but it does not look right.

I will look at this more...

Re: [Ocfs2-devel] [PATCH 2/3] ocfs2: let ocfs2_dir_foreach return the error returned by filldir

2011-07-07 Thread Wengang Wang
On 11-07-08 10:07, Wengang Wang wrote:
 On 11-07-07 13:09, Sunil Mushran wrote:
  On 07/05/2011 09:40 PM, Wengang Wang wrote:
   Let ocfs2_dir_foreach return the error returned by filldir.
  
   Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
  ---
fs/ocfs2/dir.c |2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
  
  diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
  index 8582e3f..6d7560a 100644
  --- a/fs/ocfs2/dir.c
  +++ b/fs/ocfs2/dir.c
  @@ -2005,6 +2005,8 @@ int ocfs2_dir_foreach(struct inode *inode, loff_t 
  *f_pos, void *priv,
  
  if (ret > 0)
 ret = -EIO;
  +  if (!ret)
  +  ret = filldir_err;
  
 return 0;

I meant return ret here.

}
  
  That's not enough. Shouldn't it be return ret too?
 
 If necessary, we can return both ret and filldir_err.
  
  Have you triggered this error manually? I have never seen reports of it
  before.
 
 No error here. This change is only needed for the following patch:
 [PATCH 3/3] ocfs2: drop and retake orphan_dir.i_mutex in 
 ocfs2_recover_orphans.
 
 thanks,
 wengang.



Re: [Ocfs2-devel] [PATCH 3/3] ocfs2: drop and retake orphan_dir.i_mutex in ocfs2_recover_orphans

2011-07-07 Thread Wengang Wang
On 11-07-08 10:28, Wengang Wang wrote:
 On 11-07-07 14:48, Sunil Mushran wrote:
  This is too hacky for mainline. May work as a temp fix only.

snip

 trace_ocfs2_recover_orphans(slot);
  
  +cnt_scan:
 ocfs2_mark_recovering_orphan_dir(osb, slot);
 -  ret = ocfs2_queue_orphans(osb, slot, &inode);
 +  ret = ocfs2_queue_orphans(osb, slot, &inode, &pos);
 ocfs2_clear_recovering_orphan_dir(osb, slot);
  
  The slot is being marked and cleared multiple times. I am unsure what the
  ramifications are, but it does not look right.
 
 I will look at this more...

Per my understanding, ocfs2_mark_recovering_orphan_dir() is used only to
avoid deadlock.

It is an ABBA lock:
A: inode lock (cluster lock) on the inode to delete
B: inode lock on the orphan_dir inode

The ocfs2_delete_inode() path takes them in AB order.
The ocfs2_recover_orphans() path takes them in BA order on a different node.

Since A and B are both acquired and then released within
ocfs2_queue_orphans(), calling ocfs2_mark_recovering_orphan_dir() multiple
times is not a problem.
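
Roughly, as pseudo-code of the ordering described above (a sketch of the
two paths, not the exact call sites):

	/* node 1: ocfs2_delete_inode() */
	lock(A);	/* cluster inode lock on the inode being deleted */
	lock(B);	/* cluster inode lock on the orphan_dir inode    */

	/* node 2: ocfs2_recover_orphans() -> ocfs2_queue_orphans() */
	lock(B);	/* orphan_dir first ...                          */
	lock(A);	/* ... then the orphan inode: the ABBA inversion */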

Opinions?

thanks,
wengang.
