Sam, Pete,
Would you guys be interested in trying out the attached patch?
Note: It might crash your machine :(
I think this should fix the problem.. let me know if it doesn't
thanks,
Murali

On 8/15/07, Sam Lang <[EMAIL PROTECTED]> wrote:
>
> On Aug 15, 2007, at 7:00 PM, Murali Vilayannur wrote:
>
> > Oops: Forget to do a reply-to-all..
> >
> > Pete,
> > Weird..
> > d_revalidate is the FS callback to tell the VFS that the dentry is no
> > longer valid
> > and in this case it looks like it rightfully returns 0 indicating that
> > the entry is longer to be trusted.
> >
> >        int status = dentry->d_op->d_revalidate(dentry, nd);
> >         if (unlikely(status <= 0)) {
> >                 /*
> >                  * The dentry failed validation.
> >                  * If d_revalidate returned 0 attempt to invalidate
> >                  * the dentry otherwise d_revalidate is asking us
> >                  * to return a fail status.
> >                  */
> >                 if (!status) {
> >                         if (!d_invalidate(dentry)) { <-- will try to
> > invalidate the dentry
> >                                 dput(dentry);  <-- drop the reference
> >                                 dentry = NULL; <-- set it to NULL
> >                         }
> >                 } else {
> >                         dput(dentry);
> >                         dentry = ERR_PTR(status);
> >                 }
> > return dentry <-- gets returned.
> >
> > This percolates upwards
> >
> >         dentry = cached_lookup(base, name, nd);
> >         if (!dentry) {
> >                 struct dentry *new = d_alloc(base, name); <-- would
> > have allocated a new dentry
> >                 dentry = ERR_PTR(-ENOMEM);
> >                 if (!new)
> >                         goto out;
> >                 dentry = inode->i_op->lookup(inode, new, nd); <--
> > callsinto our lookup
> >                 if (!dentry)
> >                         dentry = new;
> >                 else
> >                         dput(new);
> >         }
> >
> > I am confused on why we are doing a getattr on the stale handle unless
> > the dentry was somehow not being invalidated i.e. d_invalidate()
> > failed perhaps?
> > In any case d_drop() is the API to drop the dentry which is what
> > d_invalidate() should have invoked..
> > btw, Were there are any files under the "foo" directory?
> >
> > Can you bump up the loglevels (include lookup GOSSIP_NAME_DEBUG,
> > getattr GOSSIP_GETATTR_DEBUG) and send it to me?
> > I will run this workload tonight and see if I can repro this..
> > This needs more than 1 machine, right?
>
> Yep.  When Kevin discovered this problem, I think it was with two
> client nodes and four MPI processes.  I think Pete was using the same
> setup.  This shouldn't need more than one metadata server to reproduce.
>
> -sam
>
>
> > thanks,
> > Murali
> >
> > On 8/15/07, Pete Wyckoff <[EMAIL PROTECTED]> wrote:
> >> Sam and I have been tracking down a pvfs bug when using the VFS
> >> interface.  Kevin discovered it.
> >>
> >> The code is test #7 in simul.  It runs in parallel, four tasks,
> >> tasks 0 and 1 on node1 and tasks 2 and 3 on node2.  It does:
> >>
> >>     if (task == 0)
> >>         mkdir("foo");
> >>
> >>     MPI_Barrier();
> >>     sleep(3);
> >>
> >>     stat("foo");
> >>
> >>     if (task == 0)
> >>         rmdir("foo");
> >>
> >> On a freshly initialized pvfs (1 server for both md + io), it works.
> >> Task 0 creates the directory, and all four tasks stat it
> >> successfully.  When the process exits, the directory is indeed gone.
> >>
> >> The second time you run it, tasks 2 and 3 (on node2) get -ENOENT
> >> from the stat, but tasks 0 and 1 work fine as before and the
> >> directory was indeed created properly.
> >>
> >> Looking down a bit further, the server sees lookup requests from
> >> tasks 2 and 3, and returns the proper handle Id.  Then it sees
> >> getattr requests from tasks 2 and 3 for the handle ID that the
> >> directory had on the first run, not the handle ID for this run.
> >>
> >> We may have traced this to the kernel module.  Some of the log looks
> >> like this:
> >>
> >>     pvfs2_d_revalidate_common: called on dentry ffff81003df86970.
> >>     pvfs2_d_revalidate_common: parent found.
> >>     pvfs2_d_revalidate_common: attempting lookup.
> >>     Alloced OP (ffff81003d98a1f8: 121 OP_LOOKUP)
> >>     pvfs2: service_operation: pvfs2_lookup ffff81003d98a1f8
> >>     client-core: reading op tag 120 OP_LOOKUP
> >>     client-core: reading op tag 121 OP_LOOKUP
> >>     (get) Alloced OP (ffff81003df561b8:120)
> >>     (get) Alloced OP (ffff81003d98a1f8:121)
> >>     pvfs2: service_operation pvfs2_lookup returning: 0 for
> >>     ffff81003df561b8.
> >>     pvfs2_d_revalidate_common: lookup failure or no match.
> >>     Releasing OP (ffff81003df561b8: 120)
> >>     pvfs2_getattr: called on simul_dir_stat.0
> >>     pvfs2_inode_getattr: called on inode 1048471
> >>
> >> Something calls revalidate on the dentry.  The lookup returns
> >> successful from userspace.  The kernel sees that the handles are
> >> different:
> >>
> >>             if((new_op->downcall.status != 0) ||
> >>                     !match_handle(new_op-
> >> >downcall.resp.lookup.refn.handle, inode))
> >>             {
> >>                 gossip_debug(GOSSIP_DCACHE_DEBUG,
> >> "pvfs2_d_revalidate_common: lookup failure or no match.\n");
> >>                 op_release(new_op);
> >>                 return(0);
> >>             }
> >>
> >> But then immediatly issues a getattr for the old handle ID.  Anybody
> >> know how to fix or destroy the bad dentry?  (Looks at Murali...)
> >>
> >>                 -- Pete
> >>
> >> _______________________________________________
> >> Pvfs2-developers mailing list
> >> [email protected]
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >>
> > _______________________________________________
> > Pvfs2-developers mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >
>
>
Index: src/kernel/linux-2.6/dcache.c
===================================================================
RCS file: /projects/cvsroot/pvfs2-1/src/kernel/linux-2.6/dcache.c,v
retrieving revision 1.31
diff -u -r1.31 dcache.c
--- src/kernel/linux-2.6/dcache.c	20 Sep 2006 22:59:53 -0000	1.31
+++ src/kernel/linux-2.6/dcache.c	16 Aug 2007 09:33:24 -0000
@@ -35,8 +35,14 @@
         gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: parent not found.\n");
     }
     
+    if (inode == NULL) {
+        return 1;
+    }
     if (inode && parent_inode)
     {
+        if (is_bad_inode(inode)) {
+            return 0;
+        }
         /* first perform a lookup to make sure that the object not only
          * exists, but is still in the expected place in the name space 
          */
@@ -73,14 +79,30 @@
                 new_op, "pvfs2_lookup", 
                 get_interruptible_flag(parent_inode));
 
-            if((new_op->downcall.status != 0) || 
-                    !match_handle(new_op->downcall.resp.lookup.refn.handle, inode))
-            {
-                gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure or no match.\n");
+            if ((new_op->downcall.status != 0)) {
+                gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure.\n");
                 op_release(new_op);
-                return(0);
+                return 0;
+            }
+            /* no match */
+            if (!match_handle(new_op->downcall.resp.lookup.refn.handle, inode))
+            {
+                struct inode *res_inode;
+
+                /* Find or allocate a new inode for this new handle */
+                res_inode = pvfs2_iget(parent_inode->i_sb, &new_op->downcall.resp.lookup.refn);
+                if (res_inode != NULL) {
+                    /* associate with the given dentry */
+                    d_add(dentry, res_inode);
+                    gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: associated inode %llu)\n",
+                                 llu(get_handle_from_ino(res_inode)));
+                    inode = res_inode;
+                } else {
+                    gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure or no match.\n");
+                    op_release(new_op);
+                    return(0);
+                }
             }
-            
             op_release(new_op);
         }
         else
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Reply via email to