Sam, Pete,
Would you guys be interested in trying out the attached patch?
Note: It might crash your machine :(
I think this should fix the problem.. let me know if it doesn't
thanks,
Murali
On 8/15/07, Sam Lang <[EMAIL PROTECTED]> wrote:
>
> On Aug 15, 2007, at 7:00 PM, Murali Vilayannur wrote:
>
> > Oops: Forget to do a reply-to-all..
> >
> > Pete,
> > Weird..
> > d_revalidate is the FS callback to tell the VFS that the dentry is no
> > longer valid
> > and in this case it looks like it rightfully returns 0 indicating that
> > the entry is longer to be trusted.
> >
> > int status = dentry->d_op->d_revalidate(dentry, nd);
> > if (unlikely(status <= 0)) {
> > /*
> > * The dentry failed validation.
> > * If d_revalidate returned 0 attempt to invalidate
> > * the dentry otherwise d_revalidate is asking us
> > * to return a fail status.
> > */
> > if (!status) {
> > if (!d_invalidate(dentry)) { <-- will try to
> > invalidate the dentry
> > dput(dentry); <-- drop the reference
> > dentry = NULL; <-- set it to NULL
> > }
> > } else {
> > dput(dentry);
> > dentry = ERR_PTR(status);
> > }
> > return dentry <-- gets returned.
> >
> > This percolates upwards
> >
> > dentry = cached_lookup(base, name, nd);
> > if (!dentry) {
> > struct dentry *new = d_alloc(base, name); <-- would
> > have allocated a new dentry
> > dentry = ERR_PTR(-ENOMEM);
> > if (!new)
> > goto out;
> > dentry = inode->i_op->lookup(inode, new, nd); <--
> > callsinto our lookup
> > if (!dentry)
> > dentry = new;
> > else
> > dput(new);
> > }
> >
> > I am confused on why we are doing a getattr on the stale handle unless
> > the dentry was somehow not being invalidated i.e. d_invalidate()
> > failed perhaps?
> > In any case d_drop() is the API to drop the dentry which is what
> > d_invalidate() should have invoked..
> > btw, Were there are any files under the "foo" directory?
> >
> > Can you bump up the loglevels (include lookup GOSSIP_NAME_DEBUG,
> > getattr GOSSIP_GETATTR_DEBUG) and send it to me?
> > I will run this workload tonight and see if I can repro this..
> > This needs more than 1 machine, right?
>
> Yep. When Kevin discovered this problem, I think it was with two
> client nodes and four MPI processes. I think Pete was using the same
> setup. This shouldn't need more than one metadata server to reproduce.
>
> -sam
>
>
> > thanks,
> > Murali
> >
> > On 8/15/07, Pete Wyckoff <[EMAIL PROTECTED]> wrote:
> >> Sam and I have been tracking down a pvfs bug when using the VFS
> >> interface. Kevin discovered it.
> >>
> >> The code is test #7 in simul. It runs in parallel, four tasks,
> >> tasks 0 and 1 on node1 and tasks 2 and 3 on node2. It does:
> >>
> >> if (task == 0)
> >> mkdir("foo");
> >>
> >> MPI_Barrier();
> >> sleep(3);
> >>
> >> stat("foo");
> >>
> >> if (task == 0)
> >> rmdir("foo");
> >>
> >> On a freshly initialized pvfs (1 server for both md + io), it works.
> >> Task 0 creates the directory, and all four tasks stat it
> >> successfully. When the process exits, the directory is indeed gone.
> >>
> >> The second time you run it, tasks 2 and 3 (on node2) get -ENOENT
> >> from the stat, but tasks 0 and 1 work fine as before and the
> >> directory was indeed created properly.
> >>
> >> Looking down a bit further, the server sees lookup requests from
> >> tasks 2 and 3, and returns the proper handle Id. Then it sees
> >> getattr requests from tasks 2 and 3 for the handle ID that the
> >> directory had on the first run, not the handle ID for this run.
> >>
> >> We may have traced this to the kernel module. Some of the log looks
> >> like this:
> >>
> >> pvfs2_d_revalidate_common: called on dentry ffff81003df86970.
> >> pvfs2_d_revalidate_common: parent found.
> >> pvfs2_d_revalidate_common: attempting lookup.
> >> Alloced OP (ffff81003d98a1f8: 121 OP_LOOKUP)
> >> pvfs2: service_operation: pvfs2_lookup ffff81003d98a1f8
> >> client-core: reading op tag 120 OP_LOOKUP
> >> client-core: reading op tag 121 OP_LOOKUP
> >> (get) Alloced OP (ffff81003df561b8:120)
> >> (get) Alloced OP (ffff81003d98a1f8:121)
> >> pvfs2: service_operation pvfs2_lookup returning: 0 for
> >> ffff81003df561b8.
> >> pvfs2_d_revalidate_common: lookup failure or no match.
> >> Releasing OP (ffff81003df561b8: 120)
> >> pvfs2_getattr: called on simul_dir_stat.0
> >> pvfs2_inode_getattr: called on inode 1048471
> >>
> >> Something calls revalidate on the dentry. The lookup returns
> >> successful from userspace. The kernel sees that the handles are
> >> different:
> >>
> >> if((new_op->downcall.status != 0) ||
> >> !match_handle(new_op-
> >> >downcall.resp.lookup.refn.handle, inode))
> >> {
> >> gossip_debug(GOSSIP_DCACHE_DEBUG,
> >> "pvfs2_d_revalidate_common: lookup failure or no match.\n");
> >> op_release(new_op);
> >> return(0);
> >> }
> >>
> >> But then immediatly issues a getattr for the old handle ID. Anybody
> >> know how to fix or destroy the bad dentry? (Looks at Murali...)
> >>
> >> -- Pete
> >>
> >> _______________________________________________
> >> Pvfs2-developers mailing list
> >> [email protected]
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >>
> > _______________________________________________
> > Pvfs2-developers mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >
>
>
Index: src/kernel/linux-2.6/dcache.c
===================================================================
RCS file: /projects/cvsroot/pvfs2-1/src/kernel/linux-2.6/dcache.c,v
retrieving revision 1.31
diff -u -r1.31 dcache.c
--- src/kernel/linux-2.6/dcache.c 20 Sep 2006 22:59:53 -0000 1.31
+++ src/kernel/linux-2.6/dcache.c 16 Aug 2007 09:33:24 -0000
@@ -35,8 +35,14 @@
gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: parent not found.\n");
}
+ if (inode == NULL) {
+ return 1;
+ }
if (inode && parent_inode)
{
+ if (is_bad_inode(inode)) {
+ return 0;
+ }
/* first perform a lookup to make sure that the object not only
* exists, but is still in the expected place in the name space
*/
@@ -73,14 +79,30 @@
new_op, "pvfs2_lookup",
get_interruptible_flag(parent_inode));
- if((new_op->downcall.status != 0) ||
- !match_handle(new_op->downcall.resp.lookup.refn.handle, inode))
- {
- gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure or no match.\n");
+ if ((new_op->downcall.status != 0)) {
+ gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure.\n");
op_release(new_op);
- return(0);
+ return 0;
+ }
+ /* no match */
+ if (!match_handle(new_op->downcall.resp.lookup.refn.handle, inode))
+ {
+ struct inode *res_inode;
+
+ /* Find or allocate a new inode for this new handle */
+ res_inode = pvfs2_iget(parent_inode->i_sb, &new_op->downcall.resp.lookup.refn);
+ if (res_inode != NULL) {
+ /* associate with the given dentry */
+ d_add(dentry, res_inode);
+ gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: associated inode %llu)\n",
+ llu(get_handle_from_ino(res_inode)));
+ inode = res_inode;
+ } else {
+ gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure or no match.\n");
+ op_release(new_op);
+ return(0);
+ }
}
-
op_release(new_op);
}
else
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers