On Dec 7, 2007, at 1:04 PM, Murali Vilayannur wrote:

Hi Sam,

I can test it on older kernels. :-)

Okay.. sounds good!

We do this with the ncache in the client daemon.  Sure, it still
requires invalidating an entry and doing the lookup through the VFS to
the client daemon, but that seems tiny by comparison to the network
roundtrip.

Right.. But we don't do it in the kmod for some reason I can't remember.

I think we try to keep the kmod simple so that it's only forwarding requests to the daemon. Caching fits better in the daemon if that's the case.


If we are going to go with the dentry cache timeout, then your patch will need
some modifications..

I'm confused now.  Why do we need a dentry cache timeout?


Basically, we would have to check whether the dentry is still valid based on
the timeout, and if it has expired
we can then return 0..
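
A minimal sketch of the timeout check described above, written as plain userspace C rather than kernel code. The names `ttl_dentry`, `TTL_DCACHE_TIMEOUT` and `ttl_d_revalidate` are hypothetical and do not come from the PVFS2 tree; the tick counter stands in for whatever clock the kmod would use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct ttl_dentry {
    unsigned long validated_at;   /* tick when the entry was last confirmed */
};

/* Hypothetical tunable: number of ticks a confirmed entry stays trusted. */
#define TTL_DCACHE_TIMEOUT 5UL

/*
 * Returns 1 while the entry is inside its timeout window (keep the dentry)
 * and 0 once it has expired (ask the VFS to drop it and look it up again).
 */
static int ttl_d_revalidate(const struct ttl_dentry *d, unsigned long now)
{
    assert(d != NULL);
    /* Unsigned subtraction stays correct if the tick counter wraps. */
    return (now - d->validated_at) < TTL_DCACHE_TIMEOUT;
}
```

With this shape, only expired entries pay for a fresh lookup; the cost is that a stale entry can be served for up to one timeout period.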


The downside is some strange errors such as the one that Emmanuel
is seeing with the NFS server workload..

I think that's probably something else.  His error is specific to
readdir being done in chunks of 26, not with dentry revalidate/
lookups, right?

It is quite possible that this is what is causing his error. I did not
try it out yet..
Have you looked at it or shall I take a stab?

I think this bug might be on the server, so if you want to look into it keep that in mind.
-sam


thanks,
Murali

-sam



Thanks
Murali


On Dec 5, 2007 12:35 PM, Sam Lang <[EMAIL PROTECTED]> wrote:

Hi Murali,

I'm trying to figure out a bug in pvfs_revalidate_common.  My
understanding is that the revalidate code tries to handle the cases
where a dentry might not be valid by doing a PVFS lookup operation,
and comparing the results with the handle specified in the inode. So
except for a couple of edge cases (the root dir), we have to (or should
be doing) a PVFS lookup and inode/handle comparison when d_revalidate
is called by the VFS.  Is this an accurate view of what's going on?
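
To make that description concrete, here is a rough userspace sketch of the lookup-and-compare logic. Everything prefixed `hc_` is hypothetical: the table stands in for the server's namespace, and `hc_lookup` stands in for the PVFS lookup operation; none of it is the actual pvfs2_d_revalidate_common code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long long hc_handle;

/* Hypothetical stand-ins for the inode and dentry. */
struct hc_inode  { hc_handle handle; };
struct hc_dentry { const char *name; struct hc_inode *inode; int is_root; };

/* Hypothetical server-side namespace: name -> handle. */
struct hc_entry  { const char *name; hc_handle handle; };

/* Stand-in for the PVFS lookup: 0 on success, -1 if the name is gone. */
static int hc_lookup(const struct hc_entry *tbl, size_t n,
                     const char *name, hc_handle *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) == 0) {
            *out = tbl[i].handle;
            return 0;
        }
    }
    return -1;
}

/* Returns 1 if the cached dentry still matches the server, 0 otherwise. */
static int hc_revalidate(const struct hc_entry *tbl, size_t n,
                         const struct hc_dentry *d)
{
    hc_handle h;

    assert(d != NULL);
    if (d->inode == NULL)
        return 0;                 /* assumption: no inode means invalid */
    if (d->is_root)
        return 1;                 /* edge case: root dir, lookup skipped */
    if (hc_lookup(tbl, n, d->name, &h) != 0)
        return 0;                 /* name no longer exists */
    return h == d->inode->handle; /* same name, but is it the same object? */
}
```

The handle comparison is what catches a file that was removed and recreated under the same name by another client: the name still resolves, but to a different handle.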

Also, while it might be slightly more optimal to do the PVFS lookup in
the revalidate, since it requires a network operation, my guess is it's
not much of one. Why not just return 0 from revalidate all the time
(indicating an invalid dentry), in which case, the VFS destroys the
dentry and creates a new one by doing the lookup itself. This leaves
the d_revalidate code fairly simple, and it doesn't seem like we're
able to optimize-out the expensive lookup for most (all?) dentries
anyway...
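
As a rough illustration of the flow proposed above, the following userspace sketch models a path-walk step where revalidate always reports the dentry as invalid, so every resolution falls through to a fresh lookup. The `walk_` names are hypothetical and the model is heavily simplified; it is not the VFS's do_lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cached entry: the handle we last saw for a name. */
struct walk_dentry { unsigned long long handle; int cached; };

static int walk_lookups;   /* number of simulated lookups issued */

/* The proposal: always tell the caller the dentry is invalid. */
static int walk_revalidate(const struct walk_dentry *d)
{
    (void)d;
    return 0;
}

/* Stand-in for the real lookup; server_handle is what the server holds. */
static unsigned long long walk_fs_lookup(unsigned long long server_handle)
{
    walk_lookups++;
    return server_handle;
}

/* Resolve a name: trust the cache only if revalidate says it is valid. */
static unsigned long long walk_resolve(struct walk_dentry *d,
                                       unsigned long long server_handle)
{
    assert(d != NULL);
    if (d->cached && walk_revalidate(d))
        return d->handle;              /* never taken with this revalidate */
    d->handle = walk_fs_lookup(server_handle);  /* drop and look up again */
    d->cached = 1;
    return d->handle;
}
```

Each resolution costs one lookup, which is the same number the compare-in-revalidate approach issues for non-root entries, while the revalidate function itself shrinks to a single return.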

I've attached a patch that does more or less what I've just
described.  It seems to fix the errors I was seeing.  You can
reproduce them by doing:

nodea> touch f1
nodeb> rm f1
nodeb> touch f1
nodea> ls -lrt f1

The result is either an ENOENT error for the file, or in some cases, a
bug in the kernel (I've included the trace below).  Honestly, I didn't
dig into this bug too much -- it looks like the new inode given to
d_splice_alias is corrupted in some way -- but I think there might be
a simpler fix, so I'll wait to hear from you on the above before
digging further.

Thanks,
-sam



Index: src/kernel/linux-2.6/dcache.c
===================================================================
RCS file: /projects/cvsroot/pvfs2/src/kernel/linux-2.6/dcache.c,v
retrieving revision 1.32
diff -u -a -p -r1.32 dcache.c
--- src/kernel/linux-2.6/dcache.c       9 Oct 2007 00:05:39 -0000       1.32
+++ src/kernel/linux-2.6/dcache.c       5 Dec 2007 20:20:08 -0000
@@ -36,7 +36,7 @@ static int pvfs2_d_revalidate_common(str
    }

    if (inode == NULL) {
-        return 1;
+        return 0;
    }
    if (inode && parent_inode)
    {
@@ -49,6 +49,7 @@ static int pvfs2_d_revalidate_common(str
        if (!is_root_handle(inode))
        {
            gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: attempting lookup.\n");
+            return 0;
            new_op = op_alloc(PVFS2_VFS_OP_LOOKUP);
            if (!new_op)
            {
@@ -110,6 +111,7 @@ static int pvfs2_d_revalidate_common(str
            gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: root handle, lookup skipped.\n");
        }

+        return 0;
        /* now perform revalidation */
        gossip_debug(GOSSIP_DCACHE_DEBUG, " (inode %llu)\n",
                    llu(get_handle_from_ino(inode)));




[154457.551332] ------------[ cut here ]------------
[154457.560704] kernel BUG at fs/dcache.c:952!
[154457.569044] invalid opcode: 0000 [1] SMP
[154457.577277] CPU 3
[154457.581498] Modules linked in: pvfs2 raid0 nfs lockd sunrpc ipv6 sony_acpi pcc_acpi dev_acpi tc1100_wmi video sbs i2c_ec dock container button battery ac asus_acpi backlight xfs sbp2 lp af_packet snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq serio_raw snd_timer psmouse snd_seq_device ib_ipath ib_core pcspkr parport_pc parport k8temp snd soundcore snd_page_alloc i2c_nforce2 i2c_core shpchp pci_hotplug tsdev evdev ext3 jbd mbcache sg sd_mod ata_generic ohci1394 sata_nv amd74xx ohci_hcd libata ehci_hcd ieee1394 scsi_mod tg3 usbcore generic raid456 xor raid1 md_mod thermal processor fan fbcon tileblit font bitblit softcursor vesafb cfbcopyarea cfbimgblt cfbfillrect capability commoncap
[154457.721315] Pid: 32045, comm: ls Not tainted 2.6.20-16-generic #2
[154457.733629] RIP: 0010:[<ffffffff8023cd74>] [<ffffffff8023cd74>] d_instantiate+0x14/0x90
[154457.749962] RSP: 0018:ffff8100740b3bb8  EFLAGS: 00010216
[154457.760721] RAX: 0000000000008000 RBX: ffff81011f97ec28 RCX: 0000000000000036
[154457.775115] RDX: 0000000000000003 RSI: ffff81011f97ec28 RDI: ffff8101216acb10
[154457.789507] RBP: ffff8101216acb80 R08: ffff8100740b2000 R09: 0000000000000000
[154457.803901] R10: 000000000000000a R11: 0000000000000202 R12: ffff8101216acb10
[154457.818293] R13: ffff81011f9520f8 R14: ffff81007462a740 R15: 0000000000000000
[154457.832688] FS: 00002aab82463b00(0000) GS:ffff810121e60740(0000) knlGS:0000000000000000
[154457.848983] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[154457.860607] CR2: 0000000000404131 CR3: 00000001229d6000 CR4: 00000000000006e0
[154457.875002] Process ls (pid: 32045, threadinfo ffff8100740b2000, task ffff81007dea00c0)
[154457.891124] Stack:  ffff81011f97ec28 0000000000000000 ffff8101216acb10 ffffffff8023215d
[154457.907419]  ffff8101216acb10 ffff81007462abe8 ffff81011f97ec28 ffffffff8850f706
[154457.922471]  ffff8101216acb10 ffff8100740b3ca8 ffff8100740b3e48 ffff8100740b3ca8
[154457.937140] Call Trace:

[154457.942578]  [<ffffffff8023215d>] d_splice_alias+0x11d/0x140
[154457.954037]  [<ffffffff8850f706>] :pvfs2:pvfs2_d_revalidate+0x206/0x300
[154457.967403]  [<ffffffff8020ca58>] do_lookup+0x198/0x210
[154457.977990]  [<ffffffff802099cb>] __link_path_walk+0x90b/0xdc0
[154457.989795]  [<ffffffff8020e6db>] link_path_walk+0x5b/0xf0
[154458.000900]  [<ffffffff8020b37e>] touch_atime+0xde/0x130
[154458.011657]  [<ffffffff8020c770>] do_path_lookup+0x1b0/0x1e0
[154458.023104]  [<ffffffff802120d7>] getname+0x167/0x1d0
[154458.033346]  [<ffffffff802248fb>] __user_walk_fd+0x4b/0x80
[154458.044454]  [<ffffffff80241e5c>] vfs_lstat_fd+0x2c/0x70
[154458.055220]  [<ffffffff8020b37e>] touch_atime+0xde/0x130
[154458.065973]  [<ffffffff8022c027>] sys_newlstat+0x27/0x50
[154458.076738]  [<ffffffff8026806d>] error_exit+0x0/0x84
[154458.086977]  [<ffffffff8026111e>] system_call+0x7e/0x83
[154458.097567]
[154458.100708]
[154458.100709] Code: 0f 0b eb fe 48 c7 c7 00 f2 55 80 e8 9c ac 02 00 48 85 db 74
[154458.119148] RIP  [<ffffffff8023cd74>] d_instantiate+0x14/0x90
[154458.130808]  RSP <ffff8100740b3bb8>
[154458.138115]

RCS file: /projects/cvsroot/pvfs2/src/kernel/linux-2.6/dcache.c,v
retrieving revision 1.32
diff -u -a -p -r1.32 dcache.c
--- src/kernel/linux-2.6/dcache.c       9 Oct 2007 00:05:39 -0000       1.32
+++ src/kernel/linux-2.6/dcache.c       5 Dec 2007 19:59:26 -0000
@@ -49,6 +49,7 @@ static int pvfs2_d_revalidate_common(str
        if (!is_root_handle(inode))
        {
            gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: attempting lookup.\n");
+            return 0;
            new_op = op_alloc(PVFS2_VFS_OP_LOOKUP);
            if (!new_op)
            {
@@ -110,6 +111,7 @@ static int pvfs2_d_revalidate_common(str
            gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: root handle, lookup skipped.\n");
        }

+        return 0;
        /* now perform revalidation */
        gossip_debug(GOSSIP_DCACHE_DEBUG, " (inode %llu)\n",
                    llu(get_handle_from_ino(inode)));







