Sam,
>
> I think we try to keep the kmod simple so that it's only forwarding
> requests to the daemon. Caching fits better in the daemon if that's
> the case.
Sure, okay.
> I'm confused now. Why do we need a dentry cache timeout?
Only if we wish to take advantage of the kernel-provided dcache.
Right now, it is as if the timeout is 0, i.e., a hit in the dcache is
treated like a miss for all practical purposes, since we do a
full-blown revalidate.
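
To make the timeout idea concrete, here is a rough user-space sketch of
the check I have in mind. The struct, the function name, and the
millisecond units are made up for illustration and are not taken from
the pvfs2 kmod; in the kernel this would presumably be done with
jiffies instead.

```c
#include <stdint.h>

/* Hypothetical user-space model of a cached dentry; not PVFS2 code. */
struct cached_dentry {
    uint64_t validated_at_ms;  /* time of the last successful lookup */
};

/*
 * Returns 1 if the cached entry may still be trusted (what d_revalidate
 * would report as "valid"), or 0 if it has expired and must be looked
 * up again.
 */
static int dentry_still_valid(const struct cached_dentry *d,
                              uint64_t now_ms, uint64_t timeout_ms)
{
    /* A timeout of 0 turns every dcache hit into a miss, which is
     * effectively the current behaviour. */
    if (timeout_ms == 0)
        return 0;

    /* Unsigned subtraction: if the clock went backwards the difference
     * wraps to a huge value, so the entry is treated as expired. */
    return (now_ms - d->validated_at_ms) < timeout_ms ? 1 : 0;
}
```

With a non-zero timeout, only entries older than the timeout would pay
for the full lookup; everything else would be served from the dcache.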
thanks,
Murali
>
> >
> > Basically, we would have to check whether the dentry is still valid
> > based on the timeout, and if it has expired
> > we can then return 0.
> >
> >>
> >>> The downside is some strange errors, such as the one that Emmanuel
> >>> is seeing with the NFS server workload.
> >>
> >> I think that's probably something else. His error is specific to
> >> readdir being done in chunks of 26, not with dentry revalidate/
> >> lookups, right?
> >
> > It is quite possible that this is what is causing his error. I have
> > not tried it out yet.
> > Have you looked at it or shall I take a stab?
>
> I think this bug might be on the server, so if you want to look into
> it keep that in mind.
> -sam
>
>
> >
> > thanks,
> > Murali
> >
> >> -sam
> >>
> >>
> >>>
> >>> Thanks
> >>> Murali
> >>>
> >>>
> >>> On Dec 5, 2007 12:35 PM, Sam Lang <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>> Hi Murali,
> >>>>
> >>>> I'm trying to figure out a bug in pvfs_revalidate_common. My
> >>>> understanding is that the revalidate code tries to handle the cases
> >>>> where a dentry might not be valid by doing a PVFS lookup operation,
> >>>> and comparing the results with the handle specified in the
> >>>> inode. So
> >>>> except for a couple edge cases (the root dir), we have to (or
> >>>> should
> >>>> be doing) a PVFS lookup and inode/handle comparison when
> >>>> d_revalidate
> >>>> is called by the VFS. Is this an accurate view of what's going on?
> >>>>
> >>>> Also, while it might be slightly more optimal to do the PVFS lookup
> >>>> in
> >>>> the revalidate, since it requires a network operation, my guess is
> >>>> it's
> >>>> not much of one. Why not just return 0 from revalidate all the
> >>>> time
> >>>> (indicating an invalid dentry), in which case, the VFS destroys the
> >>>> dentry and creates a new one by doing the lookup itself. This
> >>>> leaves
> >>>> the d_revalidate code fairly simple, and it doesn't seem like we're
> >>>> able to optimize-out the expensive lookup for most (all?) dentries
> >>>> anyway...
> >>>>
> >>>> I've attached a patch that does more or less what I've just
> >>>> described. It seems to fix the errors I was seeing. You can
> >>>> reproduce them by running:
> >>>>
> >>>> nodea> touch f1
> >>>> nodeb> rm f1
> >>>> nodeb> touch f1
> >>>> nodea> ls -lrt f1
> >>>>
> >>>> The result is either an ENOENT error for the file, or in some
> >>>> cases, a
> >>>> bug in the kernel (I've included the trace below). Honestly, I
> >>>> didn't
> >>>> dig into this bug too much -- it looks like the new inode given to
> >>>> d_splice_alias is corrupted in some way -- but I think there
> >>>> might be
> >>>> a simpler fix, so I'll wait to hear from you on the above before
> >>>> digging further.
> >>>>
> >>>> Thanks,
> >>>> -sam
> >>>>
> >>>>
> >>>>
> >>>> Index: src/kernel/linux-2.6/dcache.c
> >>>> ===================================================================
> >>>> RCS file: /projects/cvsroot/pvfs2/src/kernel/linux-2.6/dcache.c,v
> >>>> retrieving revision 1.32
> >>>> diff -u -a -p -r1.32 dcache.c
> >>>> --- src/kernel/linux-2.6/dcache.c 9 Oct 2007 00:05:39
> >>>> -0000 1.32
> >>>> +++ src/kernel/linux-2.6/dcache.c 5 Dec 2007 20:20:08 -0000
> >>>> @@ -36,7 +36,7 @@ static int pvfs2_d_revalidate_common(str
> >>>> }
> >>>>
> >>>> if (inode == NULL) {
> >>>> - return 1;
> >>>> + return 0;
> >>>> }
> >>>> if (inode && parent_inode)
> >>>> {
> >>>> @@ -49,6 +49,7 @@ static int pvfs2_d_revalidate_common(str
> >>>> if (!is_root_handle(inode))
> >>>> {
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG,
> >>>> "pvfs2_d_revalidate_common: attempting lookup.\n");
> >>>> + return 0;
> >>>> new_op = op_alloc(PVFS2_VFS_OP_LOOKUP);
> >>>> if (!new_op)
> >>>> {
> >>>> @@ -110,6 +111,7 @@ static int pvfs2_d_revalidate_common(str
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG,
> >>>> "pvfs2_d_revalidate_common: root handle, lookup skipped.\n");
> >>>> }
> >>>>
> >>>> + return 0;
> >>>> /* now perform revalidation */
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG, " (inode %llu)\n",
> >>>> llu(get_handle_from_ino(inode)));
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> [154457.551332] ------------[ cut here ]------------
> >>>> [154457.560704] kernel BUG at fs/dcache.c:952!
> >>>> [154457.569044] invalid opcode: 0000 [1] SMP
> >>>> [154457.577277] CPU 3
> >>>> [154457.581498] Modules linked in: pvfs2 raid0 nfs lockd sunrpc ipv6
> >>>> sony_acpi pcc_acpi dev_acpi tc1100_wmi video sbs i2c_ec dock container
> >>>> button battery ac asus_acpi backlight xfs sbp2 lp af_packet
> >>>> snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm
> >>>> snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event
> >>>> snd_seq serio_raw snd_timer psmouse snd_seq_device ib_ipath ib_core
> >>>> pcspkr parport_pc parport k8temp snd soundcore snd_page_alloc
> >>>> i2c_nforce2 i2c_core shpchp pci_hotplug tsdev evdev ext3 jbd mbcache sg
> >>>> sd_mod ata_generic ohci1394 sata_nv amd74xx ohci_hcd libata ehci_hcd
> >>>> ieee1394 scsi_mod tg3 usbcore generic raid456 xor raid1 md_mod thermal
> >>>> processor fan fbcon tileblit font bitblit softcursor vesafb cfbcopyarea
> >>>> cfbimgblt cfbfillrect capability commoncap
> >>>> [154457.721315] Pid: 32045, comm: ls Not tainted 2.6.20-16-generic #2
> >>>> [154457.733629] RIP: 0010:[<ffffffff8023cd74>] [<ffffffff8023cd74>]
> >>>> d_instantiate+0x14/0x90
> >>>> [154457.749962] RSP: 0018:ffff8100740b3bb8 EFLAGS: 00010216
> >>>> [154457.760721] RAX: 0000000000008000 RBX: ffff81011f97ec28 RCX:
> >>>> 0000000000000036
> >>>> [154457.775115] RDX: 0000000000000003 RSI: ffff81011f97ec28 RDI:
> >>>> ffff8101216acb10
> >>>> [154457.789507] RBP: ffff8101216acb80 R08: ffff8100740b2000 R09:
> >>>> 0000000000000000
> >>>> [154457.803901] R10: 000000000000000a R11: 0000000000000202 R12:
> >>>> ffff8101216acb10
> >>>> [154457.818293] R13: ffff81011f9520f8 R14: ffff81007462a740 R15:
> >>>> 0000000000000000
> >>>> [154457.832688] FS: 00002aab82463b00(0000) GS:ffff810121e60740(0000)
> >>>> knlGS:0000000000000000
> >>>> [154457.848983] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>>> [154457.860607] CR2: 0000000000404131 CR3: 00000001229d6000 CR4:
> >>>> 00000000000006e0
> >>>> [154457.875002] Process ls (pid: 32045, threadinfo ffff8100740b2000,
> >>>> task ffff81007dea00c0)
> >>>> [154457.891124] Stack: ffff81011f97ec28 0000000000000000 ffff8101216acb10
> >>>> ffffffff8023215d
> >>>> [154457.907419] ffff8101216acb10 ffff81007462abe8 ffff81011f97ec28
> >>>> ffffffff8850f706
> >>>> [154457.922471] ffff8101216acb10 ffff8100740b3ca8 ffff8100740b3e48
> >>>> ffff8100740b3ca8
> >>>> [154457.937140] Call Trace:
> >>>>
> >>>> [154457.942578] [<ffffffff8023215d>] d_splice_alias+0x11d/0x140
> >>>> [154457.954037] [<ffffffff8850f706>] :pvfs2:pvfs2_d_revalidate+0x206/0x300
> >>>> [154457.967403] [<ffffffff8020ca58>] do_lookup+0x198/0x210
> >>>> [154457.977990] [<ffffffff802099cb>] __link_path_walk+0x90b/0xdc0
> >>>> [154457.989795] [<ffffffff8020e6db>] link_path_walk+0x5b/0xf0
> >>>> [154458.000900] [<ffffffff8020b37e>] touch_atime+0xde/0x130
> >>>> [154458.011657] [<ffffffff8020c770>] do_path_lookup+0x1b0/0x1e0
> >>>> [154458.023104] [<ffffffff802120d7>] getname+0x167/0x1d0
> >>>> [154458.033346] [<ffffffff802248fb>] __user_walk_fd+0x4b/0x80
> >>>> [154458.044454] [<ffffffff80241e5c>] vfs_lstat_fd+0x2c/0x70
> >>>> [154458.055220] [<ffffffff8020b37e>] touch_atime+0xde/0x130
> >>>> [154458.065973] [<ffffffff8022c027>] sys_newlstat+0x27/0x50
> >>>> [154458.076738] [<ffffffff8026806d>] error_exit+0x0/0x84
> >>>> [154458.086977] [<ffffffff8026111e>] system_call+0x7e/0x83
> >>>> [154458.097567]
> >>>> [154458.100708]
> >>>> [154458.100709] Code: 0f 0b eb fe 48 c7 c7 00 f2 55 80 e8 9c ac
> >>>> 02 00
> >>>> 48 85 db 74
> >>>> [154458.119148] RIP [<ffffffff8023cd74>] d_instantiate+0x14/0x90
> >>>> [154458.130808] RSP <ffff8100740b3bb8>
> >>>> [154458.138115]
> >>>>
> >>>> RCS file: /projects/cvsroot/pvfs2/src/kernel/linux-2.6/dcache.c,v
> >>>> retrieving revision 1.32
> >>>> diff -u -a -p -r1.32 dcache.c
> >>>> --- src/kernel/linux-2.6/dcache.c 9 Oct 2007 00:05:39
> >>>> -0000 1.32
> >>>> +++ src/kernel/linux-2.6/dcache.c 5 Dec 2007 19:59:26 -0000
> >>>> @@ -49,6 +49,7 @@ static int pvfs2_d_revalidate_common(str
> >>>> if (!is_root_handle(inode))
> >>>> {
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG,
> >>>> "pvfs2_d_revalidate_common: attempting lookup.\n");
> >>>> + return 0;
> >>>> new_op = op_alloc(PVFS2_VFS_OP_LOOKUP);
> >>>> if (!new_op)
> >>>> {
> >>>> @@ -110,6 +111,7 @@ static int pvfs2_d_revalidate_common(str
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG,
> >>>> "pvfs2_d_revalidate_common: root handle, lookup skipped.\n");
> >>>> }
> >>>>
> >>>> + return 0;
> >>>> /* now perform revalidation */
> >>>> gossip_debug(GOSSIP_DCACHE_DEBUG, " (inode %llu)\n",
> >>>> llu(get_handle_from_ino(inode)));
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers