Hi Pete,

Interesting that you mentioned that this worked prior to June. I don't
recall any check-ins to this part of the code at all... oh well.
I wonder why the parent directory entry (i.e. the directory being
opened) did not have a refcount > 0. Perhaps we or the VFS missed a
dget() in some path prior to the open?
I think we may never have tested this path or run into this issue before :(
Is the memory on the system flaky or something? Unlikely, but still :)

As you have rightly deduced, we don't need the calls to
dcache_dir_open() on directory opens.
They would have been needed had we made use of the libfs stuff. If you
look at fs/libfs.c on 2.6, open() of a directory creates a new child
dentry and saves it in ->private_data. Close does a dput() on it, while
readdir() and lseek() use it as the cursor to walk through the
directory entries. (libfs is a librification of the most commonly used
operations, so that in-memory file systems like ramfs/relayfs do not
need to reinvent them.)
Since pvfs2 has its own readdir/lseek implementations, this cursor
stuff is not needed, even on 2.6 kernels.
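
To make the cursor idea concrete, here is a rough userspace model of
the open/close pairing described above. It is only a sketch: the
struct layouts and every model_* name are made up for illustration and
are not the kernel's definitions or the actual fs/libfs.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins, NOT the real kernel definitions. */
struct dentry {
    int d_count;               /* reference count */
    struct dentry *d_parent;   /* a child holds a reference on its parent */
};

struct file {
    struct dentry *f_dentry;   /* the directory being opened */
    void *private_data;        /* libfs-style cursor lives here */
};

/* Model of d_alloc(): the new child takes a reference on the parent. */
static struct dentry *model_d_alloc(struct dentry *parent)
{
    struct dentry *child = calloc(1, sizeof(*child));
    if (child == NULL)
        return NULL;
    child->d_count = 1;
    child->d_parent = parent;
    parent->d_count++;
    return child;
}

/* Model of dput(): drop one reference; on the last one, free the
 * dentry and give back the reference it held on its parent. */
static void model_dput(struct dentry *d)
{
    struct dentry *parent;

    assert(d->d_count > 0);
    if (--d->d_count > 0)
        return;
    parent = d->d_parent;
    free(d);
    if (parent != NULL)
        parent->d_count--;     /* model only: the parent is never freed here */
}

/* open() of a directory: allocate the cursor, stash it in private_data. */
static int model_dir_open(struct file *file)
{
    file->private_data = model_d_alloc(file->f_dentry);
    return file->private_data ? 0 : -ENOMEM;
}

/* close(): drop the cursor again. */
static int model_dir_close(struct file *file)
{
    model_dput(file->private_data);
    file->private_data = NULL;
    return 0;
}
```

A file system that walks its directories with its own readdir/lseek
never looks at private_data, so in this model the open/close pair
would do nothing but take and drop a reference on the parent.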

Can you retest with that stuff yanked out if you get time?
thanks,
Murali

[7]kdb> bt
Stack traceback for pid 12721
0xe00001b00b190000    12721    12719  1    7   R  0xe00001b00b1905a0
*bash
0xe0000000044ffb90 __out_of_line_bug+0x70
        args (0x103, 0xe0000000045c59f0, 0x40b)
        kernel .text 0xe000000004400000 0xe0000000044ffb20
0xe0000000044ffbc0
0xe0000000045c59f0 d_alloc+0x1f0
        args (0xe00001b00b310b80, 0xe00000000503c2ac,
0xe00000300d632380, 0x0, 0xe00000000503c2a8)
        kernel .text 0xe000000004400000 0xe0000000045c5800
0xe0000000045c5b80
0xe0000000045b9f40 dcache_dir_open+0x40
        args (0xe00001300e858400, 0xe000033007a53c10,
0xa000000009d8f350, 0x38a)
        kernel .text 0xe000000004400000 0xe0000000045b9f00
0xe0000000045b9f80
0xa000000009d8f350 [pvfs2]pvfs2_file_open+0x2b0
        args (0xe00001300e858400, 0xe000033007a53b80,
0xa000000009dac7e8, 0xe00000000458af60, 0xa000000009dad7ec)
        pvfs2 .text 0xa000000009d840c0 0xa000000009d8f0a0
0xa000000009d8f3c0
0xe00000000458b180 dentry_open+0x240
        args (0xe00001b00b310b80, 0xe00001b07ba5ef80,
0xe000033007a53c18, 0xe0000000051ff200, 0xe000033007a53b80)
        kernel .text 0xe000000004400000 0xe00000000458af40
0xe00000000458b3c0
0xe00000000458af20 filp_open+0xc0
        args (0xe00000300d1c2000, 0x18800, 0x40000000000e86b0,
0xe00000000458b910, 0x792)
        kernel .text 0xe000000004400000 0xe00000000458ae60
0xe00000000458af40
0xe00000000458b910 sys_open+0xd0
        args (0x60000000000522a0, 0x10800, 0x40000000000e86b0,
0xc000000000000b19, 0x60000000000522c0)
        kernel .text 0xe000000004400000 0xe00000000458b840
0xe00000000458bb40
0xe00000000440e300 ia64_ret_from_syscall
        args (0x60000000000522a0, 0x60000000000522ab,
0x6000000000011308, 0x60000000000522a0, 0x40000000000e7d50)
        kernel .text 0xe000000004400000 0xe00000000440e300
0xe00000000440e320

That hits this code in pvfs2_file_open:

    if (S_ISDIR(inode->i_mode))
    {
        ret = dcache_dir_open(inode, file);
    }
    else
    {
        ...

which then dies in dget() as called by d_alloc() because the refcnt
on file->f_dentry is zero.
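
For anyone following along, the failure mode can be modeled in
userspace roughly like this. It assumes the check in dget() amounts to
"complain if the count is already zero, otherwise increment", which is
what the trace suggests; model_dget and the one-field struct are
hypothetical stand-ins, and where the kernel would BUG the model just
returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in, NOT the real struct dentry. */
struct dentry {
    int d_count;   /* reference count */
};

/* Take a reference only if somebody already holds one. A count of
 * zero means the dentry may be on its way out, so taking a new
 * reference is refused (the kernel would hit BUG at this point). */
static bool model_dget(struct dentry *d)
{
    assert(d != NULL);         /* the model requires a real dentry */
    if (d->d_count == 0)
        return false;
    d->d_count++;
    return true;
}
```

In this model a directory dentry with a count of 1 can be referenced
again, while one with a count of 0 is refused, which matches d_alloc()
failing on file->f_dentry here.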

I'm not particularly motivated to work on fixing this problem for
such an ancient kernel, that further has patches from SGI on top of
it.  But, pvfs kernel mount used to work on this kernel with the
June 2006 CVS.  I looked through the diff from then, but didn't see
anything obvious in this area.

Nosed around a bit and found that the only other caller of
dcache_dir_open() in the tree is autofs4, which uses it directly in
its _dir_operations struct when it instantiates a directory inode,
not like it is done in pvfs.

In the 2.6 kernel, the call is there, and there are now two in-tree
callers: fs/autofs4, and some virtual FS for cell.

I'm a bit confused as to why we need to do anything in the directory
open path; i.e. why is there even a function, and why does that
function call dcache_dir_open()?

If nobody is particularly excited about debugging this, no big deal.
Troy's not too thrilled about crashing the Altix anymore, and maybe
we can pressure the admins to switch to a 2.6 kernel.

                -- Pete
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
