Hi Al,
Ion and I worked on updating our stackable templates for 2.3.99-pre1 for the
past few days. We have found various oddities and other possible problems.
We promised to report on anything interesting we find wrt the VFS, so here
it is. We are willing to test and submit patches for anything below that
you think is worth it.
(1) Asymmetry b/t double_lock and double_unlock:
Only double_unlock does dput() on the two dentries. The only place where
double_lock is called is in do_rename, and do_rename already calls
get_parent() which increments the reference counts. We can simplify the
code and make it symmetric by moving the two get_parent() calls into
double_lock().
(2) vfs_readlink:
It would be nice if all vfs_<op> were essentially wrappers that did some
checking and then called the file system specific method. This isn't the
case for vfs_readlink. (BTW, we like the vfs_* routines very much!)
(3) "__" routines:
In fs/namei.c, vfs_follow_link simply calls __vfs_follow_link with the same,
unchanged args. Can't we simplify and get rid of the __vfs_follow_link
routine? Then at least in page_follow_link, it should call the
vfs_follow_link directly.
(4) permission:
fs/namei.c:permission() should probably be renamed to vfs_permission, b/c
it is a generic VFS routine (and we make direct use of it in lofs).
BTW, with stacking, "permission" gets called O(n^2) times in total. I'm not
sure there's anything that can be done about it now, but it's something to
keep in mind. Here's the call sequence recursive scenario when we have lofs
mounted on, say, ext2 (just one stack level):
vfs_create
may_create
permission
lofs_permission
permission
ext2_permission
lofs_create
vfs_create
may_create
permission
ext2_permission
This happens b/c we use the nicer/newer vfs_<op> routines. However, since
permission() is also called from places other than vfs_<op> routines, we
must define permission in lofs, and thus it gets called recursively.
We thought we could solve the problem by not defining our own permission
method, b/c the real routines (mkdir, create, etc) will call permission on
the lower f/s via the vfs_<op> routines, but we couldn't do it b/c
permission() is called explicitly in open_namei().
One possible solution is creating a vfs_open() routine which will do most of
the checks in filp_open, including permission(), but will take a dentry and
not a filename. Then filp_open can call vfs_open, and so could we; right
now we have to duplicate most of the filp_open code in our ->open function.
This would also nicely solve the recursive permission problem, as well a
cleanup filp_open().
(5) llseek:
In fs/read_write.c, llseek should probably be renamed vfs_llseek, and the
un/lock_kernel that it calls should be moved to sys_llseek. Then vfs_llseek
should be exported so we can use it.
(6) vfs_readdir:
vfs_readdir doesn't take the same prototype list as <foofs>_readdir, which
can be *very* confusing since all other vfs_<op> use the same prototype. I
suggest you make the two the same: swap the "dirent" and "filldir" args in
vfs_readdir() so they're the same everywhere. We've had some amusing (read:
nasty :) kernel panics b/c of that.
Cheers,
Erez & Ion.