On Tue, 13 Jun 2000, Hans Reiser wrote:

> There are desired applications of reiserfs where the VFS inode is just too
> heavyweight.  I'd just like to say that this seems like a good concern you have
> here, and the ReiserFS team is completely willing to recode in 2.5.* to
> accommodate your radical proposal, or some as yet unproposed even better radical
> proposal if it comes along, because this is a real issue.  Perhaps the ultimate
> lightweight inode would simply mean treating the dcache as optional, and the FS
> determining whether to look there for it or sidestep it.

??? Inodes are not in the dcache. Dentries are, and without them lookups
will kill you, plus you'll have the dubious pleasure of reproducing all the
race-preventing code of the VFS in your replacement mechanism. Which is
pretty likely to bring sizeof(dentry) back, never mind the several years
of bug hunting (and
that's precisely the case when testing will not help - race scenarios in
that area are impossible to hit unless you've found them in the code and
are deliberately hunting for that particular race). But hey, if you want
to do that - go ahead, it's your ass, after all.

Making inodes droppable (i.e. letting a dentry get rid of its in-core inode
if we can rebuild it) actually makes sense, but that's a 2.5 project. We
obviously have to keep inodes that belong to dentries of opened files, and
we also need the inode of a directory to do any operation other than a
dcache lookup (here we'll need a change in ->i_sem/->i_zombie handling and
in the handling of ->lookup()). But if a dentry has no active references (i.e.
dentry->d_count == number of foo such that (foo->d_parent==dentry)) it
is, in principle, fair game for tossing the inode out. The problem is
that the semaphores we are using now sit in inodes, and we'll need to
decide what gets shifted into the dentry and what locking changes that
will require. I more or less know how to do that (read: no code exists right
now, but there are some blackboard variants), but I know damn well that it
is too heavy for 2.4. Sorry. It would mean at least a couple of months of
_heavy_ review/testing to get into beta stage. And doing that to code that
is involved in all filesystem-related syscalls at the current stage...
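
To make the "no active references" condition concrete, here is a minimal
user-space sketch. Everything in it is hypothetical: toy_dentry, toy_inode
and both helpers are stand-ins invented for illustration, not the kernel's
structures, and the sketch ignores locking entirely, which is the hard part
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (assumption): NOT the kernel's struct dentry / struct inode. */
struct toy_inode {
    unsigned long ino;
};

struct toy_dentry {
    int d_count;                  /* all references, including those held by children */
    struct toy_dentry *d_parent;  /* a root entry may point to itself */
    struct toy_inode *d_inode;    /* would become NULL once the inode is dropped */
};

/* Count the entries in all[] whose d_parent is 'dentry' (the "foo" above). */
static int toy_count_children(const struct toy_dentry *dentry,
                              struct toy_dentry **all, size_t n)
{
    int children = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (all[i] != dentry && all[i]->d_parent == dentry)
            children++;
    return children;
}

/*
 * The inode is a candidate for dropping when every reference to the
 * dentry is accounted for by its children, i.e. nobody else holds it.
 */
static int toy_inode_droppable(const struct toy_dentry *dentry,
                               struct toy_dentry **all, size_t n)
{
    return dentry->d_count == toy_count_children(dentry, all, n);
}
```

A directory with two cached children and d_count == 2 passes the check; one
extra reference (say, an open file) makes it fail.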

> For persons surprised that this is a real issue, let me just mention that there
> are persons desiring to put 30 million entry plus hypertext indexes with poor
> locality of reference into reiserfs as directories, and one issue is that the
> VFS inode costs too much RAM.  For these indexes to be effective one needs to
> use stem compression and other such techniques on them just to be able to
> prevent being killed by random I/Os to disk when the index is too big for RAM.

No surprise here, but if they have dedicated boxen they may want to trim
the unneeded lines from inode->u - it's a union and it has some pretty
large fields. If you don't use foofs, don't include its private inode
data in the thing. Not that radical, but it may actually cut
sizeof(struct inode) roughly in half (depending on the filesystem mix they
are using, of course).
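
A rough illustration of why trimming the union helps: a union is as large
as its largest member, so one bulky per-filesystem entry inflates every
inode. The structures and sizes below are invented for the example
(smallfs, bigfs and the toy_* names are hypothetical) and are not taken
from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-filesystem private inode data; sizes are made up. */
struct smallfs_inode_info { unsigned long block[4]; };
struct bigfs_inode_info   { unsigned long block[64]; };

/* Toy stand-in for the fields every filesystem shares. */
struct toy_common {
    unsigned long ino;
    unsigned long size;
    unsigned int mode;
};

/* Inode carrying the private data of every filesystem in the build. */
struct toy_inode_full {
    struct toy_common c;
    union {
        struct smallfs_inode_info smallfs_i;
        struct bigfs_inode_info   bigfs_i;
    } u;
};

/* Same inode with the unused bigfs member left out of the union. */
struct toy_inode_trimmed {
    struct toy_common c;
    union {
        struct smallfs_inode_info smallfs_i;
    } u;
};

/* Bytes saved per in-core inode by dropping the unused member. */
static size_t toy_bytes_saved(void)
{
    return sizeof(struct toy_inode_full) - sizeof(struct toy_inode_trimmed);
}
```

With 30 million entries, whatever toy_bytes_saved() returns is multiplied
30 million times, which is the point of the suggestion.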
