On Thu, Aug 25, 2005 at 03:09:21PM -0700, Matthew Dillon wrote:
> The entire directory tree does not need to be in memory, only the
> pieces that lead to (cached) vnodes. DragonFly's namecache subsystem
> is able to guarantee this.

*How* can it guarantee that without first reading the whole directory
tree into memory? Unix filesystems have no way to determine which
directories an inode is linked from. If /dir1/link1 and /dir2/dir3/link2
are hardlinks to the same inode, you can't correctly update the FSMID
for dir2 without having read dir3 first, simply because no name cache
entry exists.

> :On a running system, it is enough to either get notification when a
> :certain vnode changed (kqueue model) or when a vnode changed (imon /
> :dnotify model). Trying to detect in-flight changes is *not* utterly
> :trivial for any model, since even accurate atime is already difficult to
> :achieve for mmapped files. Believing that you can *reliably* back up a
> :system based on VOP transactions alone is therefore a dream.
>
> This is not correct. It is certainly NOT enough to just be told
> when an inode changes.... you need to know where in the namespace
> the change occurred and you need to know how the change(s) affect
> the namespace. Just knowing that a file with inode BLAH has been
> modified is not nearly enough information.

The point is that the application can determine which inodes it is
interested in and reread e.g. a directory when it has changed. There are
some edge cases which might be hard to handle without additional
information (e.g. when a link is moved outside the currently supervised
area and you want to continue its supervision). That's an entirely
different question though.

> Detecting in-flight changes is trivial. You check the FSMID before
> descending into a directory or file, and you check it after you ascend
> back out of it. If it has changed, you know that something changed
> while you were processing the directory or file and you simply re-recurse
> down and rescan just the bits that now have different FSMIDs.

But it is also very limited because it doesn't allow any filtering on
what is interesting. In the worst case you just update all the FSMIDs
for nothing.
It also means that, as long as there is no way to store them
persistently, you can't free namecache entries without having to deal
with exactly those cases in applications. Storing them persistently has
to deal with unrecorded changes, which wouldn't be detected. Just think
about dual-booting to FreeBSD.

> For example, softupdates right now is not able to guarantee data
> consistency. If you crash while writing something out then on reboot
> you can wind up with some data blocks full of zeros, or full of old
> data, while other data blocks contain new data.

That's not so much a problem of softupdates, but of any filesystem
without very strong data journaling. ZFS is said to do something in that
area, but it can't really solve interactions which cross filesystems.
The very same problem exists for FSMIDs. This is something where a
transactional database and a normal filesystem differ: filesystems
almost never have full write-ahead log files, because it makes them
awfully slow. The most important reason is that applications have no
means to specify explicit transaction boundaries, so you always have to
assume autocommit-style usage.

Joerg
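
For reference, the check-before/check-after rescan quoted above can be
sketched roughly as follows. This is only a toy model under stated
assumptions: the `Node` class, its `write()` method, the `scan()`
function and the `hook` parameter are all hypothetical names invented
for illustration, the tree lives purely in memory, and nothing here is
a real DragonFly interface. It assumes that a modification bumps the
FSMID of the changed node and of every ancestor on one parent chain.

```python
import itertools

_next_id = itertools.count(1)  # toy stand-in for a monotonic FSMID source


class Node:
    """Toy in-memory tree node; NOT a real DragonFly or kernel interface."""

    def __init__(self, name, parent=None, data=None):
        self.name, self.parent, self.data = name, parent, data
        self.children = {}
        self.fsmid = next(_next_id)
        if parent is not None:
            parent.children[name] = self

    def write(self, data):
        """Change file contents and bump the FSMID of every ancestor too."""
        self.data = data
        node = self
        while node is not None:
            node.fsmid = next(_next_id)
            node = node.parent


def scan(node, path, saved, visits, hook=None):
    """Copy `node` into `saved`, redoing only parts whose FSMID changed."""
    while True:
        before = node.fsmid                      # check before descending
        visits.append(path)
        data = node.data
        if hook is not None:
            hook(node)                           # lets a caller inject a change
        for name, child in sorted(node.children.items()):
            child_path = path.rstrip("/") + "/" + name
            recorded = saved.get(child_path, (None, None))[0]
            if recorded != child.fsmid:          # skip unchanged subtrees
                scan(child, child_path, saved, visits, hook)
        if node.fsmid == before:                 # check after ascending
            saved[path] = (before, data)
            return
```

Note that `write()` can only walk a single parent chain. A second hard
link to the same file in another directory would leave that directory's
ancestors untouched unless the link is already known, which is the
objection raised at the top of this mail.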
