On Mon, 2 Jun 2008, Garance A Drosihn wrote:

I remember a discussion of changes to Mac OS X in Leopard which made it easier to implement features such as Spotlight and Time Machine. The description starts here, I think, in the section on file-system events:

http://arstechnica.com/reviews/os/mac-os-x-10-5.ars/7

The idea I thought was interesting was to record changes per directory rather than per file. So, if the file /some/dir/fname was changed, they would record only that *some* file under /some/dir had changed.

So when your userland process comes along later on, it still has to scan all files in that directory to see which file(s) actually changed. But that's a lot less work than scanning all files in the filesystem, and it also means there is much less data that has to be kept track of.
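
A minimal sketch of that directory-level scheme, as an illustration only and not how Leopard implements it: the recording side keeps nothing but the set of parent directories, and the userland side later scans just those directories. The names record_change and changed_files are hypothetical, and comparing mtime against the time of the previous scan is an assumption made to keep the example short.

```python
import os


def record_change(dirty_dirs, path):
    """Recording side: remember only the parent directory of a changed file."""
    dirty_dirs.add(os.path.dirname(path))


def changed_files(dirty_dirs, since):
    """Userland side: scan each dirty directory (non-recursively) and return
    the regular files whose mtime is newer than `since`."""
    changed = []
    for directory in sorted(dirty_dirs):
        try:
            with os.scandir(directory) as entries:
                for entry in entries:
                    if not entry.is_file(follow_symlinks=False):
                        continue
                    if entry.stat(follow_symlinks=False).st_mtime > since:
                        changed.append(entry.path)
        except FileNotFoundError:
            # The directory was removed after the change was recorded.
            continue
    return sorted(changed)
```

An mtime scan like this cannot see deletions; a real consumer would presumably also compare the listing against what it saw on its previous pass.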

I have no idea how easy it would be to implement something similar on FreeBSD, but the strategy seemed like a pretty neat idea.

fsevents allows user processes to subscribe, effectively on a per-filesystem basis, to namespace and file close operations. The implementation is split into two parts: a kernel component, which captures events with possible coalescing, and a user daemon, fseventsd, which listens on a special device and then provides scope narrowing and persistence for subscriptions. Applications talk to fseventsd, using Mach ports, I believe, and fseventsd is responsible for tracking subscriptions, filtering events, and so on.
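
The split described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the fsevents code: the flag values, the coalesce and deliver functions, and the idea of a subscription as a simple path prefix are all hypothetical. coalesce stands in for the kernel side, deliver for the scope narrowing done by the daemon.

```python
# Hypothetical event flags, for illustration only.
CREATE, RENAME, CLOSE = 1, 2, 4


def coalesce(events):
    """Kernel-side stand-in: merge events for the same path into a single
    entry whose flags are the OR of everything seen for that path."""
    merged = {}
    for path, flag in events:
        merged[path] = merged.get(path, 0) | flag
    return list(merged.items())


def deliver(events, subscriptions):
    """Daemon-side stand-in: give each subscriber only the events whose
    path lies at or below the root it subscribed to."""
    out = {name: [] for name in subscriptions}
    for path, flags in events:
        for name, root in subscriptions.items():
            prefix = root.rstrip("/") + "/"
            if path == root or path.startswith(prefix):
                out[name].append((path, flags))
    return out
```

For example, a create followed by a close on the same path collapses into one entry, and a subscriber rooted at /home never sees events under /var.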

I'm aware of several limitations that should be considered very carefully before adopting this code:

(1) The user<->kernel interface is essentially a firehose, and available only
    to privileged processes.  fseventsd performs checks in user space to see
    whether each consumer is allowed access to each event, which can lead to
    confusing and potentially quite incorrect results.

(2) The kernel code requires a reliable conversion from vnode to path, which
    we don't have.  Events, and especially their coalescing, are expressed in
    terms of paths.

(3) The user daemon requires synchronous hooks into the file system unmount
    event because fseventsd stores its events journal in the file system root,
    so it must close the journal before the file system can be unmounted.  In
    Mac OS X, this is satisfied by having the disk arbitration daemon, which
    performs unmounts, first send a message to fseventsd and wait for it to
    finish up.  I've seen a number of occasions where the disk unmount process
    has become non-trivially stalled due to fseventsd, so there's a potential
    robustness question.

(4) As I understand it, events frequently come down to "file system X
    changed" in practice, which could be captured by a far simpler mechanism.
    I've not done any measurements to confirm whether this is the case, but
    it's not impossible to imagine on a busy system.
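
To make point (4) concrete, here is one way such coarsening could arise. It is purely an illustration and not a description of the fsevents code: a bounded event queue that, once full, throws away its per-path detail and reports only that the whole file system needs rescanning. The class name, its methods, and the capacity parameter are all hypothetical.

```python
class BoundedEventQueue:
    """Toy queue that degrades to 'file system changed' on overflow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.paths = []
        self.overflowed = False

    def push(self, path):
        if self.overflowed or path in self.paths:
            return  # already degraded, or already recorded
        if len(self.paths) >= self.capacity:
            # No room left: drop the detail and remember only the overflow.
            self.paths.clear()
            self.overflowed = True
            return
        self.paths.append(path)

    def drain(self, fs_root):
        """Return pending events and reset the queue."""
        if self.overflowed:
            result = [("rescan", fs_root)]
        else:
            result = [("changed", p) for p in self.paths]
        self.paths = []
        self.overflowed = False
        return result
```

On a quiet system the consumer gets precise paths; on a busy one it mostly sees the single rescan event, which a much simpler per-file-system "changed" flag would convey just as well.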

I think there's also considerable overlap with other kernel event systems, such as audit, and we might benefit from thinking seriously about enhancing those event systems rather than introducing a new one. The design of fsevents is pretty much entirely dictated by the needs of Spotlight and, later, Time Machine. In particular, it's not clear to me that the persistence requirements, which are a large part of the fsevents design, are important to us... or are they?

Robert N M Watson
Computer Laboratory
University of Cambridge

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers