Adam Megacz wrote:
> After reading through the ZFS slides, it appears that if ZFS wants to modify a single data block, it must rewrite every block between that modified block and the uberblock (root of the tree).
> Is this really the case?
That is true when committing the transaction group to the main pool every 5 seconds. However, this isn't so bad: many transactions are committed together and are likely to have common roots, and the writes are aggregated and striped across the pool.
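To illustrate the point about common roots, here is a minimal Python sketch. It is not ZFS code: the fixed fanout, the tree depth, and the function names are all assumptions made for illustration. It counts how many blocks a copy-on-write tree would have to rewrite when a batch of dirty leaf blocks is committed in one go, with each shared ancestor rewritten only once.

```python
def ancestors(leaf, fanout, depth):
    """Return the (level, index) pairs of every interior block on the
    path from a leaf up to the root of a hypothetical fixed-fanout tree."""
    path = set()
    index = leaf
    for level in range(1, depth + 1):
        index //= fanout          # parent index one level up
        path.add((level, index))
    return path


def blocks_rewritten(dirty_leaves, fanout, depth):
    """Count distinct blocks rewritten when all dirty leaves are
    committed together; shared ancestors are counted once."""
    dirty = set()
    for leaf in dirty_leaves:
        dirty.add((0, leaf))                      # the data block itself
        dirty |= ancestors(leaf, fanout, depth)   # its path to the root
    return len(dirty)


if __name__ == "__main__":
    FANOUT, DEPTH = 4, 3   # assumed toy values: 64 leaves, root at level 3
    print(blocks_rewritten([0], FANOUT, DEPTH))           # 4
    print(blocks_rewritten([0, 1, 2, 3], FANOUT, DEPTH))  # 7
    print(blocks_rewritten([0, 63], FANOUT, DEPTH))       # 7
```

In this toy model, four neighbouring leaves committed separately would cost 4 x 4 = 16 block writes, but committed as one group they cost 7, because the three interior blocks above them are shared.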
> If so, does this mean that every commit operation (i.e. every fsync()) in ZFS requires O(log n) platter writes?
The ZIL (ZFS Intent Log) does not modify the main pool. It only writes the system call transactions related to the file being fsynced, plus any other transactions that might be related to that file (e.g. mkdir, rename). Writes for these transactions are also aggregated and written using a block size tailored to fit the data. Typically, for a single system call, just one write occurs. After a system crash or power failure, those ZIL transactions are replayed.

See also: http://blogs.sun.com/perrin

Neil.
Thanks, - a