This is the current design document for HAMMER2. It lists every feature
I intend to implement for HAMMER2. Everything except the freemap and
cluster protocols (which are both big ticket items) has been completely
speced out.
There are many additional features versus the original document,
including hardlinks.
HAMMER2 is all I am working on this year so I expect to make good
progress, but it will probably still be July before we have anything
usable, and well into 2013 before the whole mess is implemented and
even later before the clustering is 100% stable.
However, I expect to be able to stabilize all non-cluster related features
in fairly short order. Even though HAMMER2 has a lot more features than
HAMMER1 the actual design is simpler than HAMMER1, with virtually no edge
cases to worry about (I spent 12+ months working edge cases out in
HAMMER1's B-Tree, for example... that won't be an issue for HAMMER2
development).
The work is being done in the 'hammer2' branch off the main dragonfly
repo in appropriate subdirs. Right now it is just vsrinivas and me, but
hopefully enough will get fleshed out in a few months that other people
can help too.
Ok, here's what I have got.
HAMMER2 DESIGN DOCUMENT
Matthew Dillon
08-Feb-2012
dil...@backplane.com
* These features have been speced in the media structures.
* Implementation work has begun.
* A working filesystem with some features implemented is expected by July 2012.
* A fully functional filesystem with most (but not all) features is expected
by the end of 2012.
* All elements of the filesystem have been designed except for the freemap
(which isn't needed for initial work). 8MB per 2GB of filesystem
storage has been reserved for the freemap. The design of the freemap
is expected to be completely speced by mid-year.
* This is my only project this year. I'm not going to be doing any major
kernel bug hunting this year.
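As a rough illustration of the freemap reservation mentioned above (8MB per
2GB of filesystem storage), the reserved space for a given volume size can be
estimated as sketched below. The function name and the decision to round a
trailing partial 2GB zone up to a full 8MB reservation are assumptions made
for the example; the freemap itself has not been designed yet.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

FREEMAP_RESERVE_PER_ZONE = 8 * MIB   # from the text: 8MB reserved ...
ZONE_SIZE = 2 * GIB                  # ... per 2GB of filesystem storage


def freemap_reserve_bytes(volume_bytes):
    """Estimate the bytes reserved for the freemap on a volume.

    Assumption: a trailing partial 2GB zone still gets a full 8MB reserve.
    """
    zones = -(-volume_bytes // ZONE_SIZE)   # ceiling division
    return zones * FREEMAP_RESERVE_PER_ZONE
```

The reservation works out to 1/256 of the storage, or roughly 0.4%.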
Feature List
* Multiple roots (allowing snapshots to be mounted). This is implemented
via the super-root concept. When mounting a HAMMER2 filesystem you specify
a device path and a directory name in the super-root.
* HAMMER1 had PFS's. HAMMER2 does not. Instead, in HAMMER2 any directory
in the tree can be configured as a PFS, causing all elements recursively
underneath that directory to become a part of that PFS.
* Writable snapshots. Any subdirectory tree can be snapshotted. Snapshots
show up in the super-root. It is possible to snapshot a subdirectory
and then later snapshot a parent of that subdirectory... really there are
no limitations here.
* Directory sub-hierarchy based quotas and space and inode usage tracking.
Any directory sub-tree, whether at a mount point or not, tracks aggregate
inode use and data space use. These totals are stored in each directory
inode all the way up the chain.
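The usage tracking described in the item above can be sketched roughly as
follows. This is a toy in-memory model, not the media structures: the class
DirInode, the field names, and the helpers account() and over_quota() are all
hypothetical, and it assumes counters are updated eagerly on every change.

```python
class DirInode:
    """Hypothetical in-memory stand-in for a directory inode."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.inode_count = 0   # aggregate inodes beneath this directory
        self.data_bytes = 0    # aggregate data space beneath this directory


def account(directory, inode_delta, data_delta):
    """Apply a usage change to `directory` and to every ancestor above it."""
    node = directory
    while node is not None:
        node.inode_count += inode_delta
        node.data_bytes += data_delta
        node = node.parent


def over_quota(directory, inode_limit=None, data_limit=None):
    """Check the aggregate counters of one directory against optional limits."""
    if inode_limit is not None and directory.inode_count > inode_limit:
        return True
    if data_limit is not None and directory.data_bytes > data_limit:
        return True
    return False
```

Because every directory carries its own totals, a quota check on any sub-tree
only has to look at that one directory and never has to walk the tree below it.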
* Incremental queueless mirroring / mirroring-streams. Because HAMMER2 is
block-oriented and copy-on-write, each blockref tracks both direct
modifications to the referenced data via (modify_tid) and indirect
modifications to the referenced data or any sub-tree via (mirror_tid).
This makes it possible to do an incremental scan of meta-data that covers
only changes made since the mirror_tid recorded in a prior run.
This feature is also intended to be used to locate recently allocated
blocks and thus be able to fix up the freemap after a crash.
HAMMER2 mirroring works a bit differently than HAMMER1 mirroring in
that HAMMER2 does not keep track of 'deleted' records. Instead any
recursion by the mirroring code which finds that (modify_tid) has
been updated must also send the direct block table or indirect block
table state it winds up recursing through so the target can check
similar key ranges and locate elements to be deleted. This can be
avoided if the mirroring stream is mostly caught up, because very recent
deletions will be cached in memory and can be queried, allowing shorter
record deletions to be passed in the stream instead.
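The incremental scan described in the item above can be sketched roughly as
follows. The BlockRef class and incremental_scan() are hypothetical names for
illustration only. The model assumes that mirror_tid is simply the highest
modify_tid found anywhere in the sub-tree, and it ignores the deletion
handling discussed above.

```python
class BlockRef:
    """Toy blockref: modify_tid covers this block, mirror_tid its sub-tree."""

    def __init__(self, key, modify_tid, children=()):
        self.key = key
        self.modify_tid = modify_tid
        self.children = list(children)
        # Assumption: mirror_tid is the newest TID in this block or below it.
        self.mirror_tid = max(
            [modify_tid] + [c.mirror_tid for c in self.children]
        )


def incremental_scan(ref, since_tid):
    """Yield keys of blocks modified after since_tid.

    Sub-trees whose mirror_tid is not newer than since_tid are skipped
    entirely, so the cost tracks the amount of change, not the tree size.
    """
    if ref.mirror_tid <= since_tid:
        return
    if ref.modify_tid > since_tid:
        yield ref.key
    for child in ref.children:
        yield from incremental_scan(child, since_tid)
```

A mirroring pass would record the root's mirror_tid when it finishes and hand
that value back as since_tid on the next run.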
* Will support multiple compression algorithms configured on a per-subdirectory
tree basis and on a per-file basis. Blocks of up to 64K will be compressed.
Only compression ratios near powers of 2 that are at least 2:1 (e.g. 2:1,
4:1, 8:1, etc) will work in this scheme because physical block allocations
in HAMMER2 are always power-of-2.
Compression algorithm #0 will mean no compression and no zero-checking.
Compression algorithm #1 will mean zero-checking but no other compression.
Real compression will be supported starting with algorithm #2.
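To illustrate why only ratios near powers of 2 pay off, here is a rough sketch
of how a compressed block might map onto a power-of-2 physical allocation.
The names next_pow2() and physical_alloc() are hypothetical, the 1KB minimum
allocation is an assumption made for the example, and the logical block size
is assumed to be a power of 2 no larger than 64K.

```python
MIN_ALLOC = 1024   # assumption: smallest physical allocation, for illustration


def next_pow2(n):
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def physical_alloc(logical_size, compressed_size):
    """Physical bytes used to store a block of logical_size bytes.

    The compressed data is rounded up to a power of 2. If that is no
    smaller than the logical block, nothing is saved and the block is
    treated as stored uncompressed at its logical size.
    """
    candidate = max(MIN_ALLOC, next_pow2(compressed_size))
    return min(candidate, logical_size)
```

For a 64K logical block, 30000 bytes of compressed data lands in a 32K
allocation (2:1), while 40000 bytes rounds up to 64K and saves nothing.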
* Zero detection on write (writing all-zeros), which requires the data
buffer to be scanned, will be supported as compression algorithm #1.
This allows the writing of 0's