date:20120208

DESIGN document for HAMMER2 (08-Feb-2012 update)

2012-02-08 Thread Matthew Dillon

This is the current design document for HAMMER2.  It lists every feature
I intend to implement for HAMMER2.  Everything except the freemap and
cluster protocols (which are both big ticket items) has been completely
speced out.

There are many additional features versus the original document,
including hardlinks.

HAMMER2 is all I am working on this year so I expect to make good
progress, but it will probably still be July before we have anything
usable, and well into 2013 before the whole mess is implemented and
even later before the clustering is 100% stable.

However, I expect to be able to stabilize all non-cluster related features
in fairly short order.  Even though HAMMER2 has a lot more features than
HAMMER1 the actual design is simpler than HAMMER1, with virtually no edge
cases to worry about (I spent 12+ months working edge cases out in
HAMMER1's B-Tree, for example... that won't be an issue for HAMMER2
development).

The work is being done in the 'hammer2' branch off the main dragonfly
repo in appropriate subdirs.  Right now it is just vsrinivas and me, but
hopefully enough will get fleshed out in a few months that other people
can help too.

Ok, here's what I have got.


HAMMER2 DESIGN DOCUMENT

Matthew Dillon
 08-Feb-2012
 dil...@backplane.com


* These features have been speced in the media structures.

* Implementation work has begun.

* A working filesystem with some features implemented is expected by July 2012.

* A fully functional filesystem with most (but not all) features is expected
  by the end of 2012.

* All elements of the filesystem have been designed except for the freemap
  (which isn't needed for initial work).  8MB per 2GB of filesystem
  storage has been reserved for the freemap.  The design of the freemap
  is expected to be completely speced by mid-year.

* This is my only project this year.  I'm not going to be doing any major
  kernel bug hunting this year.

Feature List

* Multiple roots (allowing snapshots to be mounted).  This is implemented
  via the super-root concept.  When mounting a HAMMER2 filesystem you specify
  a device path and a directory name in the super-root.

* HAMMER1 had PFS's.  HAMMER2 does not.  Instead, in HAMMER2 any directory
  in the tree can be configured as a PFS, causing all elements recursively
  underneath that directory to become a part of that PFS.

* Writable snapshots.  Any subdirectory tree can be snapshotted.  Snapshots
  show up in the super-root.  It is possible to snapshot a subdirectory
  and then later snapshot a parent of that subdirectory... really there are
  no limitations here.

* Directory sub-hierarchy based quotas and space and inode usage tracking.
  Any directory sub-tree, whether at a mount point or not, tracks aggregate
  inode use and data space use.  This is stored in the directory inode all
  the way up the chain.

* Incremental queueless mirroring / mirroring-streams.  Because HAMMER2 is
  block-oriented and copy-on-write, each blockref tracks both direct
  modifications to the referenced data via (modify_tid) and indirect
  modifications to the referenced data or any sub-tree via (mirror_tid).
  This makes it possible to do an incremental scan of meta-data that covers
  only changes made since the mirror_tid recorded in a prior run.

  This feature is also intended to be used to locate recently allocated
  blocks and thus be able to fix up the freemap after a crash.

  HAMMER2 mirroring works a bit differently than HAMMER1 mirroring in
  that HAMMER2 does not keep track of 'deleted' records.  Instead any
  recursion by the mirroring code which finds that (modify_tid) has
  been updated must also send the direct block table or indirect block
  table state it winds up recursing through so the target can check
  similar key ranges and locate elements to be deleted.  This can be
  avoided when the mirroring stream is mostly caught up, because very
  recent deletions will be cached in memory and can be queried, allowing
  shorter record deletions to be passed in the stream instead.

* Will support multiple compression algorithms, configurable on a
  per-subdirectory-tree basis and on a per-file basis.  Compression will
  operate on blocks of up to 64K.
  Only compression ratios near powers of 2 that are at least 2:1 (e.g. 2:1,
  4:1, 8:1, etc) will work in this scheme because physical block allocations
  in HAMMER2 are always power-of-2.

  Compression algorithm #0 will mean no compression and no zero-checking.
  Compression algorithm #1 will mean zero-checking but no other compression.
  Real compression will be supported starting with algorithm 2.

* Zero detection on write (writing all-zeros), which requires the data
  buffer to be scanned, will be supported as compression algorithm #1.
  This allows the writing of 0's
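
As a rough illustration of the incremental mirroring scan described in the
feature list above, the following Python sketch models a copy-on-write tree
in which each blockref carries a modify_tid and a mirror_tid.  The names
BlockRef, modify and scan_since are hypothetical and are not taken from the
HAMMER2 sources, and the propagation rule shown is only one reading of the
description: a direct change sets modify_tid on the blockref itself and
raises mirror_tid on it and on every ancestor.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BlockRef:
    """Hypothetical in-memory stand-in for a HAMMER2 blockref."""
    name: str
    modify_tid: int = 0   # last direct modification of the referenced data
    mirror_tid: int = 0   # last modification here or anywhere in the sub-tree
    parent: Optional["BlockRef"] = None
    children: List["BlockRef"] = field(default_factory=list)

    def add(self, child: "BlockRef") -> "BlockRef":
        child.parent = self
        self.children.append(child)
        return child


def modify(ref: BlockRef, tid: int) -> None:
    """Record a direct modification and propagate mirror_tid to the root."""
    ref.modify_tid = tid
    node = ref
    while node is not None:
        node.mirror_tid = max(node.mirror_tid, tid)
        node = node.parent


def scan_since(ref: BlockRef, last_tid: int) -> List[str]:
    """Return names of blockrefs directly modified after last_tid.

    Sub-trees whose mirror_tid is not newer than last_tid are skipped
    entirely, which is what keeps the scan incremental.
    """
    if ref.mirror_tid <= last_tid:
        return []
    changed = [ref.name] if ref.modify_tid > last_tid else []
    for child in ref.children:
        changed.extend(scan_since(child, last_tid))
    return changed


root = BlockRef("root")
a = root.add(BlockRef("a"))
b = root.add(BlockRef("b"))
a1 = a.add(BlockRef("a1"))
b1 = b.add(BlockRef("b1"))

modify(a1, tid=10)
modify(b1, tid=20)
first_pass = scan_since(root, last_tid=0)    # both leaves are new
second_pass = scan_since(root, last_tid=10)  # sub-tree "a" is skipped
```

In this sketch a mirroring target that last synchronized at tid 10 only
descends into the "b" sub-tree on its next pass, since the mirror_tid on
"a" shows nothing newer underneath it.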
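
The directory sub-hierarchy usage tracking in the feature list above can be
pictured with a small Python sketch.  It assumes that every usage change is
applied as a delta to the containing directory and to each of its ancestors,
which is one plausible way of keeping the aggregate "all the way up the
chain".  DirInode and account are hypothetical names, and whether a
directory counts itself as an inode is left out of the model.

```python
from typing import Optional


class DirInode:
    """Hypothetical directory inode carrying aggregate usage counters."""

    def __init__(self, name: str, parent: Optional["DirInode"] = None) -> None:
        self.name = name
        self.parent = parent
        self.inode_count = 0   # inodes in this directory's sub-tree
        self.data_bytes = 0    # data space used in this sub-tree

    def account(self, inodes: int, data_bytes: int) -> None:
        """Apply a usage delta here and in every ancestor directory."""
        node = self
        while node is not None:
            node.inode_count += inodes
            node.data_bytes += data_bytes
            node = node.parent


root = DirInode("/")
home = DirInode("home", root)
alice = DirInode("alice", home)
var = DirInode("var", root)

alice.account(inodes=1, data_bytes=65536)   # file created under /home/alice
var.account(inodes=1, data_bytes=4096)      # file created under /var
```

With the counters kept this way, a quota check on any directory would be a
comparison against that directory's own counters, with no sub-tree walk.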
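
The constraint on compression ratios in the feature list above follows from
rounding every physical allocation up to a power of 2.  The Python sketch
below works through that arithmetic.  It uses zlib purely as a stand-in
compressor, and the 1K minimum allocation (MIN_ALLOC) is an assumption made
for the example; neither is stated in the design document.

```python
import os
import zlib

LOGICAL_BLOCK = 64 * 1024   # 64K logical block, per the text
MIN_ALLOC = 1024            # assumed smallest physical allocation (hypothetical)


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def physical_size(block: bytes) -> int:
    """Physical allocation for one logical block after trial compression."""
    compressed = zlib.compress(block)   # stand-in compressor
    alloc = max(MIN_ALLOC, next_pow2(len(compressed)))
    # A result that does not reach at least 2:1 rounds back up to the full
    # block size, so the block is simply stored uncompressed.
    return alloc if alloc < len(block) else len(block)


text_like = (b"HAMMER2 design document " * 4096)[:LOGICAL_BLOCK]
random_like = os.urandom(LOGICAL_BLOCK)

text_alloc = physical_size(text_like)      # highly repetitive, shrinks a lot
random_alloc = physical_size(random_like)  # incompressible, stays at 64K
```

A block that compresses to, say, 40K still needs a 64K allocation after
rounding, which is why only ratios of roughly 2:1, 4:1, 8:1 and so on save
any space in this scheme.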

hammer2 branch in dragonfly repo created - won't be operational for 6-12 months.

2012-02-08 Thread Matthew Dillon

I have created a hammer2 branch in the main repo so related commit messages
are going to start showing up in the commits@ list.  This branch will loosely
track master but also contain the hammer2 bits that we are working on.

The initial commit on this branch contains mostly non-compilable
specification work and header files.

hammer2 is NOT expected to be operational for at least 6 months, so don't get
your hopes up for it becoming available any time soon.  Once it becomes
operational most of the features are NOT expected to be in place until the
end of the year (hardlinks probably being one of those features that will
happen last).

At some point starting at around 6 months, when all the basics are working
and the media structures are stable, it will be possible to split the
workload up for remaining features.  I'll be posting another followup in
a few minutes on the design work done since the last posting.

-Matt
Matthew Dillon
