:> So it comes down to how much space you are willing to eat up to store
:> the history, and what kind of granularity you will want for the history.
:
:OK - so it WILL be a 'tunable', then.
:...
:HAMMER cannot protect against all forms of human error - BUT - if it
:inherently rebuilds more intelligently than the least-intelligent of
:RAID1, it can greatly reduce the opportunity for that sort of 'accident'
:to occur.
One idea I had was to number the records as they were laid down on
disk, and validate the file or directory by determining that no records
were missing.  But that doesn't fly very well when things are deleted
and replaced.

Another idea, much easier to implement, is to have a way to guarantee
that all the bits and pieces of the file have been found by creating a
record which contains a CRC of the whole mess.  One could have a
'whole file' CRC, or even a 'whole directory tree' CRC (as-of a
particular timestamp).  Since HAMMER is record oriented, associating
special records with inodes is utterly trivial.

For archival storage one could then 'tag' a directory tree with such a
record and have a way of validating that the directory tree had not
become corrupted, or was recovered properly.  For encryption one could
'tag' a directory tree or a file with an encryption label.  Not
implemented yet, but a definite possibility.  There are so many things
we can do with HAMMER due to its record-oriented nature.

:> Ultimately it will be extremely efficient simply by the fact that
:> there will be a balancer going through it and repacking it.
:>
:"... constantly, and in the background..."  (I presume)

In the background, for sure.  Probably not constantly, but taking a
piece at a time with a nightly cron job.  One thing I've learned over
the years is that it is a bad idea to just go randomly accessing the
disk at unexpected times.  The nice thing is that the balancing can
occur on a cluster-by-cluster basis, so one can do a bunch of clusters,
then stop, then do a bunch more, then stop, etc.

:Is variable-length still likely to have a payback if the data records
:were to be fixed at 512B or 1024B or integer multiples thereof?

Not a good idea for HAMMER.  A HAMMER record is 96 bytes and a HAMMER
B-Tree element is 56 bytes.  That's 152 bytes of overhead per record.
The smaller the data associated with each record, the larger the
overhead and the less efficient the filesystem storage model.

Also, while accessing records is localized, you only reap major
benefits over a linear block storage scheme if you can make those
records reference a significant amount of data.  So for large static
files we definitely want to use a large per-record data size, and for
small static files we want to use a small data size.

Theoretically the best-case storage for a tiny file would be
96 + 56 + 128 (inode data) + 64 (data), or 344 bytes of disk space.
That's very, very good.  (In the current incarnation the minimum disk
space use per file is 96 + 56 + 128 + 16384.)

-Matt
Matthew Dillon <[EMAIL PROTECTED]>
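For concreteness, a minimal sketch of what such a 'whole file' /
'whole tree' CRC tag record might look like.  The struct layout, field
names, and tag constants are invented for illustration; they are not
the actual HAMMER on-disk format.

    /*
     * Hypothetical CRC tag record, associated with an inode the same
     * way any other HAMMER record would be.  Layout and names are
     * invented for illustration only.
     */
    #include <stdint.h>

    #define TAG_FILE_CRC    0x0001  /* CRC covers all records of one file */
    #define TAG_TREE_CRC    0x0002  /* CRC covers a whole directory tree */

    struct crc_tag_record {
        uint64_t obj_id;     /* inode this tag record hangs off of */
        uint64_t as_of_tid;  /* timestamp/transaction id the CRC is valid as-of */
        uint32_t tag_type;   /* TAG_FILE_CRC or TAG_TREE_CRC */
        uint32_t crc;        /* CRC over the covered records' data */
    };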
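And a quick back-of-the-envelope check of the overhead numbers quoted
above.  The 96, 56, 128 and 16384 byte figures are the ones given in
the text; the sample per-record data sizes are arbitrary.

    /* Per-record overhead vs. data size, plus the tiny-file cases. */
    #include <stdio.h>

    int main(void)
    {
        const int record   = 96;                 /* HAMMER record */
        const int btree_el = 56;                 /* B-Tree element */
        const int overhead = record + btree_el;  /* 152 bytes per record */
        const int sizes[]  = { 512, 1024, 16384, 65536 };

        for (int i = 0; i < 4; ++i)
            printf("%6d bytes of data -> %.1f%% overhead\n", sizes[i],
                   100.0 * overhead / (overhead + sizes[i]));

        /* Best-case tiny file: 96 + 56 + 128 (inode data) + 64 (data). */
        printf("best-case tiny file:  %d bytes\n", overhead + 128 + 64);

        /* Current incarnation: minimum 16384-byte data use per file. */
        printf("current minimum file: %d bytes\n", overhead + 128 + 16384);
        return 0;
    }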