Quoth [email protected]:
> Just a question at the start: Presumably, I have to prepare arena space and indexsects by the original utilities, or else?

Ah, yes.

> Perhaps you remember that I am a critic of the index handling in the original venti, because it will rewrite an entire disk block at random locations for every clump added. I would never use it on a SSD, only on magnetic discs.

That - is a very fair point. I'd be happy to discuss what the next version of the index should look like. My current goal is for neoventi to be, basically, compatible and mostly-usable by the EOY, and then work on improved arena and index formats as the next step.

I'm not sure how to avoid it with the index. Backing away from venti's implementation details for a second, let's state the requirements, and then look at the problems:

- We're storing a hash table on disk
- The data blocks should be stored in an append-only fashion

A lot of the useful properties come from those two designed-in properties: deduplication, rapid sync, the ability to fork and reset a file system, etc.

Even more simply: we want to store data in an append-only log, in which the data is identified by its contents, not by its address.

That means, by necessity, that it is _not possible_ to know the address of a given block from its score! The existence of an index is unavoidable.

The question is, can we design one that does not require random writes, without unacceptable trade-offs?

Which is almost a trick question, as venti, in fact, already has one: each arena contains a shortened summary that can be used to quickly build the index on startup, without scanning the entire arena contents. But, if we relied on that to start up - and not as a backup with which to construct the index - we'd be unable to handle any reads until we'd read the entire thing from disk. The current index is random-access, which means we can be handling requests _without reading the whole thing into memory._

We could maybe combine the two?

Have an index section on disk that's just a linear log of (key, value) entries, that we load on startup - which would be faster than the current arena scan, as it would be one single massive section to read - and use to construct the entire index in memory?

...and, tbh, with how much faster neoventi already is, I can likely experiment with just loading those arena sections on startup. It might be sufficiently fast anyways. For 1TiB of data, that would be ~10GiB, and still _mostly_ linear [there are seeks between sections, but sections are each a few MiB - so we might reasonably be able to hit, say, 50MB/s on a HDD, or ~3 minutes?] and that might be pretty fast on a modern drive, too, and eliminate the random writes entirely?

As you can probably see, I have a lot of _ideas_ on how to do this, but - ideas are easy, code is easy, good design is hard.

- Noam Preil

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tb88459bd8ed4d095-M2977805c85bddb876a811d44
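
To make the "linear log of (key, value) entries" idea concrete, here is a minimal sketch in Python. It is not venti's or neoventi's on-disk format: the 28-byte entry layout, the length-prefixed data log, and the names append_block, load_index and read_block are all assumptions made up for illustration. The only detail taken from venti is that a score is a 20-byte SHA-1 of the block contents.

```python
import hashlib
import io
import struct

# Hypothetical index entry: 20-byte SHA-1 score + 64-bit big-endian log address.
ENTRY = struct.Struct(">20sQ")


def append_block(data_log, index_log, index, block):
    """Store a block; both logs only ever grow at the end (no random writes)."""
    score = hashlib.sha1(block).digest()
    if score in index:
        return score                      # deduplication: already stored
    data_log.seek(0, io.SEEK_END)
    addr = data_log.tell()
    data_log.write(struct.pack(">I", len(block)) + block)
    index_log.seek(0, io.SEEK_END)
    index_log.write(ENTRY.pack(score, addr))
    index[score] = addr
    return score


def load_index(index_log):
    """Rebuild the in-memory index with a single linear read of the index log."""
    index = {}
    index_log.seek(0)
    while True:
        raw = index_log.read(ENTRY.size)
        if len(raw) < ENTRY.size:
            break                         # end of log, or a torn final entry
        score, addr = ENTRY.unpack(raw)
        index.setdefault(score, addr)     # keep the first address seen
    return index


def read_block(data_log, index, score):
    """Look the score up in memory, then do one seek into the data log."""
    data_log.seek(index[score])
    (size,) = struct.unpack(">I", data_log.read(4))
    return data_log.read(size)
```

The trade-off discussed above shows up directly: appends never touch old index data, but load_index has to finish before any read can be served, and the whole index has to fit in memory.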
