Quoth [email protected]:

> Just a question at the start: Presumably, I have to
> prepare arena space and indexsects by the original utilities, or else?

Ah, yes.

> Perhaps you remember that I am a critic of the index handling in the
> original venti, because it will rewrite an entire disk block at random
> locations for every clump added.  I would never use it on a SSD, only
> on magnetic discs.

That - is a very fair point.

I'd be happy to discuss what the next version of the index should look
like.  My current goal is for neoventi to be, basically, compatible
and mostly-usable by the EOY, and then work on improved arena and
index formats as the next step.

I'm not sure how to avoid it with the index.  Backing away from
venti's implementation details for a second, let's state the
requirements, and then look at the problems:

- We're storing a hash table on disk
- The data blocks should be stored in an append-only fashion

A lot of the useful properties come from those two designed-in
properties: deduplication, rapid sync, the ability to fork and reset a
file system, etc.

Even more simply: we want to store data in an append-only log, in
which the data is identified by its contents, not by its address.

That means, by necessity, that it is _not possible_ to know the
address of a given block, from its score!
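
To make that concrete, here is a toy sketch in Python (not neoventi
code; the class and method names are made up for illustration).  It
assumes SHA-1 scores, as venti uses, and stands an in-memory buffer in
for the arena.  The only way back from a score to the data is the
index that was built up while appending:

```python
import hashlib
import io


class AppendOnlyLog:
    """Toy content-addressed store: blocks are only ever appended."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for the arena on disk
        self.index = {}           # score -> (offset, length)

    def write(self, data: bytes) -> bytes:
        score = hashlib.sha1(data).digest()
        if score in self.index:   # same contents, same score: deduplicated
            return score
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(data)
        self.index[score] = (offset, len(data))
        return score

    def read(self, score: bytes) -> bytes:
        # Without an index entry there is nothing to compute the offset from.
        offset, length = self.index[score]
        self.log.seek(offset)
        return self.log.read(length)
```

Writing the same block twice returns the same score and appends
nothing, which is where the deduplication comes from.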

The existence of an index is unavoidable.  The question is, can we
design one that does not require random writes, without unacceptable
trade offs?

Which is almost a trick question, as venti, in fact, already has one:
each arena contains a shortened summary that can be used to quickly
build the index on startup, without scanning the entire arena
contents.

But, if we relied on that to start up - and not as a backup with which
to construct the index - we'd be unable to handle any reads until we'd
read the entire thing from disk.  The current index is random-access,
which means we can be handling requests _without reading the whole
thing into memory._
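
As a rough illustration of that trade-off, here is a sketch of a
bucketed on-disk hash table in Python.  This is not venti's actual
index layout: the bucket size, the record format, and the function
names are all assumptions.  A lookup reads exactly one bucket, but an
insert rewrites a whole bucket at an effectively random location,
which is the write pattern being criticised above:

```python
import io
import struct

BUCKET_SIZE = 8192                 # hypothetical bucket size, in bytes
ENTRY = struct.Struct(">20sQ")     # hypothetical record: score, arena address
PER_BUCKET = BUCKET_SIZE // ENTRY.size
EMPTY = bytes(20)                  # an all-zero score marks a free slot


def bucket_of(score: bytes, nbuckets: int) -> int:
    # Scores are hashes, so a prefix spreads entries evenly over buckets.
    return int.from_bytes(score[:4], "big") % nbuckets


def lookup(disk, nbuckets: int, score: bytes):
    # One seek and one bucket-sized read; the rest of the index is untouched.
    disk.seek(bucket_of(score, nbuckets) * BUCKET_SIZE)
    bucket = disk.read(BUCKET_SIZE)
    for i in range(PER_BUCKET):
        s, addr = ENTRY.unpack_from(bucket, i * ENTRY.size)
        if s == score:
            return addr
    return None


def insert(disk, nbuckets: int, score: bytes, addr: int) -> None:
    base = bucket_of(score, nbuckets) * BUCKET_SIZE
    disk.seek(base)
    bucket = bytearray(disk.read(BUCKET_SIZE))
    for i in range(PER_BUCKET):
        s, _ = ENTRY.unpack_from(bucket, i * ENTRY.size)
        if s == EMPTY:
            ENTRY.pack_into(bucket, i * ENTRY.size, score, addr)
            disk.seek(base)
            disk.write(bucket)     # the whole bucket is rewritten in place
            return
    raise OverflowError("bucket full")
```

Here `disk` is any seekable binary file pre-sized to
`nbuckets * BUCKET_SIZE` bytes, for example
`io.BytesIO(bytes(4 * BUCKET_SIZE))`.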

We could maybe combine the two?

Have an index section on disk that's just a linear log of (key, value)
entries, that we load on startup - which would be faster than the
current arena scan, as it would be one single massive section to read
- and on startup, construct the entire index in memory from it?
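
A minimal sketch of what that could look like, in Python.  The record
format (20-byte score plus 8-byte address) and the function names are
assumptions for illustration, not a proposed on-disk format.  Writes
only ever append, and startup is one sequential read:

```python
import io
import struct

ENTRY = struct.Struct(">20sQ")  # hypothetical record: 20-byte score, 8-byte address


def append_entry(log, score: bytes, addr: int) -> None:
    # New entries always go at the end: no random writes.
    log.seek(0, io.SEEK_END)
    log.write(ENTRY.pack(score, addr))


def load_index(log) -> dict:
    # One linear read of the whole section, then build the table in memory.
    log.seek(0)
    data = log.read()
    usable = len(data) - len(data) % ENTRY.size   # ignore a torn final record
    return {score: addr for score, addr in ENTRY.iter_unpack(data[:usable])}
```

The cost is the one noted earlier: nothing can be looked up until
`load_index` has finished, and the whole table has to fit in memory.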

...and, tbh, with how much faster neoventi already is, I can likely
experiment with just loading those arena sections on startup.  It
might be sufficiently fast anyways.  For 1TiB of data, that would be
~10GiB, and still _mostly_ linear [there are seeks between sections, but
sections are each a few MiB - so we might reasonably be able to hit,
say, 50MB/s on a HDD, or ~3.5 minutes?]

and that might be pretty fast on a modern drive, too, and eliminate
the random writes entirely?
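
For what it's worth, the back-of-the-envelope numbers above work out
as follows.  Both inputs are the guesses from this post (10GiB of
arena summaries per 1TiB of data, 50MB/s sustained on a HDD), not
measurements:

```python
GIB = 2**30
MB = 10**6

summary_bytes = 10 * GIB    # estimated arena summaries for 1 TiB of data
throughput = 50 * MB        # assumed sustained HDD read rate, bytes/second

seconds = summary_bytes / throughput
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
# prints "215 s, about 3.6 minutes"
```

So a little over three and a half minutes at that rate, and
proportionally less on anything faster.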

As you can probably see, I have a lot of _ideas_ on how to do this,
but - ideas are easy, code is easy, good design is hard.

- Noam Preil

------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tb88459bd8ed4d095-M2977805c85bddb876a811d44