Have you guys looked at how Hadoop's HBase is designed? I wonder whether the Bigtable or HBase architecture might help.

http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

I am still wavering on whether JCR is perhaps a database. I wonder if there is a need to invent a new storage mechanism when there are so many of them out there already. I keep flip-flopping ;-)

-paddy

Thomas Mueller wrote:
Hi,

I fully agree with Marcel: in my view a mix of "journal" and "store"
would be the best solution. The "store" solution has the advantage
that unused items can be deleted without affecting other items (or
leaving unused space in a file).

The line between "journal" and "store" can be fluid: an item could
move from one state to the other. The approaches can be unified: let's
say each item is identified with:

fileId, position

If multiple items are stored in a file, the fileId should be small
(for example a short) and the position relatively large (for example
file offset / 16, if we want to use 16 byte blocks). For items that
are stored in a data store, the fileId would be large (the hash code)
and the position always 0. It's possible to add a third storage type
(for example for 'data that didn't change for a long time').
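A minimal sketch of how such a (fileId, position) pair might look in
Java. The class name, the field widths and the use of a long for the
hash code are assumptions made for illustration only; nothing here is
existing Jackrabbit API.

```java
// Hypothetical sketch of the unified identifier described above.
// All names and field widths are assumptions, not an existing API.
final class ItemId {
    // Block size used to scale file offsets (16-byte blocks, as suggested).
    static final int BLOCK_SIZE = 16;

    final long fileId;   // small for journal files, large (hash code) for the data store
    final int position;  // block number within the file, or 0 for data store items

    private ItemId(long fileId, int position) {
        this.fileId = fileId;
        this.position = position;
    }

    // Item stored together with others in a journal file:
    // small fileId, position = byte offset / 16.
    static ItemId inJournal(short fileId, long byteOffset) {
        if (byteOffset < 0 || byteOffset % BLOCK_SIZE != 0) {
            throw new IllegalArgumentException("offset must be a non-negative multiple of " + BLOCK_SIZE);
        }
        return new ItemId(fileId, Math.toIntExact(byteOffset / BLOCK_SIZE));
    }

    // Item stored on its own in the data store:
    // fileId is the (assumed 64-bit) hash code, position is always 0.
    static ItemId inDataStore(long contentHash) {
        return new ItemId(contentHash, 0);
    }

    // Converts the block number back to a byte offset in the file.
    long byteOffset() {
        return (long) position * BLOCK_SIZE;
    }
}
```

With 16-byte blocks, an int position can address files far larger
than an int byte offset could, at the cost of padding each item to a
block boundary.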

The length can be stored in the file item itself, but sometimes it may
be better to store the length in the index, for example length 0 could
mean an item is deleted.
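The "length 0 means deleted" idea could be sketched roughly as
follows. The in-memory map and the long item key are assumptions
chosen to keep the example short; a real index would presumably be
persistent and keyed by the (fileId, position) pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an index that stores each item's length,
// using length 0 as the "deleted" marker. Names are assumptions.
final class LengthIndex {
    private final Map<Long, Integer> lengths = new HashMap<>();

    // Records a live item; live items must have a positive length
    // so that 0 stays reserved for deletions.
    void put(long itemKey, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        lengths.put(itemKey, length);
    }

    // Marks the item as deleted without touching the data file.
    void markDeleted(long itemKey) {
        lengths.put(itemKey, 0);
    }

    // True only for items that are known to the index and have length 0.
    boolean isDeleted(long itemKey) {
        Integer length = lengths.get(itemKey);
        return length != null && length == 0;
    }
}
```

One consequence of this choice is that a genuinely empty item cannot
be told apart from a deleted one, so empty values would need some
other representation.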

Regards,
Thomas

