Hi.

Short version:

Can I "drop the last segment", if it only contains orphans?

Long version:

I've got kind of a "mental dead-lock" trying to decide how to store data 
for my new project. Currently, I'm considering whether I can use Lightning 
Memory-Mapped Database (LMDB) together with Cap'n Proto.

I've seen "mmap-friendly mutable storage format" in the RoadMap, but I'm 
not sure precisely what it means.

My concept would look roughly like this:

1) To be able to store multiple "tables" and "indices" and update them in a 
transactional way in LMDB, they must all be in the same LMDB file. So, I 
would first add a fixed-size prefix (e.g. 2 bytes) to my keys, as a 
table/index-ID.
2) Then comes the row/entity ID, say 8 bytes.
3) Then comes the *segment ID* (4 bytes, AFAIK, based on the Cap'n Proto 
documentation).
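
A minimal sketch of the key layout described above, written in Python only 
to show the byte layout. The names make_key, parse_key and entity_prefix 
are hypothetical, and the field widths (2 + 8 + 4 bytes) are the ones 
proposed in this message. Big-endian packing is an assumption: it makes the 
byte-wise order of the keys match their numeric order, so all segments of 
one entity sit next to each other under a byte-wise key comparison.

```python
import struct

# Hypothetical layout: 2-byte table/index ID, 8-byte entity ID,
# 4-byte segment ID. ">" means big-endian with no padding (14 bytes).
_KEY = struct.Struct(">HQI")
_PREFIX = struct.Struct(">HQ")


def make_key(table_id, entity_id, segment_id):
    """Build the 14-byte key for one segment of one entity."""
    return _KEY.pack(table_id, entity_id, segment_id)


def parse_key(key):
    """Split a 14-byte key back into (table_id, entity_id, segment_id)."""
    return _KEY.unpack(key)


def entity_prefix(table_id, entity_id):
    """10-byte prefix shared by every segment key of one entity."""
    return _PREFIX.pack(table_id, entity_id)
```

With this layout, a range scan starting at entity_prefix(table, entity) 
would visit segment 0, 1, 2, ... of that entity in order.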

That way, I'm hoping I could spread a multi-segment Cap'n Proto message 
over multiple keys in LMDB, and be able to both read and update my data 
while limiting the need to make memory copies. I haven't actually tried 
this, so it's all theoretical, but it "sounds possible" (correct me if I'm 
wrong).

Currently, the main problem I see is that you cannot append to lists. You 
have to replace the list with a bigger one, thereby creating an orphan and 
"wasting memory". Or possibly, if that fits your use-case, create a 
"linked list" of the lists, one after the other. But that way you lose 
O(1) access to list elements, as you first have to iterate over the 
"linked list" to get at the desired list.

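To make the access cost of the "linked list of lists" idea concrete, here 
is a small plain-Python model. It does not use Cap'n Proto at all; Chunk 
and ChainedList are hypothetical names, and each chunk stands in for one 
fixed-size list. The get method also returns the number of chunks it had 
to skip, which is the extra work compared to a single flat list.

```python
class Chunk:
    """One fixed-capacity list in the chain (hypothetical model)."""

    def __init__(self, capacity):
        self.items = []
        self.capacity = capacity
        self.next = None


class ChainedList:
    """Grows by linking a new chunk instead of replacing the old list."""

    def __init__(self, chunk_capacity=4):
        self.head = self.tail = Chunk(chunk_capacity)

    def append(self, value):
        if len(self.tail.items) == self.tail.capacity:
            # The tail is full: link a new chunk, nothing is orphaned.
            new_chunk = Chunk(self.tail.capacity)
            self.tail.next = new_chunk
            self.tail = new_chunk
        self.tail.items.append(value)

    def get(self, index):
        """Return (value, hops), where hops is the number of chunks skipped."""
        if index < 0:
            raise IndexError(index)
        chunk, hops = self.head, 0
        while chunk is not None:
            if index < len(chunk.items):
                return chunk.items[index], hops
            index -= len(chunk.items)
            chunk = chunk.next
            hops += 1
        raise IndexError("index out of range")
```

With equally sized chunks the number of hops grows linearly with the 
index. Doubling the capacity of each new chunk would reduce that to a 
logarithmic number of hops, but still not O(1).
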
One way for me to map the data of my "largest" entity, and the changes to 
it, would be to have a variable-size map (key-value pairs), containing 
"changes".

I would start by allocating a segment that would be exactly the right size 
for all the data, without the changes map.

Then I would add a "change map" with some small start size, for example 64 
entries. This would cause the creation of a new segment (and a second 
key-value pair in the DB). When the change map fills up, I would make a 
temporary copy, make the map an orphan, drop the last segment (which only 
contains an orphan), create a new map twice the size, causing the 
allocation of a new segment, and copy the temp data back into it.
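
The bookkeeping of this grow-by-replacing-the-last-segment scheme can be 
modelled in a few lines of plain Python. This is only a sketch under 
strong assumptions: the dict stands in for the LMDB key-value pairs of one 
entity, the Entity class and its methods are hypothetical, and nothing 
here touches the real Cap'n Proto API. Whether Cap'n Proto actually allows 
dropping a segment is exactly question A) below.

```python
class Entity:
    """Model: segment 0 holds the fixed data, segment 1 the change map."""

    def __init__(self, main_data, initial_capacity=64):
        # Stand-in for the per-entity key-value pairs, keyed by segment ID.
        self.segments = {0: main_data, 1: []}
        self.capacity = initial_capacity
        self.rewrites = 0  # how many times the last segment was replaced

    def record_change(self, key, value):
        changes = self.segments[1]
        if len(changes) == self.capacity:
            temp = list(changes)     # temporary copy of the full map
            del self.segments[1]     # drop the last segment (the orphan)
            self.capacity *= 2       # new map is twice the size
            self.segments[1] = temp  # new last segment, data copied back
            self.rewrites += 1
            changes = self.segments[1]
        changes.append((key, value))
```

Segment 0 is never rewritten in this model. Because the capacity doubles 
each time, the total number of entries copied over N insertions stays 
proportional to N.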

While the "change map" would need an occasional copy, the main part of the 
message, the first segment, would not, and by "replacing" the last segment, 
I would not be "wasting memory", while also not needing to make a full copy 
of the message to eliminate space wasted by orphans.

So,

A) Can I "drop the last segment", if it only contains orphans?
B) Is there any obvious reason I could not (effectively) use Cap'n Proto 
with a memory-mapped DB, assuming A) is possible, and my "entities" have 
at most one "variable-sized" part?

Regards,
Sebastien Diot
