On 2/23/2011 10:39 AM, Ryan Zezeski wrote:
Thanks! A couple more somewhat related questions: is that atomic
update nature hard to duplicate outside of luwak (say by a client
that needs to keep several items in sync), and if the luwak blocks
are immutable, how do you ever clean up the space used by data that
has been deleted or modified and is no longer referenced?
For your first question, I believe the onus to make multiple object
updates atomic is on you, the application developer. One of the easier
ways to achieve this, perhaps, would be to wrap all the data in one
object.
And luwak accomplishes this by putting the list of keys comprising the
whole stream in one object that is updated last, so a reader will get
one or the other?
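
A minimal sketch of that write ordering, as I understand it from the
description above. A plain dict stands in for the Riak bucket, and the
names put_file, get_file and the "block/" and "file/" key prefixes are
made up for illustration; this is not Luwak's actual API.

```python
import hashlib

store = {}  # hypothetical stand-in for a key/value bucket


def put_file(name, data, block_size=4):
    """Write the blocks first, then publish the key list in one final put."""
    keys = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        key = "block/" + hashlib.sha1(block).hexdigest()
        store[key] = block  # immutable: same content always maps to same key
        keys.append(key)
    # This single put is the only step that changes what a reader sees.
    store["file/" + name] = tuple(keys)


def get_file(name):
    """One read of the key list fixes which version the reader gets."""
    keys = store["file/" + name]
    return b"".join(store[k] for k in keys)
```

A reader that fetched the old key list before the final put still sees
the complete old contents, because no existing block is ever overwritten.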
Second, you don't; not at this time at least. Luwak allows you to
delete the file reference, but not the data itself. That follows
directly from it being an immutable, persistent data structure. If two
files share a block, then you can't simply
delete the blocks under a file, but instead must perform something more
like garbage collection.
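
To make "something more like garbage collection" concrete, here is a
rough mark-and-sweep sketch. It assumes a dict-like store where keys
starting with "file/" hold a tuple of block keys and keys starting with
"block/" hold data; that layout and the name sweep are assumptions for
illustration, not how Luwak stores things.

```python
def sweep(store):
    """Delete every block that no file lists; return how many were removed."""
    live = set()
    for key, value in store.items():
        if key.startswith("file/"):
            live.update(value)  # mark: every block a file still points at
    dead = [k for k in store if k.startswith("block/") and k not in live]
    for k in dead:
        del store[k]  # sweep: nothing references these any more
    return len(dead)
```

This is only safe if nothing writes while it runs; a file published
between the mark and the sweep could lose blocks it had just reused.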
If I understand what is going on correctly, you'd have to maintain a
reference count atomically with the keys since files with duplicate
sections would reuse some data blocks. Hmmm, that makes it sound like a
really good place to throw backups for a dedup effect...
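
A small sketch of the reference-count idea, with in-memory dicts
standing in for the store. Blocks are keyed by a hash of their content,
so identical sections of two files share one stored block. All names
here (blocks, refs, files, put_file, release, delete_file) are
hypothetical, and the hard part mentioned above, keeping the count
atomic with the key list in a distributed store, is not solved here.

```python
import hashlib
from collections import Counter

blocks = {}       # content hash -> data
refs = Counter()  # content hash -> number of references from files
files = {}        # file name -> list of content hashes


def release(keys):
    """Drop one reference per key and free blocks that reach zero."""
    for key in keys:
        refs[key] -= 1
        if refs[key] == 0:
            del refs[key]
            del blocks[key]


def put_file(name, data, block_size=4):
    keys = []
    for i in range(0, len(data), block_size):
        chunk = data[i:i + block_size]
        key = hashlib.sha1(chunk).hexdigest()
        blocks[key] = chunk  # duplicate content lands on the same key
        refs[key] += 1
        keys.append(key)
    old = files.get(name, [])
    files[name] = keys
    release(old)  # release the previous version only after the new one holds refs


def delete_file(name):
    release(files.pop(name, []))
```

Two backups that differ in one section store the shared sections once,
which is the dedup effect described above.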
If you're up for it, I have some proof of concept code on my fork of
Luwak. I got GC to work, to an extent. IIRC, once I got past 10-15GB
things started to degrade quickly. When I get more time I plan to
return to it.
What is the trick to knowing if a block is currently referenced or not?
Would it be possible to have some sort of bucket versioning and
periodically copy currently-referenced blocks forward, flip the bucket
reference and drop everything in the old bucket? I guess you'd still
have to deal with possible re-use during the copy.
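
Roughly what I have in mind, as a single-process sketch. The state dict,
its "files" and "active" entries, and the name collect are all invented
for illustration, and the re-use-during-copy problem is deliberately
left unhandled.

```python
def collect(state):
    """Copy referenced blocks forward, flip the bucket, drop the old one."""
    old = state["active"]
    new = {}
    for keys in state["files"].values():
        for key in keys:
            new.setdefault(key, old[key])  # copy each live block once
    dropped = len(old) - len(new)
    state["active"] = new  # flip the bucket reference
    old.clear()            # everything left behind was unreferenced
    return dropped
```

Knowing whether a block is referenced then reduces to whether any
current file lists it at copy time.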
--
Les Mikesell
_______________________________________________
riak-users mailing list
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com