You may want to have a look at memcachedb: http://memcachedb.org/

2009/8/11 Adam Lee <[email protected]>:
> We have a medium-sized dataset (~50M entries) with small values (a few
> hundred bytes) where we need "persistence" with a very high read throughput
> and occasional updates.
> To solve this, we built a cluster of memcached servers with enough RAM on
> each machine to store the entire dataset and wrote our own memcached client
> with the following characteristics:
> - each write operation writes to every machine in the cluster
> - each read operation reads from any one machine in the cluster
> - if a machine becomes non-responsive, it is marked as dirty and removed
>   from the cluster list
> Every night a "full populate" script is run and any new machines or machines
> that have been removed throughout the day are re-added to the cluster.
> With this setup, we achieve hundreds of thousands of reads per-second and
> achieve virtual "persistence."
> On Tue, Aug 11, 2009 at 4:56 PM, smolix <[email protected]> wrote:
>>
>> Hi Adam,
>>
>> Thanks for the tokyocabinet pointer. Unfortunately that would be too
>> slow (we need as high iops as we can get and no, ssd would not be an
>> answer unless it gets into FusionIO performance range). What was the
>> hack you did? We don't need persistent storage for many days. The
>> total computation will run in 1 maybe 2 days total.
>>
>> Take care,
>>
>> Alex
>>
>> On Aug 11, 12:37 pm, Adam Lee <[email protected]> wrote:
>> > We do a hack that enables something similar to this, but I wouldn't
>> > recommend it. If you want something memcached-like but persistent, you
>> > should look into, for example, tokyocabinet. It even speaks memcached
>> > protocol, so you can use it as a drop-in replacement and achieve the
>> > desired effect. It's not _as_ fast as memcached, but it's still very fast.
>> >
>> > On Tue, Aug 11, 2009 at 1:59 PM, smolix <[email protected]> wrote:
>> >
>> > > Hi,
>> >
>> > > Is there a way to use memcached as a _guaranteed_ distributed
>> > > (key,value) storage? That is, I want to have a distributed storage of
>> > > (key, value) pairs which can be accessed from many clients
>> > > efficiently. The RAM is sufficient that all should easily fit into
>> > > memory but I probably can't have an overhead of more than 2x the
>> > > amount of data it takes to store the pairs. Is there a way to turn off
>> > > the discard option in memcached? I can tune the keys such that they
>> > > are sequential or do similar preprocessing if needed.
>> >
>> > > This is about 100-500GB of data that I need to store with values less
>> > > than 4k per item (in some cases much smaller).
>> >
>> > > Any help and suggestions would be greatly appreciated.
>> >
>> > > Thanks,
>> >
>> > > Alex
>> >
>> > --
>> > awl
>
> --
> awl
>
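
For anyone skimming the thread: the write-to-all, read-from-any client Adam describes in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions, not his actual client. `DictBackend`, `ReplicatedClient`, and `repopulate` are hypothetical names, and the in-memory backend merely stands in for a real memcached connection (any object exposing `get(key)` and `set(key, value)` that raises `OSError` on failure would slot in).

```python
import random


class DictBackend:
    """Hypothetical in-memory stand-in for one memcached server."""

    def __init__(self):
        self.data = {}
        self.alive = True  # flip to False to simulate a dead machine

    def set(self, key, value):
        if not self.alive:
            raise ConnectionError("backend down")
        self.data[key] = value

    def get(self, key):
        if not self.alive:
            raise ConnectionError("backend down")
        return self.data.get(key)


class ReplicatedClient:
    """Write to every live backend, read from any one live backend."""

    def __init__(self, backends):
        self.live = list(backends)
        self.dirty = []  # backends dropped after a failure

    def _mark_dirty(self, backend):
        if backend in self.live:
            self.live.remove(backend)
            self.dirty.append(backend)

    def set(self, key, value):
        # Iterate over a copy, since failures shrink self.live.
        for backend in list(self.live):
            try:
                backend.set(key, value)
            except OSError:
                self._mark_dirty(backend)

    def get(self, key):
        # Try random live backends until one answers.
        while self.live:
            backend = random.choice(self.live)
            try:
                return backend.get(key)
            except OSError:
                self._mark_dirty(backend)
        raise RuntimeError("no live backends left")

    def repopulate(self, items):
        """Assumed shape of the nightly 'full populate': reload, then re-add."""
        items = list(items)
        still_dirty = []
        for backend in self.dirty:
            try:
                for key, value in items:
                    backend.set(key, value)
                self.live.append(backend)
            except OSError:
                still_dirty.append(backend)
        self.dirty = still_dirty
```

Note the trade-off in this sketch: every machine holds the full dataset, so read throughput scales with the number of machines, but memory overhead does too, which may not fit the 2x limit Alex mentions.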
