Hey Ben,

I'm not aware of any off-the-shelf tool that does exactly what you're looking for. Riak's multi-dc replication is not synchronous (naive synchronous replication _reduces_ availability). To get synchronous multi-dc replication that doesn't decrease availability, you'd need some kind of multi-dc coordination across at least 3 data centers. Neither Riak nor Riak CS supports this at the moment.

It is also worth noting that Riak's replication is not 'done passively'. Real-time replication is done on every update, and a daemon process called full-sync then runs to synchronize any realtime updates that were lost.
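
To make the availability point concrete, here is a rough Python sketch. It is illustrative only: `write_succeeds` and the data center names are made up for this example, and none of it is a Riak or Riak CS API. It assumes a write counts as successful only when at least `required_acks` data centers acknowledge it.

```python
from typing import Dict


def write_succeeds(dc_up: Dict[str, bool], required_acks: int) -> bool:
    """Hypothetical rule: a write succeeds only if at least
    `required_acks` data centers are reachable and acknowledge it."""
    acks = sum(1 for up in dc_up.values() if up)
    return acks >= required_acks


# Two DCs, both must ack (naive synchronous replication):
# losing either DC blocks every write.
two_dcs = {"dc1": True, "dc2": False}
print(write_succeeds(two_dcs, required_acks=2))

# Three DCs, any two must ack: a single DC outage is tolerated.
three_dcs = {"dc1": True, "dc2": False, "dc3": True}
print(write_succeeds(three_dcs, required_acks=2))
```

With two data centers and both required, one outage makes writes unavailable; with three data centers and any two required, writes continue through a single outage. That is why the coordination needs at least 3 data centers.
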
More comments inline:

On Dec 10, 2012, at 11:20 AM, Ben Rowland <[email protected]> wrote:

> Hi,
>
> We're building a distributed storage API, which must be highly available
> across several globally distant data-centres.
>
> A specific requirement is that a user of the API must see a write as
> successful iff the data has been successfully written to a node in at least 2
> data-centres. This is to avoid the case where one data-centre becomes
> unavailable a short time after being written to, e.g. due to a net-split or
> power outage.
>
> I understand there is multi data-centre support in Riak CS, but I believe
> Real-Time Sync is done passively (on next read) and this leaves a loop-hole
> where the cluster written to could immediately go off-line.

nope, updates are replicated to the other data centers right away

>
> In theory, we could meet our requirement by having 2 physical nodes in each
> cluster (DC), and using a N-value of 3. This would force writes to go to
> nodes in at least 2 clusters. Are there any particular reasons against this?
> I realise there will be a problem with latency but couldn't we just increase
> connection time-outs? I also suspect SSL wouldn't be supported when writing
> within a cluster which would be an issue.

Running a _single_ Riak cluster across multiple data centers is possible, but not something we ever test, optimize or recommend. If you _really_ need a _guarantee_ that data has been written to at least two DCs,

>
> Is Riak CS the right tool to meet these requirements?

Riak CS has the same multi-dc replication semantics as Riak, with the addition of something called 'proxy-get', which allows objects to be fetched from a remote cluster if they haven't been written locally yet.

>
> If it helps, our use-case is perhaps simpler than many in that all objects
> are write-once. So we don't need to worry about things like split-brain,
> just adequate replication.

Immutability will definitely make your life easier, though it's not a silver bullet for solving the 'synchronous replication w/out lowering availability' problem.

>
> Many thanks,
>
> Ben
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
