Hey Ben,

I'm not aware of any off-the-shelf tool that does exactly what you're looking
for. Riak's multi-dc replication is not synchronous (naive synchronous
replication _reduces_ availability, because a write then fails whenever the
remote data center is unreachable). In order to have synchronous multi-dc
replication that doesn't decrease availability, you'd need to do some kind of
multi-dc coordination with at least 3 data centers. Neither Riak nor Riak CS
supports this at the moment. It is also worth noting that Riak's replication
is not 'done passively'. Real-time replication is done on every update, and
then there is a daemon process called full-sync that runs to synchronize any
lost realtime updates.
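
To make the '3 data centers' point concrete: if a write must be acknowledged
by any 2 of 3 DCs, it still succeeds with one DC down, whereas with only 2 DCs
and both required, losing either one blocks all writes. Here is a rough sketch
of that idea in Python. It is not something Riak or Riak CS provides; the
per-DC writer callables and the quorum_write name are hypothetical stand-ins
for whatever client call stores an object in one data center, and it leaves
out everything hard (retries, repairing the DC that missed the write, read
consistency).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from typing import Callable, Dict

# Hypothetical: a Writer stores (key, value) in one data center and raises
# on failure. This is an illustration only, not a Riak client API.
Writer = Callable[[str, bytes], None]


def quorum_write(writers: Dict[str, Writer], key: str, value: bytes,
                 required: int = 2, timeout: float = 5.0) -> bool:
    """Return True as soon as `required` data centers acknowledge the write."""
    acks = 0
    pool = ThreadPoolExecutor(max_workers=len(writers))
    try:
        # Send the write to every data center in parallel.
        futures = [pool.submit(w, key, value) for w in writers.values()]
        for fut in as_completed(futures, timeout=timeout):
            if fut.exception() is None:
                acks += 1
                if acks >= required:
                    return True
    except FuturesTimeout:
        # Not enough data centers answered in time; treat as a failed write.
        pass
    finally:
        # Do not block the caller on slow or unreachable data centers.
        pool.shutdown(wait=False)
    return False
```

With three writers and required=2, the call returns True when any single DC is
unreachable and False when two are, which is the availability property the
2-DC synchronous setup cannot give you.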

More comments inline:

On Dec 10, 2012, at 11:20 AM, Ben Rowland <[email protected]> wrote:

> Hi,
> 
> We're building a distributed storage API, which must be highly available 
> across several globally distant data-centres.  
> 
> A specific requirement is that a user of the API must see a write as 
> successful iff the data has been successfully written to a node in at least 2 
> data-centres.  This is to avoid the case where one data-centre becomes 
> unavailable a short time after being written to, e.g. due to a net-split or 
> power outage.  
> 
> I understand there is multi data-centre support in Riak CS, but I believe 
> Real-Time Sync is done passively (on next read) and this leaves a loop-hole 
> where the cluster written to could immediately go off-line.

Nope, real-time replication sends updates to the other data centers right
away, not on the next read.

> 
> In theory, we could meet our requirement by having 2 physical nodes in each 
> cluster (DC), and using a N-value of 3.  This would force writes to go to 
> nodes in at least 2 clusters.  Are there any particular reasons against this? 
>  I realise there will be a problem with latency but couldn't we just increase 
> connection time-outs?  I also suspect SSL wouldn't be supported when writing 
> within a cluster which would be an issue.

Running a _single_ Riak cluster across multiple data centers is possible, but
it is not something we ever test, optimize for, or recommend. If you _really_
need a _guarantee_ that data has been written to at least two DCs, 

> 
> Is Riak CS the right tool to meet these requirements?

Riak CS has the same multi-dc replication semantics as Riak, with the addition
of something called 'proxy-get', which allows objects to be fetched from a
remote cluster if they haven't been written locally yet.

> 
> If it helps, our use-case is perhaps simpler than many in that all objects 
> are write-once.  So we don't need to worry about things like split-brain, 
> just adequate replication.

Immutability will definitely make your life easier, though it's not a silver 
bullet for solving the 'synchronous replication w/out lowering availability' 
problem.

> 
> Many thanks,
> 
> Ben
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
