Simon,

We use a similar approach, based on 2i and batch numbers: every bucket/key we are interested in is stamped with a 2i batch number, which increases once a minute. A Java client copies to two different clusters, also once a minute, from "last batch number copied" to "current batch number - 1", so it is always a minute behind. As I said, it uses 2i (current batch number is read only for target clusters and read-write on the target clusters), and it also copies keys concurrently.

Though we have Riak EE, we felt the same about having a fine-grained cluster copy tool.

With that tool in place, we need neither post-commit hooks nor synchronized operations on the "master cluster". (We don't actually call it a master cluster; we merely use the tool to keep backups.) There is a tracking key that holds the batch number value and a list of buckets to operate on.
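
To make the batch window concrete, here is a rough sketch of one copy pass. It is not our actual tool: the class and method names are made up for illustration, plain in-memory maps stand in for the 2i index and for the source and target clusters, and it assumes the "last batch number copied" has already been copied, so the pass starts at the next batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a batch-window copy pass; no Riak client calls are used.
class BatchCopySketch {

    // Batches to copy in one pass: lastCopied + 1 up to currentBatch - 1,
    // so the batch still being written is never touched.
    static List<Long> batchesToCopy(long lastCopied, long currentBatch) {
        List<Long> batches = new ArrayList<>();
        for (long b = lastCopied + 1; b <= currentBatch - 1; b++) {
            batches.add(b);
        }
        return batches;
    }

    // batchIndex stands in for a 2i lookup (batch number -> keys stamped with it).
    // Returns the new "last batch number copied" to store in the tracking key.
    static long copyPass(Map<Long, List<String>> batchIndex,
                         Map<String, String> source,
                         Map<String, String> target,
                         long lastCopied,
                         long currentBatch) {
        for (long batch : batchesToCopy(lastCopied, currentBatch)) {
            List<String> keys = batchIndex.getOrDefault(batch, Collections.emptyList());
            for (String key : keys) {
                target.put(key, source.get(key)); // a real tool would copy keys concurrently
            }
            lastCopied = batch; // advance only after the whole batch was copied
        }
        return lastCopied;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> index = new HashMap<>();
        index.put(1L, Arrays.asList("a"));
        index.put(2L, Arrays.asList("b"));
        index.put(3L, Arrays.asList("c")); // current batch, still open

        Map<String, String> source = new HashMap<>();
        source.put("a", "1");
        source.put("b", "2");
        source.put("c", "3");
        Map<String, String> target = new HashMap<>();

        long last = copyPass(index, source, target, 0L, 3L);
        System.out.println("last batch copied: " + last);
        System.out.println("copied keys: " + target.size());
    }
}
```

Advancing the tracked batch number only after a batch has been fully copied means a pass that fails halfway simply repeats that batch on the next run.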

Hope that helps,

Guido.

On 09/04/13 11:36, Simon Majou wrote:
Ryan,

Yes, I know about Riak Enterprise, but I am thinking about a more "fine-grained" multi-cluster setup, by setting an X-Meta-primary-cluster property on each value.

In that scenario, the object would always be written to the primary cluster, and read from both clusters. In case of failure, all writes would go to the active cluster, with primary-cluster updated to the new primary.

From your explanation of the post-commit hook, I understand that I will need a message queue between the clusters, in order to free the hook as fast as possible and to not lose any updates if one cluster crashes.
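
As a rough illustration of what I have in mind (only a sketch of the pattern: real Riak commit hooks are written in Erlang, the class and method names here are hypothetical, and an in-memory queue like this one would not survive a crash, so a durable queue would be needed in practice):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical sketch: the hook only enqueues, a separate step forwards to the other cluster.
class HookQueueSketch {
    private final Deque<String> pending = new ArrayDeque<>();

    // Called from the hook: just record the key and return, so the client is not held up.
    void onCommit(String key) {
        pending.addLast(key);
    }

    // Forward pending keys in order. sendToRemote returns false when the other
    // cluster is unreachable; the key then stays at the head of the queue for a
    // later retry, so nothing is dropped or reordered.
    int drain(Predicate<String> sendToRemote) {
        int sent = 0;
        while (!pending.isEmpty()) {
            String key = pending.peekFirst();
            if (!sendToRemote.test(key)) {
                break;
            }
            pending.removeFirst();
            sent++;
        }
        return sent;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Stopping at the first failed send keeps the updates in their original order, at the cost of one slow key delaying everything behind it.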


Simon


On Tue, Apr 9, 2013 at 3:55 AM, Ryan Zezeski wrote:

    Simon,


    On Mon, Apr 8, 2013 at 7:14 PM, Simon Majou wrote:

        Hello,

        I want to sync a bucket of a first cluster with the bucket of
        a second cluster. To do that, I am thinking of using a
        post-commit hook.


    If you didn't know, this is exactly what Riak Enterprise was built
    to do, i.e. handle multi-cluster replication.  However, if you
    want to give it a go on your own, a post-commit hook is one way to
    get the job done.  You'll want to think through failure scenarios
    where the receiving cluster is down, and how to deal with messages
    that are dropped between clusters.  The post-commit hook runs on a
    process called the "coordinator"; there is a coordinator for every
    incoming request.  So you won't block the vnodes, which is
    important, but the client/user request will block until your
    post-commit returns.


        Is there any risk of the sequence of PUTs getting mixed up in
        such a scenario?


    Do you mean the sequence seen on cluster A vs. cluster B?  Are you
    asking if the object could appear to be on B before A even though
    the PUT was sent to A?  The answer is, it depends.  With a healthy
    system it's probably unlikely but it will depend on your DW values
    and state of each cluster.  E.g. if cluster A nodes get slow disk
    I/O then perhaps the replication to cluster B could beat writes on
    A.  If we start introducing node and network failures, or changing
    W/DW values then things can get more complicated.  You could have
    success on cluster A, fire replica to cluster B, all primary nodes
    for that object on cluster A die, now cluster B will have a key
    for which cluster A says not_found (well, not totally true,
    depends on your PR value).

    -Z





_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
