Re: Post commit hooks are a single process, so they are executed in the same order as the commits ?

Guido Medina Tue, 09 Apr 2013 04:02:21 -0700

Simon,

We use a similar approach based on 2i and batch numbers: every bucket/key we are interested in is stamped with a 2i batch number, which increases once a minute. A Java client copies to two different clusters, also once a minute, from "last batch number copied" to "current batch number - 1", so it is always a minute behind. As I said, it uses 2i (the current batch number is read-only for the target clusters and read-write on the source cluster), and it also copies keys concurrently.

Though we have Riak EE, we felt the same about having a fine-grained cluster copy tool.

With that tool in place we need neither post-commit hooks nor synced operations on the "master cluster". We don't actually call it a master cluster; we merely use the tool to have backups. There is a tracking key that holds the batch number value and a list of buckets to operate on.
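
A minimal sketch of the batch-copy loop described above, in Python rather than Java and against in-memory stand-ins instead of a real Riak client. The `Cluster` class, `copy_batches`, and the layout of the `tracking` dict are all hypothetical names chosen for illustration; the actual tool copies keys concurrently, while this sketch copies them sequentially to keep the window logic visible.

```python
from collections import defaultdict


class Cluster:
    """In-memory stand-in for a cluster (hypothetical, not a Riak client)."""

    def __init__(self):
        self.objects = {}                    # (bucket, key) -> (value, batch)
        self.batch_index = defaultdict(set)  # batch number -> {(bucket, key)}

    def put(self, bucket, key, value, batch):
        # Re-stamping a key moves it to the new batch, like replacing a 2i entry.
        old = self.objects.get((bucket, key))
        if old is not None:
            self.batch_index[old[1]].discard((bucket, key))
        self.objects[(bucket, key)] = (value, batch)
        self.batch_index[batch].add((bucket, key))


def copy_batches(source, targets, tracking, current_batch):
    """Copy batches last_copied+1 .. current_batch-1 from source to every target.

    `tracking` plays the role of the tracking key: it holds the last batch
    number copied and the list of buckets to operate on.
    """
    first = tracking["last_copied"] + 1
    last = current_batch - 1          # stay one batch (one minute) behind
    if last < first:
        return 0                      # nothing new to copy yet
    copied = 0
    for batch in range(first, last + 1):
        for bucket, key in sorted(source.batch_index.get(batch, ())):
            if bucket not in tracking["buckets"]:
                continue
            value, _ = source.objects[(bucket, key)]
            for target in targets:
                target.put(bucket, key, value, batch)
            copied += 1
    tracking["last_copied"] = last    # advance only after the window is copied
    return copied
```

Because the copier never touches the current batch, a key that is still being written during this minute is picked up on the next run instead of being copied half-way through its batch.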


Hope that helps,

Guido.

On 09/04/13 11:36, Simon Majou wrote:

Ryan,

Yes, I know about Riak Enterprise, but I am thinking about a more "fine-grained" multi-cluster setup, by setting an X-Meta-primary-cluster property on each value.

In that scenario, the object would always be written on the primary cluster and read on both clusters. In case of failure, all writes would be done on the active cluster, with primary-cluster updated to the new primary.
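
The routing rule in the previous paragraph could be sketched roughly as follows. This is only an illustration under assumptions: clusters are modelled as plain dicts, the `primary` field stands in for the X-Meta-primary-cluster property, and `route_write` is a hypothetical helper, not part of any Riak client.

```python
def route_write(clusters, reachable, key, value, default_primary):
    """Write `value` to the object's primary cluster, failing over if needed.

    clusters:  name -> dict mapping key -> {"value": ..., "primary": ...}
    reachable: ordered list of cluster names that are currently up
    """
    if not reachable:
        raise RuntimeError("no cluster reachable")
    # Look up the recorded primary from any reachable copy of the object.
    primary = default_primary
    for name in reachable:
        existing = clusters[name].get(key)
        if existing is not None:
            primary = existing["primary"]
            break
    # Fail over: if the primary is down, the active cluster becomes primary.
    if primary not in reachable:
        primary = reachable[0]
    clusters[primary][key] = {"value": value, "primary": primary}
    return primary
```

The sketch leaves out the hard part: once the old primary comes back, its copy still names itself as primary, so the two clusters have to be reconciled before writes are routed by this rule again.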

From your explanation of the post-commit hook, I understand that I will need a message queue between the clusters, in order to free the hook as fast as possible and not lose any updates if one cluster crashes.
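
A rough sketch of that idea, assuming an in-process queue and a single worker thread: the hook only enqueues and returns, and the worker forwards updates to the other cluster in the order they were enqueued. `make_replicator` and `apply_to_remote` are hypothetical names, and an in-memory queue like this one does not survive a crash, so a real setup would need a durable queue to avoid losing updates.

```python
import queue
import threading


def make_replicator(apply_to_remote, maxsize=1000):
    """Return (hook, stop, dropped) for asynchronous replication.

    apply_to_remote(bucket, key, value) is whatever writes to the other
    cluster; it is supplied by the caller and is an assumption of this sketch.
    """
    pending = queue.Queue(maxsize=maxsize)
    dropped = []  # updates rejected because the queue was full

    def hook(bucket, key, value):
        # Called from the commit path: never block the client request.
        try:
            pending.put_nowait((bucket, key, value))
        except queue.Full:
            dropped.append((bucket, key))

    def worker():
        # A single consumer preserves the order in which updates were enqueued.
        while True:
            item = pending.get()
            if item is None:  # sentinel: shut down
                break
            apply_to_remote(*item)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    def stop():
        pending.put(None)
        thread.join()

    return hook, stop, dropped
```

With one worker, the remote cluster sees updates in enqueue order, but that order is only as meaningful as the order in which the hooks themselves ran, since there is one coordinator per request.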



Simon

On Tue, Apr 9, 2013 at 3:55 AM, Ryan Zezeski <[email protected]> wrote:


    Simon,


    On Mon, Apr 8, 2013 at 7:14 PM, Simon Majou <[email protected]> wrote:

        Hello,

        I want to sync a bucket on a first cluster with a bucket on a
        second cluster. To do that, I am thinking of using a post-commit
        hook.


    If you didn't know, this is exactly what Riak Enterprise was built
    to do, i.e. handle multi-cluster replication.  However, if you
    want to give it a go on your own, a post-commit hook is one way to
    get the job done.  You'll want to think through failure scenarios
    where the receiving cluster is down, and how to deal with messages
    that are dropped between clusters.  The post-commit hook runs on a
    process called the "coordinator"; there is a coordinator for every
    incoming request.  So you won't block the vnodes, which is
    important, but the client/user request will block until your
    post-commit hook returns.


        Is there any risk of the sequence of PUTs getting mixed up in
        such a scenario?


    Do you mean the sequence seen on cluster A vs. cluster B?  Are you
    asking if the object could appear to be on B before A even though
    the PUT was sent to A?  The answer is, it depends.  With a healthy
    system it's probably unlikely but it will depend on your DW values
    and state of each cluster.  E.g. if cluster A nodes get slow disk
    I/O then perhaps the replication to cluster B could beat writes on
    A.  If we start introducing node and network failures, or changing
    W/DW values then things can get more complicated.  You could have
    success on cluster A, fire replica to cluster B, all primary nodes
    for that object on cluster A die, now cluster B will have a key
    for which cluster A says not_found (well, not totally true,
    depends on your PR value).

    -Z





_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
