Alexander,

Your question about n_val on a one-node server is a valid one, as is the follow-up question of how to migrate to a larger n_val when you grow your cluster.

As an aside: as John mentioned, Riak is designed from the ground up to run on multi-node clusters, so keep that in mind, in terms of expected performance, when you choose to run on just one node (when you're a startup on a budget, just testing out an application, etc.).

Anyway, you have two options:

1) Start with the eventual n_val in mind (n=3, or n=2 as in your case) and live with slower performance on a single node. The upside is that no migration is required when you add new nodes, as the extra replicas will be moved to the appropriate new machines.

2) Start with n_val=1 on a single node. The benefit is faster performance (fewer replicas to deal with, and extra replicas don't help when you're on one node). The drawback is that you need some sort of migration strategy when you expand your Riak deployment to more nodes.

It doesn't have to be complicated, but you will have to deal with the fact that, if you have a set of data written with n=1 and then increase n to 2 in the app config when you add a new node, half of the vnodes are going to be missing data for any given request. This is not disastrous, but you do need to either rely on read-repair to create the missing replicas or create them yourself. Here are the options as I see them:

a) If you can afford downtime when expanding the cluster, you could back up the contents of your one-node cluster, then add the new node and raise n_val to 2, and then restore to the new cluster. The writes from the restore will create the new number of replicas (n_val=2). You can use logical backup tools like 'riak-admin backup' or Riak Data Migrator. (Backing up and restoring the data directory won't work when adding a new node.)

b) You can add retries to your application logic. If you get a Not Found (and you know that the value is supposed to be there), you can retry the GET. (After the first GET returns a 404, read-repair takes place to fill in the missing values, so the second read should be fine.)

c) If you're on 1.3+ and have Active Anti-Entropy enabled, you can increase n_val to 2 and wait for AAE to fill in the missing replicas. (This should probably still be paired with option b, as you'll need to retry some reads while it's working.)

The few times that I've built single-node apps (during hackathons, etc.), I have gone with option 2 (a) and backed up and restored the data. But your use case requirements may differ.

Dmitri

On Mon, Jun 10, 2013 at 12:01 PM, Alexander Ilyin <[email protected]> wrote:
> Hi all,
>
> I have a question regarding setting the n_val.
> In the documentation (
> http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/)
> it is stated that:
>
> n_val must be greater than 0 and less than or equal to the number of
> actual nodes in your cluster to get all the benefits of replication.
> And, we advise against modifying the n_val of a bucket after its initial
> creation as this may result in failed reads because the new
> value may not be replicated to all the appropriate partitions.
>
> But this seems contradictory to me. Which value I have to set if I'm
> setting up one node right now but planning to add second one later?
> Setting n_val=2 means this value will be greater than actual number of
> nodes. But setting n_val=1 is also not advisable since
> I will have to change it later to n_val=2 (I'm planning to have two
> replicas in the end).
>
> I'm also concerned about the performance in case of one node and n_val=2.
> Will it degrade since both replicas are stored on the same server?
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
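
P.S. To make the retry idea in option (b) concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `fetch` is a hypothetical callable standing in for whatever GET your Riak client library exposes (assumed here to return None on a Not Found), and the names `get_with_retry`, `attempts`, and `delay` are made up for this example rather than taken from any Riak API.

```python
import time


def get_with_retry(fetch, key, attempts=3, delay=0.1):
    """Call fetch(key) up to `attempts` times, pausing between tries.

    `fetch` is a hypothetical stand-in for your client's GET; it is assumed
    to return None when the key is Not Found. The first miss is expected to
    trigger read-repair, so a later attempt may succeed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        value = fetch(key)
        if value is not None:
            return value
        if attempt < attempts - 1:
            time.sleep(delay)  # give read-repair a moment before retrying
    return None  # still missing after all attempts: treat as a real Not Found


# Usage with a fake backend that misses once and then finds the value,
# imitating a read that succeeds after read-repair.
calls = {"count": 0}


def fake_fetch(key):
    calls["count"] += 1
    return None if calls["count"] == 1 else "value-for-" + key


result = get_with_retry(fake_fetch, "user42", attempts=3, delay=0)
```

It probably only makes sense to wrap reads this way for keys you know should exist; otherwise every genuine miss costs you the extra round trips.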
