Alexander,

Your question about n_val on a one-node server is a very valid one, as is
the follow-up question of how to migrate to a larger n_val when you grow
your cluster.

As an aside -- as John mentioned, Riak is designed from the ground up to be
run on multi-node clusters, so keep that in mind, in terms of expected
performance, when choosing to run on just one node (when you're a startup
on a budget, just testing out an application, etc).

Anyways, you have two options:

1) Start with the eventual n_val in mind (n=3, or n=2 as in your case) and
live with slow performance on a single node. (The upside is - no migration
required when adding new nodes, as the extra replicas will be moved to the
appropriate new machines).

2) Start with n_val=1 on a single node. The benefit of this is - faster
performance (fewer replicas to deal with that don't help when you're on one
node). The drawback is - you need to have some sort of migration strategy
when expanding your Riak deployment to more nodes.
It doesn't have to be complicated, but you will have to deal with the fact
that, if you have a set of data with n=1, and then increase n to 2 in the
app config when you add a new node, half of the vnodes are going to be
missing data for any given request. This is not disastrous, but you do need
to either rely on read-repair to create the missing replicas, or you need
to do it yourself.
Here are the options as I see them:

a) If you can afford downtime when adding a new node, you could back up
the contents of your one-node cluster, then add the new node & up n_val to
2, and then restore to the new cluster. The writes from the restore are
going to create the new number of replicas (n_val=2). You can use logical
backup tools like 'riak-admin backup' or Riak Data Migrator. (Backing up
and restoring the data directory won't work when adding a new node.)

b) You can add retries to your application logic. If you get a Not Found
(and you know that the value is supposed to be there), you can re-try the
GET. (After the first get and a 404, read-repair takes place to fill in
missing values, so the second read should be fine).

c) If you're on 1.3+ and have Active Anti-Entropy enabled, you can increase
n_val to 2, and wait for AAE to fill in the missing replicas (this should
probably still be paired with option b, as you'll need to retry some reads
while it's working).
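
To make option b a bit more concrete, here is a minimal Python sketch of
the retry logic. It is not tied to any particular Riak client: fetch,
NotFound, get_with_retry and FakeCluster are all hypothetical names made up
for illustration, and FakeCluster only imitates the "first read misses,
read-repair fills in, second read succeeds" behaviour described above.

```python
import time


class NotFound(Exception):
    """Hypothetical error raised by fetch() when a key comes back as a 404."""


def get_with_retry(fetch, key, retries=2, delay=0.1):
    """Call fetch(key); on NotFound, wait briefly and try again.

    Re-raises NotFound once the retries are used up, so a key that is
    really missing still surfaces as an error.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(key)
        except NotFound:
            if attempt == retries:
                raise
            time.sleep(delay)  # give read-repair a moment to finish


class FakeCluster:
    """Toy stand-in for a cluster whose n_val was just raised (assumption)."""

    def __init__(self, data):
        self.data = data
        self.repaired = set()

    def fetch(self, key):
        if key not in self.data:
            raise NotFound(key)      # the key truly does not exist
        if key not in self.repaired:
            self.repaired.add(key)   # imitates read-repair kicking in
            raise NotFound(key)      # first read still misses
        return self.data[key]


cluster = FakeCluster({"user:1": "alice"})
value = get_with_retry(cluster.fetch, "user:1", delay=0)
```

In a real app, fetch would wrap whatever GET call your client library
provides. Only retry keys you know are supposed to exist, otherwise every
legitimate Not Found pays for the extra round trips.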

The few times that I've built single-node apps (during hackathons, etc), I
usually go with option 2 (a), and backup/restore the data. But your use
case requirements may differ.

Dmitri



On Mon, Jun 10, 2013 at 12:01 PM, Alexander Ilyin <[email protected]> wrote:

> Hi all,
>
> I have a question regarding setting the n_val.
> In the documentation (
> http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/)
> it is stated that:
>
> n_val must be greater than 0 and less than or equal to the number of
> actual nodes in your cluster to get all the benefits of replication.
> And, we advise against modifying the n_val of a bucket after its initial
> creation as this may result in failed reads because the new
> value may not be replicated to all the appropriate partitions.
>
> But this seems contradictory to me. Which value I have to set if I'm
> setting up one node right now but planning to add second one later?
> Setting n_val=2 means this value will be greater than actual number of
> nodes. But setting n_val=1 is also not advisable since
> I will have to change it later to n_val=2 (I'm planning to have two
> replicas in the end).
>
> I'm also concerned about the performance in case of one node and n_val=2.
> Will it degrade since both replicas are stored on the same server?
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>