Hi, Alan.

Your replicas do in fact exist on both nodes.  However, I understand
that the situation you are observing is confusing.  I will attempt to
explain.

Quite some time ago, some of our users noticed something surprising
during their pre-production testing.  Some intentional failure
scenarios (busted nodes, etc.) would fail much more slowly with R=1
than with R=2.  This was because, to satisfy an R=1 request with a
non-object response (timeout or notfound), we would wait for all N
nodes to reply.  With R=2, we could send that response as soon as N-1
nodes had replied.  In some situations that is a dramatic difference
in time.

To remove this perceived problem we implemented what we refer to as
"basic quorum".  If a simple majority of vnodes have produced
non-successful internal replies, we return a non-success value such as
a notfound.  This means that if there is only one copy of the object
out there, and the node holding it is slowest to respond, the client
will not see that object in the response; they will get a notfound
rather than waiting for the last node to respond or time out.

(Note that read-repair will still occur in any case.)
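The basic quorum decision described above can be sketched roughly as
follows.  This is a hypothetical Python illustration, not Riak's
actual Erlang implementation; the function name and the way replies
are represented are my own invention:

```python
# Hypothetical sketch of the "basic quorum" decision: return a result
# as soon as either R successful replies arrive, or a simple majority
# of the N vnode replies are non-successful (notfound/timeout).

def basic_quorum_result(replies, n=3, r=1):
    """replies: vnode responses in arrival order, each either
    "object" (success) or a non-success such as "notfound"."""
    successes = 0
    failures = 0
    for reply in replies:
        if reply == "object":
            successes += 1
            if successes >= r:        # R successes: return the object
                return "object"
        else:
            failures += 1
            if failures > n // 2:     # simple majority of non-success
                return "notfound"     # ...don't wait for the stragglers
    return "notfound"

# One replica exists but its node is slowest to reply: two fast
# notfounds form a majority, so the client sees notfound.
print(basic_quorum_result(["notfound", "notfound", "object"]))  # notfound
# If the object-holding node replies first, R=1 is satisfied at once.
print(basic_quorum_result(["object", "notfound", "notfound"]))  # object
```

This is why, with R=1 and a single surviving replica, the client can
get a notfound even though the object exists somewhere in the cluster.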

This could be avoided if we considered notfound to be a success
condition, but then in the situation above you would see notfounds
even with R=2, since notfound would simply be defined as another kind
of "successful" response.  Either way, it is a tradeoff between
different kinds of surprise.

I hope that this explanation helps with your understanding.

On another note, it's not useful to run Riak on fewer physical hosts
than your N value unless you're planning on expanding soon.  So:
testing with 2 hosts and N=3 means that you are testing a very much
not-recommended configuration.  I suggest either using more hosts or
else changing your default bucket N value to 2.

-Justin


On Mon, Jun 14, 2010 at 1:59 PM, Alan McConnell <[email protected]> wrote:
> Hey Dan,
> I have a 2-node cluster with default bucket settings (N=3, etc.), and if I
> take one of the boxes down (and perform reads with R=1) I get tons of "key
> not found" errors for keys I know exist in the cluster.  Seems like for many
> keys, all 3 replicas live on one host.  From what you've written here
> though, it seems like that should not happen.  Do you know of any way my
> cluster could have gotten into this state?
> I did run a restore on this cluster using a riak-admin backup from a
> different, single-node cluster.  I wonder if that caused an uneven
> distribution.
> Any help would be appreciated.  As it stands now our 2-node cluster has
> serious read problems if either node goes down.
> -Alan

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com