I just saw https://issues.basho.com/show_bug.cgi?id=275 , which would actually be just what I need.

On Wednesday, 04.08.2010, at 13:58 +0200, Nico Meyer wrote:
> Hi Justin,
>
> I think we are coming from two different directions here, leading to
> some confusion. You seem to treat a get for a nonexistent key as an
> error, in which case all your points are valid of course. I suspected
> that this is the reason for the current design choice, but I didn't see
> it stated anywhere explicitly. Also, notfound seems to be handled
> differently from other types of errors, at least in the way it is
> signalled to the client, so I didn't immediately think of it as an error
> case.
> On the other hand there are many applications where asking for a key
> that has never been put is perfectly valid, and not_found is indeed the
> right answer in that case. Our application is an example of that. The
> key is given (it is a unique cookie ID), and we need to check if we saw
> a specific ID before in a certain context and if so get some data that
> was associated with the ID back then. More often than not this is not
> the case, so notfound is the expected answer.
>
> If you read my original mail again with that use case in mind it might
> become clearer what my problem with the current design is.
>
> Having to fulfil the precondition that we only do gets for keys we know
> to have been put before would require another datastore for that
> purpose, which seems kind of awkward and unnecessary, since riak has all
> the required data to handle our use case.
>
> Please let me know if I need to further clarify my thoughts about this.
> English is not my first language and it's hard enough to reason about
> these things in German and face-to-face :-).
>
> Cheers,
> Nico
>
> On Monday, 02.08.2010, at 22:29 -0400, Justin Sheehy wrote:
> > Hi, Nico.
> >
> > On Mon, Aug 2, 2010 at 1:19 PM, Nico Meyer <[email protected]> wrote:
> >
> > > What I mean is, if I do a get request for a key with R=N, and one of the
> > > first N nodes in the preflist is down the request will still succeed.
> > > Why is that? Doesn't that undermine the purpose of setting R to a high
> > > number (specifically setting it to N)? That way a request might succeed
> > > even if all primary nodes responsible for the key are unavailable.
> >
> > You are correct, and this is intentional. There is nothing in the R
> > or W settings that is intended to indicate anything at all about
> > "primary" nodes. It is rather simply the number of successful
> > responses that the client wishes to wait for, and thus the degree of
> > quorum sought before a client reply is sent. Using fallback nodes to
> > satisfy reads is a natural result of using fallback nodes to satisfy
> > writes.
> >
> > If all primary nodes responsible for a key are unavailable, but enough
> > of the fallback nodes for that key have received a value for that key
> > since they went unavailable (through a fallback write) then a request
> > to get that key might succeed. I am not sure why you see this as a
> > bad thing.
> >
> > (It will only succeed if R nodes actually provide a successful result,
> > not just if they are available.)
> >
> > > On a similar note, why is the riak_kv_get_fsm waiting for at least
> > > (N/2)+1 responses, if there are only not_found responses, effectively
> > > ignoring a smaller R value of the request if the key does not exist?
> >
> > This is a compromise to deal with real situations that can occur where
> > a single node might be taking a very long time to reply, and a value
> > has never been stored for a given key. Without either this basic
> > quorum default for notfounds or alternately considering a notfound as
> > success and thus only waiting for R of them, that situation would mean
> > that an R=1 request would take much longer to complete than an R=2
> > request (due to waiting for the slow node) which is confusing to most
> > users. Note that since it applies to notfounds, this tends to only
> > come into play for items that have never been successfully stored with
> > at least a basic quorum -- things that really are not present, that
> > is.
> >
> > > My guess was that this also has to do with the use of fallback nodes:
> > > Since the partition will usually be very small on the fallback/handoff
> > > node, it is likely to be the first to answer. So to avoid returning
> > > false not_found responses, a basic quorum is required.
> > > Am I on the right track here?
> >
> > It doesn't have anything to do with fallback nodes explicitly. It is
> > for situations where a node is under any condition that will slow it
> > down significantly. In such situations, there is little to be gained
> > in waiting for all N replies if (N/2)+1 have already declared
> > notfound.
> >
> > > The problem is, this is imposed even for the case that all nodes are up.
> > > If one requires very low latency or very high availability (that's why
> > > one uses a small R value in the first place) and does a lot of gets for
> > > nonexistent keys, riak silently screws you over by raising R for those
> > > keys.
> >
> > It seems that there is something here worth clarifying. If you are
> > issuing requests with W+R<=N, and some reads following writes return
> > notfound during an interval immediately following initial storage
> > time... well, that's what you asked for by not requesting a quorum.
> > If you store the object with a sufficiently high W value first, then
> > you will not get this sort of notfound response even if your R value
> > is only 1.
> >
> > I suppose that providing the freedom to do this might be considered
> > "screwing you over," but we see it more as allowing you to make
> > different choices while still providing safe and unsurprising default
> > behavior. If you try hard enough to screw yourself over, though, Riak
> > won't stop you. If you issue write requests (to any dynamo-model
> > system) with some W, followed immediately by a read request with some
> > R, and W+R is not greater than N, you should not be expecting the
> > write to necessarily be reflected yet.
> >
> > > I most likely missed something here, but some ad hoc tests I did seem to
> > > be consistent with my understanding of the code.
> >
> > You have certainly put some real effort into understanding some
> > choices made in the Riak code, which I appreciate. I hope that I have
> > helped to extend your understanding of the real operational scenarios
> > that have motivated those choices, and how the code will behave in
> > those scenarios.
> >
> > Best,
> >
> > -Justin
> >
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
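
To make the two rules from the thread concrete, here is a minimal Python sketch. It is only a toy model of what is described above (reply once R successful responses arrive, or once (N/2)+1 notfounds arrive, and treat W+R>N as the condition for a read to overlap a preceding write). It is not the riak_kv_get_fsm code, and the names basic_quorum, get_outcome and read_overlaps_write are hypothetical, made up for this illustration. Other failure handling in the real FSM is deliberately left out.

```python
def basic_quorum(n):
    """Majority of N replicas, i.e. (N/2)+1 with integer division."""
    return n // 2 + 1


def get_outcome(n, r, replies):
    """Walk replies in arrival order and return the first decision reached.

    replies is an iterable of "ok" or "notfound" strings (assumed model).
    Returns (decision, count), where decision is "ok", "notfound" or
    "waiting", and count is the number of replies consumed so far.
    """
    ok = notfound = seen = 0
    for reply in replies:
        seen += 1
        if reply == "ok":
            ok += 1
        elif reply == "notfound":
            notfound += 1
        else:
            raise ValueError("unknown reply: %r" % (reply,))
        if ok >= r:
            # R successful responses: answer the client.
            return ("ok", seen)
        if notfound >= basic_quorum(n):
            # A basic quorum of notfounds: stop waiting for slow nodes.
            return ("notfound", seen)
    return ("waiting", seen)


def read_overlaps_write(n, r, w):
    """True if any R replicas must intersect any W replicas (W+R>N)."""
    return r + w > n


if __name__ == "__main__":
    # N=3, R=1, key never stored: decided after two notfounds, not three.
    print(get_outcome(3, 1, ["notfound", "notfound"]))
    # W=1, R=1, N=3: a read right after a write may legitimately miss it.
    print(read_overlaps_write(3, 1, 1))
```

In this model an R=1 get for a key that was never stored is answered after two of three replies, so a single slow node does not delay the notfound, which is the compromise described in the thread.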
