Let me elaborate a tiny bit more. Consider the write(2) syscall on Unix and its look-alikes. If it succeeds, it returns the number of bytes written. If it fails, it returns -1. One must sometimes learn the hard way that some bytes may have been written even in the case of failure, but that there is no way to know how many. Interesting!
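To make that concrete, here is a minimal sketch in Python, whose os.write is a
thin wrapper over write(2). The retry loop and its error message are my own
illustration, not anything POSIX prescribes:

    import os

    def write_all(fd, data):
        """Call write(2) repeatedly until every byte is confirmed written."""
        view = memoryview(data)
        written = 0
        while written < len(view):
            try:
                # os.write returns how many bytes this call wrote (possibly
                # fewer than requested); a short count tells us exactly how
                # far we got.
                n = os.write(fd, view[written:])
            except OSError as exc:
                # The -1 case: the failing call may already have transferred
                # some bytes, but the error carries no count, so 'written'
                # only reflects the calls that reported success.
                raise RuntimeError(
                    f"write failed after {written} confirmed bytes") from exc
            written += n
        return written

The asymmetry is the point: success tells you exactly how far you got, failure
does not, and that is the same shape of question I'm asking about Riak below.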
I'm just trying to accelerate my learning process with Riak.

Cheers,
John

On Jan 9, 2012, at 3:25 PM, John DeTreville wrote:

> Thanks for the reply, which confirms what I expected.
>
> Let me explain why I asked. I have an application that my intuition says
> would be a good match to Riak, but I don't trust my intuition since I've
> never used Riak and I'm not sure I understand all of its failure modes. One
> thing I'm trying is to work through a mental model-checking exercise (which
> I might eventually turn over to a real model checker), and it is making me
> wonder about all the things that can go wrong. A failed write that is
> visible anyway, either permanently or just for a while, is just one example.
>
> In the long run, it would be great if Riak were documented perfectly and
> completely (and every other piece of software in the world too!), but in
> the meantime I'm just trying to build my own mental model. I'd prefer, of
> course, a mental model that does not depend on a detailed knowledge of
> Riak's internal workings, enumerating only the preconditions and
> postconditions of each operation. We'll see how far I can get....
>
> Cheers,
> John
>
> On Jan 9, 2012, at 2:38 PM, John DeTreville wrote:
>
>> Thank you very much for your reply. Longer response to follow.
>>
>> Cheers,
>> John
>>
>> On Jan 9, 2012, at 2:33 PM, Ryan Zezeski wrote:
>>
>>> John,
>>>
>>> To your first question, yes, it is possible that the client may receive
>>> a failure response from Riak but the data could have persisted on some
>>> of the nodes. This is because a single write to Riak is actually N
>>> writes to N different partitions inside of Riak. These N writes are not
>>> atomic in relation to each other.
>>>
>>> As for your second question, it depends on what happens between the time
>>> of the "failed" write and the time the node(s) with the replicas go
>>> down. If some form of anti-entropy is employed before the node failure
>>> then the replicas should have been repaired and N copies should exist.
>>> Riak's main form of anti-entropy is read repair, which occurs at read
>>> time (we also have a form of active anti-entropy between Riak clusters
>>> in our enterprise offering). If the object is read before the node
>>> failure then read repair will occur and repair all N replicas.
>>>
>>> An example might help. If N=3/W=2 and two partitions fail to write then
>>> the overall request will fail, but the remaining write is successful. If
>>> you perform a read after this "failed" write then you may or may not see
>>> the new value, depending on the R value and which partitions respond to
>>> the coordinator first. However, regardless of what is returned by that
>>> read, the coordinator will stay alive a while longer in an attempt to
>>> perform read repair. If read repair is successful then you should have N
>>> copies and it will be as if the write failure never occurred. If you
>>> hadn't performed that read, and the replicas hadn't been repaired, and
>>> the node containing the only replica went down, then a read would return
>>> the old value or a not_found (depending on whether a value existed for
>>> that key before the write).
>>>
>>> -Ryan
>>>
>>>
>>> On Mon, Jan 9, 2012 at 12:32 AM, John DeTreville <j...@detreville.org>
>>> wrote:
>>> (An earlier post seems not to have gone through. My apologies if this
>>> turns out to be a duplicate.)
>>>
>>> I'm thinking of using Riak to replace a large Oracle system, and I'm
>>> trying to understand its guarantees. I have a few introductory
>>> questions; this is the second of three.
>>>
>>> Imagine I do a write, and the write fails because it could not contact
>>> enough hosts. Am I right to imagine that the write may actually have
>>> persisted, and that the data might later be available for reading? Am I
>>> also right to imagine that the data, once read, might later vanish due
>>> to host failure, because it was persisted to fewer hosts than expected?
>>>
>>> Cheers,
>>> John
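For my own mental-model purposes, here is a toy sketch in Python of the
N=3/W=2 scenario from the reply quoted above. It is not Riak code: the Replica
class, the single integer version standing in for a vector clock, and the
"first R live replicas answer" rule are all simplifications I'm assuming for
illustration.

    N, R, W = 3, 2, 2           # replicas, read quorum, write quorum

    class Replica:
        def __init__(self):
            self.value = None   # (version, data), or None meaning not_found
            self.up = True

    def write(replicas, version, data, failures):
        """Attempt a write; partitions listed in 'failures' never see it.
        Report success only if at least W replicas acknowledge."""
        acks = 0
        for i, rep in enumerate(replicas):
            if i in failures or not rep.up:
                continue
            rep.value = (version, data)   # the surviving writes persist anyway
            acks += 1
        return acks >= W

    def read(replicas):
        """Answer from the first R live replicas, then read-repair all live ones."""
        live = [rep for rep in replicas if rep.up]
        answer = max((rep.value for rep in live[:R] if rep.value), default=None)
        newest = max((rep.value for rep in live if rep.value), default=None)
        for rep in live:
            if newest and (rep.value is None or rep.value < newest):
                rep.value = newest        # push the newest value back out
        return answer

    replicas = [Replica() for _ in range(N)]
    ok = write(replicas, 1, "new", failures={1, 2})  # two of three partitions fail
    print(ok)              # False: fewer than W acks, so the client sees a failure
    print(read(replicas))  # (1, 'new') in this toy; read repair fixes replicas 1 and 2
    replicas[0].up = False # the node holding the original lone copy goes down
    print(read(replicas))  # still (1, 'new'), but only because a read happened first

Drop the read between the failed write and the node failure and the final read
returns None instead, which is the "old value or a not_found" outcome described
above. That is roughly the property I'd like to hand to a model checker.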
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com