Hi Alex, would a second fetch just indicate that the object is *still* deleted? Or that this delete operation succeeded? In other words, perhaps my contract really is: return true if there was already a value there. In which case, would the second fetch be superfluous?
Thanks for your help.
Vanessa

On Mon, Feb 22, 2016 at 11:15 AM, Alex Moore <amo...@basho.com> wrote:

>> That's the correct behaviour: it should return true iff a value was
>> actually deleted.
>
> Ok, if that's the case you should do another FetchValue after the deletion
> (to update the response.hasValues() field), or use the async version of
> the delete function. I also noticed that we weren't passing the vclock to
> the Delete function, so I added that here as well:
>
> public boolean delete(String key) throws ExecutionException,
> InterruptedException {
>
>     // fetch in order to get the causal context
>     FetchValue.Response response = fetchValue(key);
>
>     if (response.isNotFound())
>     {
>         return ???; // what do we return if it doesn't exist?
>     }
>
>     DeleteValue deleteValue =
>         new DeleteValue.Builder(new Location(namespace, key))
>             .withVClock(response.getVectorClock())
>             .build();
>
>     final RiakFuture<Void, Location> deleteFuture =
>         client.executeAsync(deleteValue);
>
>     deleteFuture.await();
>
>     if (deleteFuture.isSuccess())
>     {
>         return true;
>     }
>     else
>     {
>         deleteFuture.cause(); // Cause of failure
>         return false;
>     }
> }
>
> Thanks,
> Alex
>
> On Mon, Feb 22, 2016 at 10:48 AM, Vanessa Williams <
> vanessa.willi...@thoughtwire.ca> wrote:
>
>> See inline:
>>
>> On Mon, Feb 22, 2016 at 10:31 AM, Alex Moore <amo...@basho.com> wrote:
>>
>>> Hi Vanessa,
>>>
>>> You might have a problem with your delete function (depending on its
>>> return value). What does the return value of the delete() function
>>> indicate? Right now if an object existed, and was deleted, the function
>>> will return true, and will only return false if the object didn't exist
>>> or only consisted of tombstones.
>>
>> That's the correct behaviour: it should return true iff a value was
>> actually deleted.
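To make the "return true iff a value was actually deleted" contract concrete, here is a stdlib-only toy sketch (a HashMap stands in for the Riak bucket; `DeleteContract` and its methods are hypothetical names, not the Riak client API). It illustrates the point under discussion: the pre-delete fetch already answers "was there a value?", so under that contract a second fetch would indeed be superfluous.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the delete contract: return true iff a value was
// actually there to delete. A HashMap stands in for the bucket.
public class DeleteContract {
    private final Map<String, byte[]> bucket = new HashMap<>();

    public void put(String key, byte[] value) {
        bucket.put(key, value);
    }

    public boolean delete(String key) {
        // "fetch" first, as the real client does to obtain causal context
        boolean existed = bucket.containsKey(key);
        if (existed) {
            bucket.remove(key);
        }
        // true iff something existed before the delete
        return existed;
    }

    public static void main(String[] args) {
        DeleteContract c = new DeleteContract();
        c.put("k", new byte[] {1});
        System.out.println(c.delete("k")); // true: a value was deleted
        System.out.println(c.delete("k")); // false: nothing left to delete
    }
}
```

Against real Riak the "did it exist" answer would come from the fetch response rather than a map lookup, but the shape of the contract is the same.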
>>
>>> If you never look at the object value returned by your fetchValue(key)
>>> function, another potential optimization you could make is to only return
>>> the HEAD / metadata:
>>>
>>> FetchValue fv = new FetchValue.Builder(
>>>         new Location(new Namespace("some_bucket"), key))
>>>     .withOption(FetchValue.Option.HEAD, true)
>>>     .build();
>>>
>>> This would be more efficient because Riak won't have to send you the
>>> values over the wire, if you only need the metadata.
>>
>> Thanks, I'll clean that up.
>>
>>> If you do write this up somewhere, share the link! :)
>>
>> Will do!
>>
>> Regards,
>> Vanessa
>>
>>> Thanks,
>>> Alex
>>>
>>> On Mon, Feb 22, 2016 at 6:23 AM, Vanessa Williams <
>>> vanessa.willi...@thoughtwire.ca> wrote:
>>>
>>>> Hi Dmitri, this thread is old, but I read this part of your answer
>>>> carefully:
>>>>
>>>>> You can use the following strategies to prevent stale values, in
>>>>> increasing order of security/preference:
>>>>> 1) Use timestamps (and not pass in vector clocks/causal context). This
>>>>> is ok if you're not editing objects, or you're ok with a bit of risk of
>>>>> stale values.
>>>>> 2) Use causal context correctly (which means, read-before-you-write --
>>>>> in fact, the Update operation in the java client does this for you, I
>>>>> think). And if Riak can't determine which version is correct, it will
>>>>> fall back on timestamps.
>>>>> 3) Turn on siblings, for that bucket or bucket type. That way, Riak
>>>>> will still try to use causal context to decide the right value. But if
>>>>> it can't decide, it will store BOTH values, and give them back to you on
>>>>> the next read, so that your application can decide which is the correct
>>>>> one.
>>>>
>>>> I decided on strategy #2. What I am hoping for is some validation that
>>>> the code we use to "get", "put", and "delete" is correct in that context,
>>>> or if it could be simplified in some cases.
Note we are using delete-mode
>>>> "immediate" and no duplicates.
>>>>
>>>> In their shortest possible forms, here are the three methods I'd like
>>>> some feedback on. (Note: they're being used in production and haven't
>>>> caused any problems yet; however, we have very few writes in production,
>>>> so the lack of problems doesn't support the conclusion that the
>>>> implementation is correct.) All argument-checking, exception-handling,
>>>> and logging are removed for clarity. *I'm mostly concerned about correct
>>>> use of causal context and response.isNotFound and response.hasValues.*
>>>> Is there anything I could/should have left out?
>>>>
>>>> public RiakClient(String name, com.basho.riak.client.api.RiakClient client)
>>>> {
>>>>     this.name = name;
>>>>     this.namespace = new Namespace(name);
>>>>     this.client = client;
>>>> }
>>>>
>>>> public byte[] get(String key) throws ExecutionException,
>>>> InterruptedException {
>>>>
>>>>     FetchValue.Response response = fetchValue(key);
>>>>     if (!response.isNotFound())
>>>>     {
>>>>         RiakObject riakObject = response.getValue(RiakObject.class);
>>>>         return riakObject.getValue().getValue();
>>>>     }
>>>>     return null;
>>>> }
>>>>
>>>> public void put(String key, byte[] value) throws ExecutionException,
>>>> InterruptedException {
>>>>
>>>>     // fetch in order to get the causal context
>>>>     FetchValue.Response response = fetchValue(key);
>>>>     RiakObject storeObject = new RiakObject()
>>>>         .setValue(BinaryValue.create(value))
>>>>         .setContentType("binary/octet-stream");
>>>>     StoreValue.Builder builder = new StoreValue.Builder(storeObject)
>>>>         .withLocation(new Location(namespace, key));
>>>>     if (response.getVectorClock() != null) {
>>>>         builder = builder.withVectorClock(response.getVectorClock());
>>>>     }
>>>>     StoreValue storeValue = builder.build();
>>>>     client.execute(storeValue);
>>>> }
>>>>
>>>> public boolean delete(String key) throws ExecutionException,
>>>> InterruptedException {
>>>>
>>>>     // fetch in order to get the causal context
>>>>     FetchValue.Response response = fetchValue(key);
>>>>     if (!response.isNotFound())
>>>>     {
>>>>         DeleteValue deleteValue = new DeleteValue.Builder(
>>>>             new Location(namespace, key)).build();
>>>>         client.execute(deleteValue);
>>>>     }
>>>>     return !response.isNotFound() || !response.hasValues();
>>>> }
>>>>
>>>> Any comments much appreciated. I want to provide a minimally correct
>>>> example of simple client code somewhere (GitHub, blog post, something...)
>>>> so I don't want to post this without review.
>>>>
>>>> Thanks,
>>>> Vanessa
>>>>
>>>> ThoughtWire Corporation
>>>> http://www.thoughtwire.com
>>>>
>>>> On Thu, Oct 8, 2015 at 8:45 AM, Dmitri Zagidulin <dzagidu...@basho.com>
>>>> wrote:
>>>>
>>>>> Hi Vanessa,
>>>>>
>>>>> The thing to keep in mind about read repair is -- it happens
>>>>> asynchronously on every GET, but /after/ the results are returned to the
>>>>> client.
>>>>>
>>>>> So, when you issue a GET with r=1, the coordinating node only waits
>>>>> for 1 of the replicas before responding to the client with a success,
>>>>> and only afterwards triggers read-repair.
>>>>>
>>>>> It's true that with notfound_ok=false, it'll wait for the first
>>>>> non-missing replica before responding. But if you edit or update your
>>>>> objects at all, an R=1 still gives you a risk of stale values being
>>>>> returned.
>>>>>
>>>>> For example, say you write an object with value A. And let's say your
>>>>> 3 replicas now look like this:
>>>>>
>>>>> replica 1: A, replica 2: A, replica 3: notfound/missing
>>>>>
>>>>> A read with an R=1 and notfound_ok=false is just fine, here. (Chances
>>>>> are, the notfound replica will arrive first, but the notfound_ok setting
>>>>> will force the coordinator to wait for the first non-empty value, A, and
>>>>> return it to the client. And then trigger read-repair.)
>>>>>
>>>>> But what happens if you edit that same object, and give it a new
>>>>> value, B?
So, now, there's a chance that your replicas will look like this:
>>>>>
>>>>> replica 1: A, replica 2: B, replica 3: B.
>>>>>
>>>>> So now if you do a read with an R=1, there's a chance that replica 1,
>>>>> with the old value of A, will arrive first, and that's the response that
>>>>> will be returned to the client.
>>>>>
>>>>> Whereas, using R=2 eliminates that risk -- well, at least decreases
>>>>> it. You still have the issue of -- how does Riak decide whether A or B
>>>>> is the correct value? Are you using causal context/vclocks correctly?
>>>>> (That is, reading the object before you update, to get the correct
>>>>> causal context?) Or are you relying on timestamps? (This is an ok
>>>>> strategy, provided that the edits are sufficiently far apart in time,
>>>>> and you don't have many concurrent edits, AND you're ok with the small
>>>>> risk of occasionally the timestamp being wrong.) You can use the
>>>>> following strategies to prevent stale values, in increasing order of
>>>>> security/preference:
>>>>>
>>>>> 1) Use timestamps (and not pass in vector clocks/causal context). This
>>>>> is ok if you're not editing objects, or you're ok with a bit of risk of
>>>>> stale values.
>>>>>
>>>>> 2) Use causal context correctly (which means, read-before-you-write --
>>>>> in fact, the Update operation in the java client does this for you, I
>>>>> think). And if Riak can't determine which version is correct, it will
>>>>> fall back on timestamps.
>>>>>
>>>>> 3) Turn on siblings, for that bucket or bucket type. That way, Riak
>>>>> will still try to use causal context to decide the right value. But if
>>>>> it can't decide, it will store BOTH values, and give them back to you on
>>>>> the next read, so that your application can decide which is the correct
>>>>> one.
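Strategy #3 above implies the application must resolve siblings itself. As a stdlib-only sketch of what "your application can decide" might look like (the `Sibling` record and `resolve` method are hypothetical stand-ins, not the Riak client's types), here is a last-write-wins resolver; a set-union merge is another common choice:

```java
import java.util.Comparator;
import java.util.List;

// Toy sibling resolver: when the store returns multiple concurrent
// values, the application picks a winner. Here: newest timestamp wins.
public class SiblingResolver {
    // Stand-in for a fetched sibling (value + last-modified time).
    public record Sibling(String value, long lastModified) {}

    public static String resolve(List<Sibling> siblings) {
        // last-write-wins: keep the sibling with the newest timestamp
        return siblings.stream()
                .max(Comparator.comparingLong(Sibling::lastModified))
                .map(Sibling::value)
                .orElseThrow(() -> new IllegalArgumentException("no siblings"));
    }

    public static void main(String[] args) {
        List<Sibling> siblings = List.of(
                new Sibling("A", 1000L),   // stale value
                new Sibling("B", 2000L));  // concurrent, newer value
        System.out.println(resolve(siblings)); // B
    }
}
```

Note that timestamp-based resolution inherits the same risks Dmitri describes for strategy #1; a domain-specific merge is safer when values can be combined.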
>>>>>
>>>>> On Thu, Oct 8, 2015 at 1:56 AM, Vanessa Williams <
>>>>> vanessa.willi...@thoughtwire.ca> wrote:
>>>>>
>>>>>> Hi Dmitri, what would be the benefit of r=2, exactly? It isn't
>>>>>> necessary to trigger read-repair, is it? If it's important I'd rather
>>>>>> try it sooner than later...
>>>>>>
>>>>>> Regards,
>>>>>> Vanessa
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 4:02 PM, Dmitri Zagidulin <
>>>>>> dzagidu...@basho.com> wrote:
>>>>>>
>>>>>>> Glad you sorted it out!
>>>>>>>
>>>>>>> (I do want to encourage you to bump your R setting to at least 2,
>>>>>>> though. Run some tests -- I think you'll find that the difference in
>>>>>>> speed will not be noticeable, but you do get a lot more data
>>>>>>> resilience with 2.)
>>>>>>>
>>>>>>> On Wed, Oct 7, 2015 at 6:24 PM, Vanessa Williams <
>>>>>>> vanessa.willi...@thoughtwire.ca> wrote:
>>>>>>>
>>>>>>>> Hi Dmitri, well...we solved our problem to our satisfaction but it
>>>>>>>> turned out to be something unexpected.
>>>>>>>>
>>>>>>>> The keys were two properties mentioned in a blog post on
>>>>>>>> "configuring Riak’s oft-subtle behavioral characteristics":
>>>>>>>> http://basho.com/posts/technical/riaks-config-behaviors-part-4/
>>>>>>>>
>>>>>>>> notfound_ok = false
>>>>>>>> basic_quorum = true
>>>>>>>>
>>>>>>>> The second one just makes things a little faster, but the first one
>>>>>>>> is the one whose default value of true was killing us.
>>>>>>>>
>>>>>>>> With r=1 and notfound_ok=true (default), if the first node to
>>>>>>>> respond didn't find the requested key, the authoritative answer was
>>>>>>>> "this key is not found". Not what we were expecting at all.
>>>>>>>>
>>>>>>>> With the changed settings, it will wait for a quorum of responses
>>>>>>>> and only if *no one* finds the key will "not found" be returned.
>>>>>>>> Perfect. (Without this setting it would wait for all responses, not
>>>>>>>> ideal.)
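The notfound_ok behaviour described above can be sketched as a stdlib-only simulation (the `NotfoundOk` class and `read` method are hypothetical names; this models only the r=1 arrival-order effect, not basic_quorum or the full coordinator logic):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Toy model of notfound_ok with r=1. Replica responses arrive in some
// order; null stands for a replica reporting "notfound". With
// notfound_ok=true the first response counts, even a notfound. With
// notfound_ok=false the coordinator waits for the first non-missing value.
public class NotfoundOk {
    public static Optional<String> read(List<String> responses, boolean notfoundOk) {
        for (String r : responses) {
            if (r != null) {
                return Optional.of(r);   // first real value wins
            }
            if (notfoundOk) {
                return Optional.empty(); // first response counts, even notfound
            }
        }
        return Optional.empty();         // every replica said notfound
    }

    public static void main(String[] args) {
        // the empty replica happens to answer first
        List<String> arrivals = Arrays.asList(null, "A", "A");
        System.out.println(read(arrivals, true));  // empty: spurious notfound
        System.out.println(read(arrivals, false)); // Optional[A]: waits for a value
    }
}
```

This is exactly the failure mode in the thread: with the default notfound_ok=true and r=1, a fast-but-empty replica makes an existing key look missing.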
>>>>>>>>
>>>>>>>> Now there is only one snag, which is that if the Riak node the
>>>>>>>> client connects to goes down, there will be no communication and we
>>>>>>>> have a problem. This is easily solvable with a load-balancer, though
>>>>>>>> for complicated reasons we actually don't need to do that right now.
>>>>>>>> It's just acceptable for us temporarily. Later, we'll get the
>>>>>>>> load-balancer working and even that won't be a problem.
>>>>>>>>
>>>>>>>> I *think* we're ok now. Thanks for your help!
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vanessa
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 9:33 AM, Dmitri Zagidulin <
>>>>>>>> dzagidu...@basho.com> wrote:
>>>>>>>>
>>>>>>>>> Yeah, definitely find out what the sysadmin's experience was, with
>>>>>>>>> the load balancer. It could have just been a wrong configuration or
>>>>>>>>> something.
>>>>>>>>>
>>>>>>>>> And yes, that's the documentation page I recommend -
>>>>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>>>>>>> Just set up HAProxy, and point your Java clients to its IP.
>>>>>>>>>
>>>>>>>>> The drawbacks to load-balancing on the java client side (yes, the
>>>>>>>>> cluster object) instead of a standalone load balancer like HAProxy
>>>>>>>>> are the following:
>>>>>>>>>
>>>>>>>>> 1) Adding a node means code changes (or at the very least, config
>>>>>>>>> file changes) rolled out to all your clients. Which turns out to be
>>>>>>>>> a pretty serious hassle. Instead, HAProxy allows you to add or
>>>>>>>>> remove nodes without changing any java code or config files.
>>>>>>>>>
>>>>>>>>> 2) Performance. We've run many tests to compare performance, and
>>>>>>>>> client-side load balancing results in significantly lower
>>>>>>>>> throughput than you'd have using haproxy (or nginx).
(Specifically, you actually want to
>>>>>>>>> use the 'leastconn' load balancing algorithm with HAProxy, instead
>>>>>>>>> of round robin.)
>>>>>>>>>
>>>>>>>>> 3) The health check on the client side (so that the java load
>>>>>>>>> balancer can tell when a remote node is down) is much less
>>>>>>>>> intelligent than a dedicated load balancer would provide. With
>>>>>>>>> something like HAProxy, you should be able to take down nodes with
>>>>>>>>> no ill effects for the client code.
>>>>>>>>>
>>>>>>>>> Now, if you load balance on the client side and you take a node
>>>>>>>>> down, it's not supposed to stop working completely. (I'm not sure
>>>>>>>>> why it's failing for you, we can investigate, but it'll be easier
>>>>>>>>> to just use a load balancer.) It should throw an error or two, but
>>>>>>>>> then start working again (on the retry).
>>>>>>>>>
>>>>>>>>> Dmitri
>>>>>>>>>
>>>>>>>>> On Wed, Oct 7, 2015 at 2:45 PM, Vanessa Williams <
>>>>>>>>> vanessa.willi...@thoughtwire.ca> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Dmitri, thanks for the quick reply.
>>>>>>>>>>
>>>>>>>>>> It was actually our sysadmin who tried the load balancer approach
>>>>>>>>>> and had no success, late last evening. However I haven't discussed
>>>>>>>>>> the gory details with him yet. The failure he saw was at the
>>>>>>>>>> application level (i.e. failure to read a key), but I don't know
>>>>>>>>>> a) how he set up the LB or b) what the Java exception was, if any.
>>>>>>>>>> I'll find that out in an hour or two and report back.
>>>>>>>>>>
>>>>>>>>>> I did find this article just now:
>>>>>>>>>>
>>>>>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>>>>>>>>
>>>>>>>>>> So I suppose we'll give those suggestions a try this morning.
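To illustrate why 'leastconn' beats round robin here, a stdlib-only sketch of the selection rule (the `LeastConn` class is a hypothetical toy, not HAProxy or the Riak client): new requests go to the backend with the fewest in-flight connections, so a slow node that accumulates connections automatically receives less new traffic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy leastconn balancer: pick the backend with the fewest active
// connections; ties go to the first-registered backend.
public class LeastConn {
    private final Map<String, Integer> activeConns = new LinkedHashMap<>();

    public void addBackend(String name) {
        activeConns.put(name, 0);
    }

    public String pick() {
        // choose the backend with the fewest in-flight connections
        String chosen = activeConns.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .orElseThrow()
                .getKey();
        activeConns.merge(chosen, 1, Integer::sum);
        return chosen;
    }

    public void release(String name) {
        // a request finished; that backend is less loaded now
        activeConns.merge(name, -1, Integer::sum);
    }

    public static void main(String[] args) {
        LeastConn lb = new LeastConn();
        lb.addBackend("riak1");
        lb.addBackend("riak2");
        String first = lb.pick();       // riak1 (both idle, first wins tie)
        lb.pick();                      // riak2 (riak1 has one in flight)
        lb.release(first);
        System.out.println(lb.pick());  // riak1 again: fewest connections
    }
}
```

In real HAProxy this is just `balance leastconn` in the backend section; the point of the sketch is only the selection rule.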
>>>>>>>>>>
>>>>>>>>>> What is the drawback to having the client connect to all 4 nodes
>>>>>>>>>> (the cluster client, I assume you mean)? My understanding from
>>>>>>>>>> reading articles I've found is that one of the nodes going away
>>>>>>>>>> causes that client to fail as well. Is that what you mean, or are
>>>>>>>>>> there other drawbacks as well?
>>>>>>>>>>
>>>>>>>>>> If there's anything else you can recommend, or links other than
>>>>>>>>>> the one above you can point me to, it would be much appreciated.
>>>>>>>>>> We expect both node failure and deliberate node removal for
>>>>>>>>>> upgrade, repair, replacement, etc.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vanessa
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 7, 2015 at 8:29 AM, Dmitri Zagidulin <
>>>>>>>>>> dzagidu...@basho.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Vanessa,
>>>>>>>>>>>
>>>>>>>>>>> Riak is definitely meant to run behind a load balancer. (Or, in
>>>>>>>>>>> the worst case, to be load-balanced on the client side. That is,
>>>>>>>>>>> all clients connect to all 4 nodes.)
>>>>>>>>>>>
>>>>>>>>>>> When you say "we did try putting all 4 Riak nodes behind a
>>>>>>>>>>> load-balancer and pointing the clients at it, but it didn't help"
>>>>>>>>>>> -- what do you mean exactly, by "it didn't help"? What happened
>>>>>>>>>>> when you tried using the load balancer?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 1:57 PM, Vanessa Williams <
>>>>>>>>>>> vanessa.willi...@thoughtwire.ca> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all, we are still (for a while longer) using Riak 1.4 and
>>>>>>>>>>>> the matching Java client. The client(s) connect to one node in
>>>>>>>>>>>> the cluster (since that's all it can do in this client version).
>>>>>>>>>>>> The cluster itself has 4 nodes (sorry, we can't use 5 in this
>>>>>>>>>>>> scenario). There are 2 separate clients.
>>>>>>>>>>>>
>>>>>>>>>>>> We've tried both n_val=3 and n_val=4. We achieve
>>>>>>>>>>>> consistency-by-writes by setting w=all. Therefore, we only
>>>>>>>>>>>> require one successful read (r=1).
>>>>>>>>>>>>
>>>>>>>>>>>> When all nodes are up, everything is fine. If one node fails,
>>>>>>>>>>>> the clients can no longer read any keys at all. There's an
>>>>>>>>>>>> exception like this:
>>>>>>>>>>>>
>>>>>>>>>>>> com.basho.riak.client.RiakRetryFailedException:
>>>>>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>>>>>
>>>>>>>>>>>> Now, it isn't possible that Riak can't operate when one node
>>>>>>>>>>>> fails, so we're clearly missing something here.
>>>>>>>>>>>>
>>>>>>>>>>>> Note: we did try putting all 4 Riak nodes behind a
>>>>>>>>>>>> load-balancer and pointing the clients at it, but it didn't help.
>>>>>>>>>>>>
>>>>>>>>>>>> Riak is a high-availability key-value store, so... why are we
>>>>>>>>>>>> failing to achieve high availability? Any suggestions greatly
>>>>>>>>>>>> appreciated, and if more info is required I'll do my best to
>>>>>>>>>>>> provide it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>> Vanessa
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Vanessa Williams
>>>>>>>>>>>> ThoughtWire Corporation
>>>>>>>>>>>> http://www.thoughtwire.com
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> riak-users mailing list
>>>>>>>>>>>> riak-users@lists.basho.com
>>>>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
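The w=all, r=1 setup in the oldest message above can be checked with simple quorum arithmetic, sketched here stdlib-only (the `QuorumMath` class is a hypothetical illustration): read and write quorums must overlap (r + w > n) for a read to see the latest write, but w = n means a single down node blocks all writes, which matches the availability problem described.

```java
// Toy quorum arithmetic for an n-replica store.
public class QuorumMath {
    // true if any read quorum must intersect any write quorum
    public static boolean readSeesLatestWrite(int n, int r, int w) {
        return r + w > n;
    }

    // true if writes can still succeed with `down` nodes unavailable
    public static boolean writesAvailable(int n, int w, int down) {
        return n - down >= w;
    }

    public static void main(String[] args) {
        int n = 4, r = 1, w = n; // the w=all, r=1 setup from the thread
        System.out.println(readSeesLatestWrite(n, r, w)); // true: 1 + 4 > 4
        System.out.println(writesAvailable(n, w, 1));     // false: one down node blocks writes
    }
}
```

A middle ground like r=2, w=3 with n=4 keeps the overlap (2 + 3 > 4) while tolerating one node down for writes, which is in the spirit of Dmitri's r=2 suggestion.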