Re: Achieving 100% consistency

Russell Brown Mon, 29 Aug 2011 06:47:43 -0700

I'm having trouble simply posting a response to this list. This is my 3rd 
attempt so if anyone is being spammed, I'm sorry, but I'm just not seeing the 
replies and I keep getting rejection notices from the list manager…


Here we go again:

Hi Lukas,

Sorry it has taken me so long to join this party; I have been away and I'm just 
catching up.

On 29 Aug 2011, at 09:05, Lukas Schulze wrote:

> A friend of mine found out how it could work: I have to delete the entry 
> first and after storing it in the database I've to check the result.
> It's not the prettiest code I've written before, but without the while-loop 
> it will work for nearly 97% of my tuples. With the while-loop everything 
> works fine.
> 
> ======================================
> RiakObject riakIndex = new RiakObject(attrName, attrValue, 
> indexString.getBytes());
> riakIndex.setContentType("text/plain; charset=UTF-8");
> try{
>       //have to be deleted because of cache (?)

No, there is no cache.

>       riakPBCClient.delete(attrName, attrValue);
>       riakPBCClient.store(riakIndex);
>       RiakObject[] fetched = riakPBCClient.fetch(attrName, attrValue);
>       //check whether the entry is correctly stored in the database
>       while(fetched.length == 0) {//try it until it works...
>               riakPBCClient.store(riakIndex);
>               fetched = riakPBCClient.fetch(attrName, attrValue);
>       }
>       

I don't understand what you're doing here. If you want the object you stored, 
why not just set returnBody=true for the store operation?

>       //fetched entry doesn't match our stored one
>       if(!riakIndex.getValue().equals(fetched[0].getValue())) {
>               System.err.println("index match: failed -> " + attrName + "." + 
> attrValue);
>       }

Do you really need to do this with n_val, r, w and dw all set to 1, 
allow_mult=false and no other clients? Something is wrong. Please let me know 
which version of the RJC you are using, and which version of Riak on which 
Erlang/OS/arch etc. I'd like to try this at home, as storing and retrieving 
data is what we are all about and this should *just work* (we have integration 
tests like this and they *DO* work).

Forgive me if I am teaching you to suck eggs here, but right at the start 
there, when you store the "riakIndex" object, is there a chance it already 
exists? 
Does the bucket have allow_mult set to false? 
Do you fetch before your store (if you use the new RJC it does a fetch before a 
store to handle the vector clock for you)? 
Are you attempting to overwrite an existing value but providing a stale vclock? 
If you have allow_mult set to false and try to write your object with a stale 
vclock, your write will be silently dropped. Likewise, if you provide *no* 
vclock but a clientId that already has a write on that key, your write will be 
dropped. I realise this can be confusing at first, but does it explain the 
behaviour you're seeing?

For fun, try this:

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(404)

curl -v -X PUT http://127.0.0.1:8098/riak/b/k -d"a"

(204)

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(200) a

curl -v -X PUT http://127.0.0.1:8098/riak/b/k -d"b"

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(200) b

BUT do the same thing with a clientId header set

curl -v -X GET http://127.0.0.1:8098/riak/c/k -H"X-Riak-ClientId: pete"

(404)

curl -v -X PUT http://127.0.0.1:8098/riak/c/k -d"a" -H"X-Riak-ClientId: pete"

(204)

curl -v -X GET http://127.0.0.1:8098/riak/c/k -H"X-Riak-ClientId: pete"

(200) a

curl -v -X PUT http://127.0.0.1:8098/riak/c/k -d"b" -H"X-Riak-ClientId: pete"

curl -v -X GET http://127.0.0.1:8098/riak/c/k

(200) a (HUH?)

Does that explain any of what you see?
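
To make the two curl runs above concrete, here is a deliberately simplified 
Java model of a single key in an allow_mult=false bucket. It is a sketch, not 
Riak's actual code: the class ToyKey, its methods and the integer-counter clock 
are all made up for illustration, and it assumes that a request without a 
clientId behaves as if it came from a fresh, never-seen id. All it shows is 
that a second write from "pete" with no vclock produces the same clock as the 
first, so the stored value stays.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one key with allow_mult=false. Illustration only, not Riak code. */
final class ToyKey {
    private Map<String, Integer> storedClock = null; // null until the first write
    private String storedValue = null;

    /** True if clock a has seen every event recorded in clock b. */
    static boolean descends(Map<String, Integer> a, Map<String, Integer> b) {
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            if (a.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    /**
     * Writes value on behalf of clientId. clientClock is the vclock the client
     * fetched earlier, or null if it sends none. Returns true if the write is kept.
     */
    boolean put(String clientId, Map<String, Integer> clientClock, String value) {
        Map<String, Integer> incoming = new HashMap<>();
        if (clientClock != null) {
            incoming.putAll(clientClock);
        }
        incoming.merge(clientId, 1, Integer::sum); // bump this client's counter

        // If the stored clock already covers the incoming one, the write is dropped.
        if (storedClock != null && descends(storedClock, incoming)) {
            return false;
        }
        // Newer or concurrent clocks: this toy simply lets the latest write win.
        storedClock = incoming;
        storedValue = value;
        return true;
    }

    String value() {
        return storedValue;
    }

    /** Copy of the stored clock, or null if nothing has been written yet. */
    Map<String, Integer> clock() {
        return storedClock == null ? null : new HashMap<>(storedClock);
    }

    public static void main(String[] args) {
        ToyKey key = new ToyKey();
        System.out.println(key.put("pete", null, "a"));        // true: first write
        System.out.println(key.put("pete", null, "b"));        // false: same clock, dropped
        System.out.println(key.value());                       // a
        System.out.println(key.put("pete", key.clock(), "b")); // true: clock advanced
        System.out.println(key.value());                       // b
    }
}
```

In this model, passing back the clock that was fetched is what lets the third 
put through, which corresponds to sending the vclock from a prior GET.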

A well-behaved client will always fetch before store, and will use the 
vclock/clientId. The new RJC does this for you behind the scenes. See 
http://blog.basho.com/2011/07/14/The-All-New-Riak-Java-Client/ and 
https://github.com/basho/riak-java-client/blob/master/README.org for more 
details.
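
The fetch-before-store pattern can be sketched in plain Java as below. Note 
that this is NOT the RJC API: VersionedStore, Versioned, InMemoryStore and 
FetchThenStore are hypothetical names invented for this example, and the 
in-memory store is a stand-in that, by assumption, silently drops any write 
whose vclock does not match the stored one. It exists only so the sketch runs 
without a Riak node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.UnaryOperator;

/** A value plus the opaque version token (vclock) it was read with. */
final class Versioned {
    final String value;
    final String vclock;

    Versioned(String value, String vclock) {
        this.value = value;
        this.vclock = vclock;
    }
}

/** Hypothetical store interface for illustration; NOT the Riak Java Client API. */
interface VersionedStore {
    Versioned fetch(String key); // returns null when the key is absent

    void store(String key, String value, String vclock);
}

/** In-memory stand-in (an assumption of this sketch) that drops stale writes. */
final class InMemoryStore implements VersionedStore {
    private final Map<String, Versioned> data = new HashMap<>();
    private int counter = 0;

    @Override
    public Versioned fetch(String key) {
        return data.get(key);
    }

    @Override
    public void store(String key, String value, String vclock) {
        Versioned current = data.get(key);
        String currentClock = current == null ? null : current.vclock;
        // Accept only writes that are based on the version currently stored.
        if (!Objects.equals(currentClock, vclock)) {
            return; // stale or missing vclock: silently dropped
        }
        data.put(key, new Versioned(value, "v" + (++counter)));
    }
}

final class FetchThenStore {
    /** Fetch the current value, apply the change, and store with the fetched vclock. */
    static void update(VersionedStore store, String key, UnaryOperator<String> change) {
        Versioned current = store.fetch(key);
        String oldValue = current == null ? null : current.value;
        String vclock = current == null ? null : current.vclock;
        store.store(key, change.apply(oldValue), vclock);
    }

    public static void main(String[] args) {
        VersionedStore store = new InMemoryStore();
        update(store, "Mueller", old -> old == null ? "1" : old + ",1");
        update(store, "Mueller", old -> old == null ? "5" : old + ",5");
        System.out.println(store.fetch("Mueller").value); // prints 1,5
    }
}
```

A real client would also have to handle concurrent writers between the fetch 
and the store; this sketch leaves that out.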

What is your use case? When you go into production will you have multiple 
nodes? Will you have allow_mult set to true? Let me know what I can do to help, 
'cos it is great to see someone else using the Java client and I want to make 
it easy for you to do so. 

It would be far simpler to start defining your strategy for resolving 
conflicting writes/sibling values than it would be to try to achieve 100% 
consistency in a distributed, fault-tolerant database.
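
As a rough illustration of what such a strategy might look like for the index 
entries described in this thread (a list of user ids per key), one option is 
to take the union of all sibling values. The sketch below assumes the JSON has 
already been parsed into lists of integers; IndexSiblingResolver and resolve 
are hypothetical names, not part of any Riak client.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical resolver: merges sibling id lists by set union. */
final class IndexSiblingResolver {
    /** Returns every id that appears in any sibling, in first-seen order. */
    static Set<Integer> resolve(List<List<Integer>> siblings) {
        Set<Integer> merged = new LinkedHashSet<>();
        for (List<Integer> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two conflicting versions of the same index entry.
        List<List<Integer>> siblings = Arrays.asList(
                Arrays.asList(1, 5, 8),
                Arrays.asList(1, 5, 13));
        System.out.println(resolve(siblings)); // prints [1, 5, 8, 13]
    }
}
```

A union never loses an added id, but it cannot express a removal: an id 
deleted in one sibling reappears if another sibling still carries it, so 
deletes would need separate handling.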

Cheers

Russell

>       
> }
> ======================================
> 
> Best regards
> Lukas
> 
> 
> On Mon, Aug 29, 2011 at 9:20 AM, Lukas Schulze <[email protected]> wrote:
> Hi,
> 
> thank you for your answers.
> I know that Riak is designed for running on distributed servers.
> But what about adding lots of data where every tuple depends on another one?
> I thought that having only 1 node and disabling replication could solve my 
> problem of always getting the latest data from Riak.
> 
> Is there another way to achieve 100% consistency in a riak database after a 
> very short time?
> 
> Best regards
> Lukas
> 
> 
> 
> On Sat, Aug 27, 2011 at 5:43 PM, Ian Plosker <[email protected]> wrote:
> Jonathan,
> 
> Excuse me, that last message should have been addressed to you.
> 
> Ian Plosker
> Developer Advocate
> Basho Technologies
> 
> 
> On Aug 27, 2011, at 11:39 AM, Ian Plosker wrote:
> 
>> Lukas,
>> 
>> Yes, even for dev you'd be best advised to develop and test your application 
>> with the same or similar number of nodes and n, r, and w settings as you 
>> would in production. It's good practice to develop applications in a 
>> dev/test environment that mirrors the production environment as much as is 
>> reasonable/feasible. You can run a single node cluster, but note that this 
>> isn't a configuration you'll see in production.
>> 
>> Ian Plosker
>> Developer Advocate
>> Basho Technologies
>> 
>> 
>> 
>> On Aug 27, 2011, at 5:33 AM, Jonathan Langevin wrote:
>> 
>>> Even for development-purposes only? Otherwise it seems data would be 
>>> written n times to the same machine, which is needless in a dev environment 
>>> with low storage specs...
>>> 
>>> 
>>> Jonathan Langevin
>>> Systems Administrator
>>> Loom Inc.
>>> Wilmington, NC: (910) 241-0433 - [email protected] - 
>>> www.loomlearning.com - Skype: intel352
>>> 
>>> 
>>> 
>>> On Fri, Aug 26, 2011 at 5:01 PM, Ian Plosker <[email protected]> wrote:
>>> Lukas,
>>> 
>>> Also, we don't advise that you run single node clusters. Riak is designed 
>>> to be used in clusters of at least 3 nodes. You can run a multi-node 
>>> cluster on a single development machine by downloading the Riak source, and 
>>> running "make devrel". Take a look at the Riak Fast Track 
>>> (http://wiki.basho.com/The-Riak-Fast-Track.html) for more details.
>>> 
>>> Ian Plosker
>>> Developer Advocate
>>> Basho Technologies
>>> 
>>> On Aug 26, 2011, at 3:17 PM, Lukas Schulze wrote:
>>> 
>>>> I'm doing some simple tests with Riak and tried to build something like an 
>>>> index.
>>>> Therefore I created new buckets for some attributes like "name", "street" 
>>>> and "city".
>>>> One entry in the index-bucket "name" is for example "Mueller" and the 
>>>> value contains all user ids, formatted as a JSON string: 
>>>> "{id:[1,5,8,13,2,7]}"
>>>> The java objects are saved as JSON strings in a separate bucket "users", 
>>>> the keys in this bucket are the user-ids, the values are the JSON strings.
>>>> 
>>>> When I add 200 users via Java and the RiakPBC client, in every loop 
>>>> iteration I fetch the index, add the new user id and store it again in Riak.
>>>> But Java is too fast, so I receive an old version of the bucket.
>>>> 
>>>> Because I've only one node I set the n-value to 1, r = 1, w = 1 and dw = 1.
>>>> But I have to wait nearly 2 seconds to be mostly sure to get the correct 
>>>> response. (the computer isn't a high-end machine ;-) )
>>>> 
>>>> Is it possible to be sure that the data will be saved permanently and I 
>>>> can continue adding users?
>>>> Are there any caching methods I can configure?
>>>> Can I set the default n-value to 1 so that every newly created bucket will 
>>>> have this value?
>>>> Does Riak have any kind of indexes or is it possible to implement it a 
>>>> better way?
>>>> 
>>>> In my first version I saved all users in one bucket and iterated over all 
>>>> of them to find the correct one. But for every single request from the 
>>>> Java Service to Riak it took nearly 200ms. For a huge amount of entries 
>>>> (10,000) this isn't practical. Therefore I tried to implement my own 
>>>> indexes.
>>>> 
>>>> The main focus of my question is getting rid of the inconsistent reads.
>>>> 
>>>> Thank you.
>>>> 
>>>> Best Regards
>>>> Lukas
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> [email protected]
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
