RE: Understanding read_repairs

Belai Beshah Mon, 25 Feb 2013 13:58:55 -0800

Thanks Jared for the detailed instructions. I was able to build master and
patch 1.3 on my build VM. I will upgrade our test cluster with the patches and
report back on how our testing goes.

________________________________
From: Jared Morrow [[email protected]]
Sent: Friday, February 22, 2013 11:56 AM
To: Belai Beshah
Cc: Russell Brown; [email protected]
Subject: Re: Understanding read_repairs

Belai,

One other option is to use our "basho-patches" functionality. We use it to run 
new code on current installations where sending a new .beam file is easier than 
remaking the packages or compiling from source. On your Ubuntu system using our
packages, the folder should be at /usr/lib/riak/lib/basho-patches.

To do this you just need the one file changed in the PR pointed to by Russell.

Here are the steps to make that happen:

 *   Install Erlang R15B01:
     http://docs.basho.com/riak/latest/tutorials/installation/Installing-Erlang/
 *   Get riak_kv: git clone git://github.com/basho/riak_kv.git
 *   Compile riak_kv with just 'make'
 *   Copy the resulting .beam file from the ebin folder to each machine that
     needs the new file:
     scp ebin/riak_kv_vnode.beam user@myriaknode:/usr/lib/riak/lib/basho-patches
 *   Stop and restart the nodes, one at a time
 *   If you want to convince yourself you are using the new code, you can do a
     'riak attach' to attach to the node and run code:which('riak_kv_vnode').
     (Don't forget the '.' at the end)

For example on my dev install here is the command before the file is in 
basho-patches:

([email protected])1> code:which('riak_kv_vnode').
".../lib/riak_kv-1.3.0/ebin/riak_kv_vnode.beam"

Here is the command after I put the .beam in the basho-patches directory:

([email protected])1> code:which('riak_kv_vnode').
".../lib/basho-patches/riak_kv_vnode.beam"

Notice the path of the code changed from .../riak_kv-1.3.0/... to 
.../basho-patches/...
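
As a rough sketch only (this is not Basho tooling), the steps above could be
scripted along the following lines. The run helper, the DRY_RUN variable, the
use of ssh for the restart, and the user@myriaknode host are assumptions made
for illustration; the clone URL and the basho-patches path are the ones given
above. With DRY_RUN left at its default, the functions only print the commands
they would run.

```shell
#!/bin/sh
# Sketch of the basho-patches steps. Nothing runs at load time; the
# functions below are meant to be called by hand, one node at a time.

# Path taken from the message above (Ubuntu packages).
PATCH_DIR="/usr/lib/riak/lib/basho-patches"
BEAM="ebin/riak_kv_vnode.beam"

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch and compile riak_kv on a build machine that has Erlang installed.
build_beam() {
    run git clone git://github.com/basho/riak_kv.git
    run make -C riak_kv
}

# Copy the patched .beam to one node ($1 is user@host, an assumption).
deploy_beam() {
    run scp "riak_kv/$BEAM" "$1:$PATCH_DIR"
}

# Restart one node over ssh (assumes ssh access and riak on the PATH).
restart_node() {
    run ssh "$1" "riak stop && riak start"
}
```

A typical use would be build_beam once, then deploy_beam and restart_node for
each node in turn, checking code:which('riak_kv_vnode'). from 'riak attach'
before moving on to the next node.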

That might seem like a lot of work, but it is a really handy and useful 
trick/skill that you might use quite a bit down the road.

Hope that helps,
Jared

On Fri, Feb 22, 2013 at 10:25 AM, Belai Beshah <[email protected]> wrote:
Thanks Russell, that looks like exactly the problem we saw. I have never built
Riak from source before but I will give it a try this weekend.

________________________________________
From: Russell Brown [[email protected]]
Sent: Friday, February 22, 2013 1:24 AM
To: Belai Beshah
Cc: [email protected]
Subject: Re: Understanding read_repairs

Hi,
Thanks for trying Riak.

On 21 Feb 2013, at 23:48, Belai Beshah <[email protected]> wrote:

> Hi All,
>
> We are evaluating Riak to see if it can be used to cache large blobs of data. 
> Here is our test cluster setup:
>
>       • six dedicated Ubuntu 12.04 LTS nodes, each with an 8-core 2.6 GHz
> CPU, 32 GB RAM, and 3.6 TB of disk
>       • {pb_backlog, 64},
>       • {ring_creation_size, 256},
>       • {default_bucket_props, [{n_val, 2}, 
> {allow_mult,false},{last_write_wins,true}]},
>       • using bitcask as the backend
>
> Everything else is default except the above. There is an HAProxy load balancer
> in front of the nodes that the clients talk to, configured according to the
> Basho wiki. Due to the nature of the application we are integrating, we do
> about 1200 writes/s of approximately 40-50 KB each and read them back almost
> immediately. We noticed a lot of read repairs, and since that was one of the
> things that could indicate a performance problem, we got worried. So we wrote
> a simple Java client application that simulates our use case. The test program
> is dead simple:
>       • generate keys using random UUID and value using Apache commons 
> RandomStringUtils
>       • create a thread pool of 5 and store key/value using “bucket.store()”
>       • read the values back using “bucket.fetch()” multiple times
> I could provide the spike code if needed. What we noticed is that we get a
> lot of read repairs all over the place. We even made it use a single thread
> to read/write, played with the write/read quorum, and even put a delay of 5
> minutes between the writes and the start of the reads to give the cluster
> time to become eventually consistent. Nothing helps; we always see a lot of
> read repairs, sometimes as many as the number of inserts.

It sounds like you are experiencing this bug:
https://github.com/basho/riak_kv/pull/334

It is fixed in master, but it doesn't look like it made it into 1.3.0. If 
you're ok with building from source, I tried it and a patch from 
8895d2877576af2441bee755028df1a6cf2174c7 goes cleanly onto 1.3.0.
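
A rough sketch of one way to do that in an existing riak_kv clone. Using
cherry-pick is an assumption (the message only says the patch goes on
cleanly), as are the release tag name 1.3.0 (check `git tag`), the branch
name, and the run helper, which only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: put the fix commit on top of the 1.3.0 release in a riak_kv clone.
FIX_COMMIT="8895d2877576af2441bee755028df1a6cf2174c7"  # from the message above
BASE_TAG="${BASE_TAG:-1.3.0}"   # assumed tag name; verify with `git tag`

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 is the path to an existing riak_kv clone.
apply_fix() {
    # Start a local branch from the release, then apply the single fix commit.
    run git -C "$1" checkout -b "patched-$BASE_TAG" "$BASE_TAG"
    run git -C "$1" cherry-pick "$FIX_COMMIT"
    # Rebuild so ebin/ contains the patched .beam files.
    run make -C "$1"
}
```

For example, `apply_fix riak_kv` prints the three commands; rerun with
DRY_RUN=0 once they look right.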

Cheers

Russell

> The good thing is that in all of these tests we have not seen any read
> failures. Performance is also not bad: there are a few maximums here and
> there that we don't like, but 90% looks good. Even when we killed a node,
> the reads were still successful.
>
> We are wondering what the expected ratio of read repairs is, how long it
> should reasonably take before the cluster no longer has to resort to
> read_repair to fulfill a read request, and whether there is something we are
> missing in our setup.
>
> Thanks
> Belai
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
