Uruka,

Now that you have some reasonable numbers, it is probably time to discuss what you are trying to get out of Riak. We typically recommend a minimum of 4 or 5 nodes for a Riak install, because that is the point where distribution becomes a performance benefit rather than a hindrance. I know you were just load testing, but I'd recommend a test with 4 or 5 nodes and the default N value. During the test, remove a node (power it off, or 'riak stop' it). Or, as someone else mentioned, start with a 3 or 4 node cluster and add a node, to see how performance goes up with no further operations work needed to rebalance the data around the cluster.

This is really where Riak shines over some alternative databases: the ease of scaling and of dealing with failures. Single-node performance, although fun to tune, isn't as interesting over a long timeline when you are trying to scale the system. Single-node performance is still important, don't get me wrong. Riak isn't always the best choice, but when it comes to staying available and performing while systems are failing, no other system has a better real-world story than Riak.
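As a rough illustration of why cluster size matters with the default N value of 3, here is a back-of-the-envelope sketch in Python. It assumes replicas are spread perfectly evenly across nodes and ignores coordination and network overhead, so treat the numbers as an approximation only. The function name is made up for this example and is not part of Riak.

```python
def replica_writes_per_node(n_val: int, node_count: int) -> float:
    """Average number of replica writes each node handles per client put.

    Assumption: replicas are distributed evenly across all nodes. Real
    placement depends on the ring layout, so this is only an estimate.
    """
    if n_val < 1 or node_count < 1:
        raise ValueError("n_val and node_count must be positive")
    return n_val / node_count


if __name__ == "__main__":
    # Default N value of 3, for the cluster sizes mentioned above.
    for nodes in (3, 4, 5):
        share = replica_writes_per_node(3, nodes)
        print(f"{nodes} nodes, n_val=3: {share:.2f} replica writes per node per put")
```

With 3 nodes every node has to write every object, while with 5 nodes each node only handles about three fifths of the puts, which is one way to see where distribution starts to pay off.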
If you still want to get your single-node performance up, we have several pages in our docs about tuning. A good place to start is the file system tuning page: http://docs.basho.com/riak/latest/cookbooks/File-System-Tuning/ . Reading that and the other pages in the Operations section might help in squeezing out those last bits of speed.

I am glad to see your initial 60 writes/sec has gone up to 800 writes/sec, but we can definitely do better once you start utilizing our strengths.

Hope my rambling helped,
-Jared

On Sat, Nov 3, 2012 at 4:55 AM, Uruka Dark <[email protected]> wrote:

> Jared,
>
> Thank you for your time and reply.
>
> I was impressed by your numbers, so I started to double-check my settings.
> I found a big problem here: my third machine (the one outside the cluster,
> generating the load) was not talking to Riak at gigabit speed; it was at
> 100 Mbps. I changed the network cable and it's working fine now.
> I ran my Python script again and could already see better results: 252
> ops/sec (before the fix it was 175 ops/sec).
>
> I also ran your benchmark .config, and these are my numbers:
> https://dl.dropbox.com/u/308392/summary.png
>
> As you can see, I'm still far from your results, not even close,
> and now I'm using Bitcask.
> Anyway, my current position is much better than at the beginning. I'll
> double-check everything again, because now I have confirmation that
> something is wrong.
>
> If you have any suggestions, please let me know.
> Once again, thank you.
>
> On Sat, Nov 3, 2012 at 3:08 AM, Jared Morrow <[email protected]> wrote:
>
>> I forgot to mention that the 2000 ops/sec was on Bitcask, not memory. I
>> didn't bother with the memory backend.
>>
>> -Jared
>>
>>
>> On Sat, Nov 3, 2012 at 12:05 AM, Jared Morrow <[email protected]> wrote:
>>
>>> Uruka,
>>>
>>> Looking at your results, something is really wrong with your setup. I
>>> was surprised by your numbers, so I made two VMs, each with only 1 GB of
>>> RAM, on two different boxes, also on a 1 Gb switch.
>>>
>>> I ran a put of 100,000 keys at 10 KB in size.
>>>
>>> I didn't do any tuning at all on the VMs; these were quick Ubuntu
>>> 10.04 VMs with 2 virtual CPUs and 1 GB of RAM. I also didn't change any
>>> settings in Riak, except for the IP address and listening ports.
>>>
>>> Here is the summary of the results, showing around 2000 ops/sec:
>>> https://dl.dropbox.com/u/183971/summary.png
>>>
>>> So my main thought is that you weren't actually using N=1 for your puts
>>> and were using the default N value of 3, meaning you were writing each
>>> key/value 3 times. With 2 nodes, that means a lot of writes to the
>>> same disk multiple times.
>>>
>>> To be sure you have N=1, you can use 'riak attach' on each node and
>>> enter the following command:
>>>
>>> riak_core_bucket:set_bucket(<<"pop1">>,[{n_val,1}]).
>>>
>>>
>>> That is if your bucket name is "pop1", as in my case. The name is
>>> completely arbitrary.
>>>
>>> Sorry I'm late to this thread; I had to find some time to set up the test.
>>>
>>> For reference, I used https://github.com/basho/basho_bench for the
>>> benchmark, with the following .config file:
>>> https://gist.github.com/e630b63f4a025a0fb634
>>>
>>> Hope this helps,
>>> Jared
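For anyone who would rather not use 'riak attach', the same n_val bucket property can, as far as I know, also be set through Riak's HTTP interface. The Python sketch below only builds the request and does not send it. The host, the port (8098 is assumed here as the HTTP port), and the helper name are assumptions for illustration, and the /buckets/<bucket>/props path should be checked against the docs for your Riak version.

```python
import json
import urllib.parse
import urllib.request


def build_n_val_request(bucket, n_val, host="127.0.0.1", port=8098):
    """Build (but do not send) an HTTP PUT that sets n_val on a bucket.

    Assumptions: Riak's HTTP interface is reachable at host:port and
    serves bucket properties under /buckets/<bucket>/props.
    """
    path = "/buckets/{}/props".format(urllib.parse.quote(bucket, safe=""))
    body = json.dumps({"props": {"n_val": n_val}}).encode("utf-8")
    return urllib.request.Request(
        "http://{}:{}{}".format(host, port, path),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_n_val_request("pop1", 1)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply it against a running node, uncomment the next line:
    # urllib.request.urlopen(req)
```

Set the property before loading data for the benchmark, since changing n_val on a bucket that already holds data affects how existing keys are found.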
