There are a few things you can do to speed up your load. When you're 
writing your data, you can set both W and DW to 0 (as long as you have a way to 
check for errors). This will shave a bit of time off each write because 
you'll be throwing writes at the database without waiting for acknowledgement 
and hoping that they stick. You can also set returnbody to false. Returnbody 
defaults to true IIRC, though that may depend on the client. When returnbody 
is enabled, Riak returns the object you wrote along with the Riak-specific 
info (vclock, etc.). I don't care about these things when I'm doing a bulk 
load, so I turn that sort of thing off.
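
To make the write settings above concrete, here is a minimal sketch in Java that only builds the URL for a store request with W=0, DW=0 and returnbody=false. It assumes Riak's HTTP interface takes these as the query parameters w, dw and returnbody on /riak/<bucket>/<key>. The class name, method name, host, port, bucket and key are made up for illustration and are not taken from the benchmark code.

```java
import java.net.URI;

class RiakBulkWrite {
    // Builds the URI for storing one object over Riak's HTTP interface.
    // Assumption: the /riak/<bucket>/<key> path and the w, dw and returnbody
    // query parameters match the Riak version in use.
    // Bucket and key are not URL-encoded here, which is fine for the
    // integer IDs used in this benchmark but not for arbitrary strings.
    static URI storeUri(String host, int port, String bucket, String key,
                        int w, int dw, boolean returnBody) {
        String query = String.format("w=%d&dw=%d&returnbody=%b", w, dw, returnBody);
        return URI.create(String.format("http://%s:%d/riak/%s/%s?%s",
                host, port, bucket, key, query));
    }

    public static void main(String[] args) {
        // Hypothetical node address, bucket and article ID.
        URI uri = storeUri("127.0.0.1", 8098, "articles", "42", 0, 0, false);
        System.out.println(uri);
    }
}
```

A PUT of the article body to that URI would be the actual write. With W and DW at 0 the response says little about whether the data landed, so the error check mentioned above has to happen somewhere else, for example by reading back a sample of keys after the load.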

Depending on the type of querying you're doing, you can adjust the JavaScript 
VM settings. For example, if you aren't doing any reduce phases in your 
queries, then you can set the number of reduce VMs to 0. Since you're probably 
only doing key lookups, you can probably kill off all of the JavaScript VMs.
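
As a rough illustration of the VM settings, the fragment below shows what this might look like in the riak_kv section of app.config. The setting names (map_js_vm_count, reduce_js_vm_count, hook_js_vm_count) are assumed from the app.config shipped around this Riak release, and whether 0 is accepted should be checked against your version. It is a config fragment, not a complete file.

```erlang
%% app.config fragment: riak_kv section only, other settings omitted.
%% Assumption: these setting names match your Riak release.
{riak_kv, [
    %% No JavaScript map phases are run, so no map VMs.
    {map_js_vm_count, 0},
    %% No JavaScript reduce phases are run, so no reduce VMs.
    {reduce_js_vm_count, 0},
    %% Only safe if no JavaScript pre-commit hooks are configured.
    {hook_js_vm_count, 0}
]}
```

Each node would need a restart to pick up the change, and any JavaScript MapReduce query sent afterwards should be expected to fail.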

I suspect somebody smarter will have better input and will correct me, but 
that's my 2 cents worth. 
-- 
Jeremiah Peschka
Microsoft SQL Server MVP
MCITP: Database Developer, DBA
On Tuesday, March 8, 2011 at 8:34 AM, Thibault Dory wrote: 
> Hello,
> 
> I'm benchmarking various noSQL databases (see www.nosqlbenchmarking.com for 
> current results and configurations used) for my master's thesis and I'm going 
> to apply this benchmark on bigger clusters. Indeed for the moment I have only 
> used a small cluster of 8 servers with a very small data set (20000 articles 
> from Wikipedia) to conduct those tests. 
> 
> I will use up to 100 servers (2 GB, 4 CPUs, 80 GB HDD) from the Rackspace cloud 
> and the new data set is the entire English version of Wikipedia. Each article 
> is stored as a single document with a unique ID based on an integer; you can 
> see the implementation here : 
> https://github.com/toflames/Wikipedia-noSQL-Benchmark/blob/master/src/implementations/riakDB.java
>  and the benchmark methodology here : 
> http://www.slideshare.net/ThibaultDory/a-new-methodology-for-large 
> 
> I would like to know if some of you have advice on how I could get the best 
> out of Riak for this specific use case and on this kind of server. For 
> example, I would like to know if there are some memory/cache tunings that 
> could be useful to match this server size. 
> 
> Any other input or criticism is welcome,
> 
> Thank you,
> 
> 
> Thibault Dory 
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 