Hello,

I'm benchmarking several NoSQL databases (see www.nosqlbenchmarking.com for
current results and the configurations used) for my master's thesis, and I'm
going to run this benchmark on bigger clusters. So far I have only used a
small cluster of 8 servers with a very small data set (20000 articles from
Wikipedia) to conduct those tests.

I will use up to 100 servers (2 GB RAM, 4 CPUs, 80 GB HDD) from the Rackspace
cloud, and the new data set is the entire English version of Wikipedia. Each
article is stored as a single document with a unique ID based on an integer.
You can see the implementation here:
https://github.com/toflames/Wikipedia-noSQL-Benchmark/blob/master/src/implementations/riakDB.java
and the benchmark methodology here:
http://www.slideshare.net/ThibaultDory/a-new-methodology-for-large

I would like to know if any of you have advice on how I could get the best
out of Riak for this specific use case and on this kind of server. For
example, I would like to know whether there are memory/cache tunings that
would be useful to match this server size.

Any other input or criticism is welcome.

Thank you,


Thibault Dory
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
