Hello,

I'm benchmarking various NoSQL databases for my master's thesis (see www.nosqlbenchmarking.com for the current results and the configurations used), and I'm going to run this benchmark on bigger clusters. So far I have only used a small cluster of 8 servers with a very small data set (20,000 articles from Wikipedia) to conduct those tests.
I will use up to 100 servers (2 GB of RAM, 4 CPUs, 80 GB HDD each) from the Rackspace cloud, and the new data set is the entire English version of Wikipedia. Each article is stored as a single document with a unique integer-based ID. You can see the implementation here: https://github.com/toflames/Wikipedia-noSQL-Benchmark/blob/master/src/implementations/riakDB.java and the benchmark methodology here: http://www.slideshare.net/ThibaultDory/a-new-methodology-for-large

I would like to know whether any of you have advice on how to get the best out of Riak for this specific use case and on this kind of server. For example, are there memory or cache tunings that would suit this server size?

Any other input or criticism is welcome.

Thank you,
Thibault Dory
_______________________________________________ riak-users mailing list [email protected] http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
