On Apr 23, 2007, at 7:39 AM, Steve Schlosser wrote:
> I've got a small Hadoop cluster running (5 nodes today, going to 15+ soon), and I'd like to do some benchmarking. My question to the group: what is the first benchmark you run on a new cluster?
I usually use random-writer to generate some random data (it defaults to 10 GB per node) and then use sort to sort it. Sort is a decent, simple test case for moving a lot of data through map/reduce.
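
As a rough illustration of the two steps above, here is a minimal Python sketch that builds the two job command lines, times them, and turns the sort time into a throughput figure. It assumes the two programs are invoked from the examples jar as `randomwriter` and `sort`. The jar file name, the `bin/hadoop` path, and the directory names `rand` and `rand-sort` are placeholders (the jar name in particular varies by release), and the helper names are made up for this example.

```python
import subprocess
import time

# Assumption: the examples jar name differs between Hadoop releases; adjust it.
EXAMPLES_JAR = "hadoop-examples.jar"
# Assumption: the script is run from the Hadoop install directory.
HADOOP_BIN = "bin/hadoop"


def randomwriter_cmd(out_dir, hadoop=HADOOP_BIN, jar=EXAMPLES_JAR):
    """Command line for the job that writes random data into out_dir."""
    return [hadoop, "jar", jar, "randomwriter", out_dir]


def sort_cmd(in_dir, out_dir, hadoop=HADOOP_BIN, jar=EXAMPLES_JAR):
    """Command line for the job that sorts in_dir into out_dir."""
    return [hadoop, "jar", jar, "sort", in_dir, out_dir]


def timed_run(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start


def throughput_mb_per_s(total_bytes, seconds):
    """Average rate in MB/s (1 MB = 1024 * 1024 bytes)."""
    return total_bytes / (1024 * 1024) / seconds


def run_benchmark(nodes, bytes_per_node=10 * 1024**3):
    """Generate data, sort it, and return (sort_seconds, sort_mb_per_s).

    bytes_per_node reflects the 10 GB/node default mentioned above; it is
    only used for the throughput estimate, not passed to the jobs.
    """
    timed_run(randomwriter_cmd("rand"))
    sort_seconds = timed_run(sort_cmd("rand", "rand-sort"))
    return sort_seconds, throughput_mb_per_s(nodes * bytes_per_node, sort_seconds)
```

On the 5-node cluster from the question, calling `run_benchmark(nodes=5)` would push roughly 50 GB through the sort, assuming the default data size is left unchanged. Wall-clock time includes job startup, so the throughput number is only a coarse comparison point between runs.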
-- Owen
