On Apr 23, 2007, at 7:39 AM, Steve Schlosser wrote:

I've got a small Hadoop cluster running (5 nodes today, going to 15+
soon), and I'd like to do some benchmarking.  My question to the group
is: what is the first benchmark you run on a new cluster?

I usually use random-writer to generate some random data (it defaults to 10 GB per node) and then use sort to sort it. Sort provides a decent, simple test case for moving a lot of data through map/reduce.
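
For reference, both jobs ship in the Hadoop examples jar, so kicking them off looks roughly like this (a sketch only: the exact jar name and the output paths depend on your version and setup):

    hadoop jar hadoop-*-examples.jar randomwriter /benchmarks/random-data
    hadoop jar hadoop-*-examples.jar sort /benchmarks/random-data /benchmarks/sorted-data

The first command writes the random data into HDFS; the second reads it back, sorts it through the full map/reduce path, and writes the sorted output, which is what makes it a useful end-to-end exercise of the cluster.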

-- Owen
