Hello all,

I've got a small Hadoop cluster running (5 nodes today, going to 15+ soon), and I'd like to do some benchmarking. My question to the group: what is the first benchmark you run on a new cluster?
I'd like to run some simple functionality, throughput, and especially scaling experiments. So far, the programs in the examples jar (grep, wordcount, etc.) run fine. I've had less success with the programs in the test jar (DFSCIOTest, DistributedFSCheck, etc.). Are some of these deprecated? Some have very similar names; are there significant differences between them?

Thanks!
-steve
