Hello everybody, I am trying to benchmark a Hadoop cluster with regard to the throughput of pipelined MapReduce jobs. While looking for benchmarks, I found the "Gridmix" benchmark that is supplied with Hadoop. Its README file says that part of this benchmark is a "Three stage map/reduce job".
As this seems to match my needs, I was wondering whether it is possible to configure "Gridmix" to run only this job (without the rest of the "Gridmix" benchmark). Or do I have to build my own benchmark? If so, which classes does this "Three stage map/reduce job" use? Thanks for any help! David