We use MalStone and TeraSort. For Hive, you can use TPC-H, at least the data 
and the queries, if not the query generator. There is a Jira issue in Hive that 
discusses the TPC-H "benchmark" if you're interested. Sorry, I don't remember 
the issue number offhand.

-----Original Message-----
From: Shrinivas Joshi [mailto:[email protected]] 
Sent: Friday, February 18, 2011 3:32 PM
To: [email protected]
Subject: benchmark choices

Which workloads are used for serious benchmarking of Hadoop clusters? Do you 
care about any of the following workloads:
TeraSort, GridMix v1, v2, or v3, MalStone, CloudBurst, MRBench, NNBench, sample 
apps shipped with the Hadoop distro like PiEstimator, dbcount, etc.

Thanks,
-Shrinivas
