Re: benchmark choices

Konstantin Boudnik Fri, 18 Feb 2011 14:51:02 -0800

On Fri, Feb 18, 2011 at 14:35, Ted Dunning <[email protected]> wrote:
> I just read the MalStone report.  They report times for a Java version that
> is 5x slower than a streaming implementation.  That single
> fact indicates that the Java code is so appallingly bad that this is a very
> bad benchmark.


Slow Java code? That's funny ;) Running with HotSpot on, by any chance?

> On Fri, Feb 18, 2011 at 2:27 PM, Jim Falgout <[email protected]> wrote:
>
>> We use MalStone and TeraSort. For Hive, you can use TPC-H, at least the
>> data and the queries, if not the query generator. There is a Jira issue in
>> Hive that discusses the TPC-H "benchmark" if you're interested. Sorry, I
>> don't remember the issue number offhand.
>>
>> -----Original Message-----
>> From: Shrinivas Joshi [mailto:[email protected]]
>> Sent: Friday, February 18, 2011 3:32 PM
>> To: [email protected]
>> Subject: benchmark choices
>>
>> Which workloads are used for serious benchmarking of Hadoop clusters? Do
>> you care about any of the following workloads:
>> TeraSort, GridMix v1, v2, or v3, MalStone, CloudBurst, MRBench, NNBench,
>> sample apps shipped with the Hadoop distro, like PiEstimator, dbcount, etc.
>>
>> Thanks,
>> -Shrinivas
>>
>>
>
