Hi Owen,

Can you provide more details of your test?  In particular, what was the Java
Map-reduce program that you ran?  Was it
src/examples/org/apache/hadoop/examples/Sort.java ?  Also, I can't find
anything called "RandomTextWriter" in the source tarball, can you point me
to it?  Thanks.

- Doug

On Nov 8, 2007 5:03 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote:

> I set up a little benchmark on a 39-node cluster to sort 40 GB of
> random text data (generated by RandomTextWriter using key length:
> 1-10 words and value length: 0-200 words, data uncompressed). The
> runtimes (minutes:seconds) are:
>
> Java:                   4:22
> C++ (Pipes):            3:50
> Streaming:              4:44
>
> I was surprised to find that Pipes outperformed Java, even with the
> extra process. I suspect it was because of the buffering between the
> input and output of Pipes.
>
> -- Owen
>
