Neat benchmark. I've been meaning to do exactly that myself. And that is a surprise about Pipes!

Thanks for the data.
- Aaron

Owen O'Malley wrote:
I set up a little benchmark on a 39-node cluster to sort 40 GB of random text data (generated by RandomTextWriter with key length 1-10 words and value length 0-200 words, data uncompressed). The runtimes (minutes:seconds) are:

Java:          4:22
C++ (Pipes):   3:50
Streaming:     4:44

I was surprised to find that Pipes outperformed Java, even with the extra process. I suspect it was because of the buffering between the input and output of Pipes.
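
To put the gaps in perspective, here is a rough sketch that converts the three runtimes to seconds and expresses each relative to the Java run. It assumes the figures are minutes:seconds; the helper names (to_seconds, relative_to_java) are made up for illustration and are not part of any Hadoop tool.

```python
# Runtimes as reported above, assumed to be in minutes:seconds.
runtimes = {"Java": "4:22", "C++ (Pipes)": "3:50", "Streaming": "4:44"}


def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Use the Java run as the baseline for the comparison.
baseline = to_seconds(runtimes["Java"])


def relative_to_java(name: str) -> float:
    """Percent difference from the Java runtime (negative means faster)."""
    return (to_seconds(runtimes[name]) - baseline) / baseline * 100


for name in runtimes:
    secs = to_seconds(runtimes[name])
    print(f"{name:12s} {secs:4d} s  {relative_to_java(name):+.1f}% vs Java")
```

Under that assumption, Pipes comes out about 12% faster than Java and Streaming about 8% slower. This is a single run per configuration, so small differences should be read with caution.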

-- Owen
