Doesn't the sorting and merging all still happen in Java-land?

-----Original Message-----
From: Owen O'Malley [mailto:[EMAIL PROTECTED] 
Sent: Thursday, November 08, 2007 5:03 PM
To: [email protected]
Subject: sort speeds under java, c++, and streaming

I set up a little benchmark on a 39-node cluster to sort 40 GB of
random text data (generated by RandomTextWriter using key length:
1-10 words and value length: 0-200 words, data uncompressed). The
runtimes in minutes are:

Java:                   4:22
C++ (Pipes):            3:50
Streaming:              4:44

I was surprised to find that Pipes outperformed Java, even with the
extra process. I suspect it was because of the buffering between the
input and output of Pipes.

-- Owen
