[HADOOP] Terasort for numbers

Teodor Macicas Sun, 01 Aug 2010 14:24:14 -0700

Hi all,

I am using Hadoop 0.20.2 and I want to sort a huge amount of data. I've read about Terasort [from examples], but it currently uses 10-byte char keys. Changing the keys from char to integer wasn't a good solution, as Terasort builds a trie for creating total order partitions. I got stuck when I tried to change the char trie into one suitable for number keys.

Then I gave Sort [also from examples] a try, and it did work for integer keys, but without total order partitioning. At the end of the day, the final result cannot be created just by putting together all the reducers' outputs: each reducer sorts only a subset of the data, and no merging occurs between reducers.
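To make the requirement concrete, here is a minimal sketch of what I mean by total order partitioning for numeric keys. It is plain Java, not Hadoop API code, and the class and method names are made up for illustration. It assumes the split points are taken from a small sample of the keys and that a key is routed to a partition by binary search, so that every key in partition i is smaller than every key in partition i+1.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: range partitioning for double keys.
class RangePartitionSketch {

    // Choose numPartitions - 1 split points from a sample of the keys.
    static double[] computeSplitPoints(double[] sample, int numPartitions) {
        double[] sorted = sample.clone();
        Arrays.sort(sorted);
        double[] splits = new double[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            // Take evenly spaced elements of the sorted sample.
            int index = (int) ((long) i * sorted.length / numPartitions);
            splits[i - 1] = sorted[index];
        }
        return splits;
    }

    // Partition p receives keys with splits[p - 1] <= key < splits[p].
    static int getPartition(double key, double[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }
}
```

With a partitioning like this, each reducer would receive one contiguous key range, so concatenating the reducers' outputs in partition order would give a globally sorted result.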

Can anyone please advise me on what to use, and how, in order to sort a huge amount of real numbers?

Looking forward to your replies.


Thank you.
Best,
Teodor
