Hi,
Any suggestions on how to do that? Let's say I have several part-NNNN
MapFiles created by MapFileOutputFormat using a specified Comparator
and Partitioner. How can I traverse the data in strictly ascending
global key order (i.e. across all parts)?
The best that comes to my mind is the following pseudo-code:
get the readers;
get the first key from each reader, and put the (key, reader) pairs
on a list sorted by key;
while (the list is not empty) {
    remove the pair with the smallest key, and retrieve the value
    from its reader;
    read the next key from the same reader;
    if that reader had another key, put the new (key, reader) pair
    back on the sorted list; otherwise drop the reader;
}
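To make the idea concrete, here is a minimal sketch of it in Java. It is
not tied to the Hadoop API: each part is modeled as a plain Iterator over
(key, value) entries that are already sorted by the same Comparator, as a
stand-in for reading a MapFile sequentially. The names GlobalOrderMerge,
Head, kv and merge are made up for illustration, and a PriorityQueue
replaces the sorted list from the pseudo-code. It also collects the
result in a List only to keep the example small; real code would
probably hand each entry to a callback instead.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: a k-way merge over parts that are each sorted
// by the same Comparator. An Iterator stands in for a part reader.
class GlobalOrderMerge {

    // One heap element: the current entry of a part, plus the part it came from.
    private static final class Head<K, V> {
        final Map.Entry<K, V> entry;
        final Iterator<Map.Entry<K, V>> source;

        Head(Map.Entry<K, V> entry, Iterator<Map.Entry<K, V>> source) {
            this.entry = entry;
            this.source = source;
        }
    }

    // Small helper to build an immutable (key, value) entry.
    static <K, V> Map.Entry<K, V> kv(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Returns all entries from all parts in ascending key order.
    static <K, V> List<Map.Entry<K, V>> merge(
            List<Iterator<Map.Entry<K, V>>> parts, Comparator<? super K> cmp) {
        // The heap always holds at most one entry per part: its smallest unread one.
        PriorityQueue<Head<K, V>> heap = new PriorityQueue<>(
                (a, b) -> cmp.compare(a.entry.getKey(), b.entry.getKey()));

        // Seed the heap with the first entry of every non-empty part.
        for (Iterator<Map.Entry<K, V>> part : parts) {
            if (part.hasNext()) {
                heap.add(new Head<>(part.next(), part));
            }
        }

        List<Map.Entry<K, V>> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // The smallest key across all parts is at the top of the heap.
            Head<K, V> smallest = heap.poll();
            out.add(smallest.entry);
            // Refill from the same part, or drop the part once it is exhausted.
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Iterator<Map.Entry<String, Integer>>> parts = new ArrayList<>();
        parts.add(Arrays.asList(kv("a", 1), kv("d", 4)).iterator());
        parts.add(Arrays.asList(kv("b", 2), kv("c", 3), kv("e", 5)).iterator());
        System.out.println(merge(parts, Comparator.<String>naturalOrder()));
    }
}
```

With N parts the heap never holds more than N entries, so each step
costs O(log N) comparisons. This only gives the right order if every
part is sorted with the same Comparator that is passed to merge.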
Any other suggestions?
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com