Hi,

Any suggestions on how to do this? Say I have several part-NNNN MapFiles created by MapFileOutputFormat with a specified Comparator and Partitioner. How can I traverse the data in strictly ascending global key order (i.e. across all parts)?

The best I can come up with is the following pseudo-code:

open a reader for each part;
read the first key from each reader, and put the (key, reader) pairs on a list sorted by key with the same Comparator;
while (the sorted list is not empty) {
        remove the pair with the smallest key, and retrieve the value from its reader;
        read the next key from the same reader:
                if there is one, insert (key, reader) back into the sorted list;
                otherwise that reader is exhausted, so close it;
}
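
To make the idea concrete, here is a minimal sketch of that merge in plain Java. It is only an illustration under some assumptions: each part is modelled as an Iterator over Map.Entry records that is already sorted by the same Comparator, a PriorityQueue stands in for the "sorted list", and the names KWayMerge, Head and merge are made up for this example. No Hadoop classes are used; in practice each iterator would presumably wrap one MapFile reader.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class KWayMerge {

    /** Queue entry: the current record of one part plus the iterator it came from. */
    private static final class Head<K, V> {
        final Map.Entry<K, V> record;
        final Iterator<Map.Entry<K, V>> source;

        Head(Map.Entry<K, V> record, Iterator<Map.Entry<K, V>> source) {
            this.record = record;
            this.source = source;
        }
    }

    /**
     * Merges parts that are each already sorted by cmp into one list
     * in ascending global key order.
     */
    public static <K, V> List<Map.Entry<K, V>> merge(
            List<Iterator<Map.Entry<K, V>>> parts, Comparator<? super K> cmp) {
        // Min-heap ordered by the current key of each part.
        PriorityQueue<Head<K, V>> heap = new PriorityQueue<>(
                (a, b) -> cmp.compare(a.record.getKey(), b.record.getKey()));

        // Seed the heap with the first record of every non-empty part.
        for (Iterator<Map.Entry<K, V>> part : parts) {
            if (part.hasNext()) {
                heap.add(new Head<>(part.next(), part));
            }
        }

        List<Map.Entry<K, V>> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // The smallest current key across all parts sits at the top.
            Head<K, V> smallest = heap.poll();
            out.add(smallest.record);
            // Refill from the same part; an exhausted part simply drops out.
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return out;
    }
}
```

With k parts and n records in total this does O(n log k) comparisons and holds only one record per part in the queue. The sketch collects the output in a list for simplicity; for large data one would hand each record to a callback instead. Also, if the Partitioner happens to assign disjoint, ordered key ranges to the parts, reading the parts one after another should already give global order without any merge.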

Any other suggestions?

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
