Hi all, I would like to have a reducer generate a MapFile so that later processes can look up the values associated with a few keys without processing an entire sequence file. However, if I have N reducers, I will generate N different MapFiles, so to pick the right MapFile I will need to use the same partitioner that was used to partition the keys across reducers (the reducer I have running emits one value for each key it receives and no others). Should this be done manually, i.e., something like readers[partitioner.getPartition(...)], or is there another recommended method?
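To make the manual approach concrete, here is a minimal sketch of what I have in mind. It assumes the job used the default hash partitioning formula, and it uses in-memory TreeMaps as hypothetical stand-ins for the N MapFile readers (real code would open one MapFile.Reader per reducer output instead). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative stand-in only: each TreeMap plays the role of one reducer's MapFile.
class PartitionedLookup {

    // Same formula as Hadoop's default HashPartitioner:
    // mask off the sign bit, then take the remainder by the partition count.
    static int getPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // "Write" side: route every key/value pair to the partition its key hashes to,
    // mimicking how keys are distributed across N reducers.
    static List<TreeMap<String, String>> write(String[][] pairs, int numPartitions) {
        List<TreeMap<String, String>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new TreeMap<>());
        }
        for (String[] kv : pairs) {
            parts.get(getPartition(kv[0], numPartitions)).put(kv[0], kv[1]);
        }
        return parts;
    }

    // "Read" side: recompute the partition with the SAME function and the SAME
    // partition count, then consult only that one store.
    static String lookup(List<TreeMap<String, String>> parts, String key) {
        return parts.get(getPartition(key, parts.size())).get(key);
    }

    public static void main(String[] args) {
        String[][] pairs = {{"apple", "1"}, {"banana", "2"}, {"cherry", "3"}};
        List<TreeMap<String, String>> parts = write(pairs, 4);
        System.out.println(lookup(parts, "banana"));
    }
}
```

The lookup only works if the reader side uses exactly the same partition function and the same number of partitions as the job that wrote the files, and if the readers are ordered by partition number.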
Eventually, I'm going to migrate to HBase to store the key/value pairs (since I'd like to take advantage of HBase's ability to cache common pairs in memory for faster retrieval), but I'm interested in seeing what the performance is like using just MapFiles. Thanks, Chris
