Bwolen Yang wrote:
Given that map/reduce produces a partitioned set of sorted output
files, I was wondering if a map implementation exists for doing
lookups or iterating through subranges of these files.
This would be similar to the Java SortedMap, except
- the map is read-only,
- works with data on disk (instead of in memory),
- for lookups, it should know how to trade off seeks (with binary search)
vs. disk reads
http://lucene.apache.org/hadoop/api/org/apache/hadoop/io/MapFile.html
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/MapFileOutputFormat.html
Doug
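
For readers following the thread: the MapFile javadoc linked above describes a map as a directory holding a sorted data file plus a smaller index file containing a fraction of the keys. The seek-versus-read trade-off raised in the question can be illustrated with a minimal, self-contained sketch of that idea. This is not Hadoop code and does not use the MapFile API; the class SparseIndexMap, its methods, and the in-memory lists standing in for the on-disk file are all hypothetical names chosen for illustration. It assumes unique keys that are already sorted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical read-only sorted map: sparse in-memory index plus a bounded scan. */
final class SparseIndexMap {
    // These two lists stand in for the sorted data file on disk.
    private final List<String> keys;
    private final List<String> values;
    // Sparse index: every interval-th key and its position in the data.
    private final List<String> indexKeys = new ArrayList<>();
    private final List<Integer> indexPos = new ArrayList<>();

    SparseIndexMap(List<String> sortedKeys, List<String> values, int interval) {
        if (interval < 1 || sortedKeys.size() != values.size()) {
            throw new IllegalArgumentException("bad interval or mismatched sizes");
        }
        this.keys = sortedKeys;
        this.values = values;
        for (int i = 0; i < sortedKeys.size(); i += interval) {
            indexKeys.add(sortedKeys.get(i));
            indexPos.add(i);
        }
    }

    /** Returns the position of the first record whose key is >= target. */
    private int seek(String target) {
        // Binary search touches only the small index (the "seek" side).
        int slot = Collections.binarySearch(indexKeys, target);
        if (slot < 0) {
            slot = -slot - 2; // last index entry that sorts before target
        }
        int pos = slot < 0 ? 0 : indexPos.get(slot);
        // Sequential scan of at most interval records (the "disk read" side).
        while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) {
            pos++;
        }
        return pos;
    }

    /** Point lookup; returns null when the key is absent. */
    String get(String key) {
        int pos = seek(key);
        return (pos < keys.size() && keys.get(pos).equals(key)) ? values.get(pos) : null;
    }

    /** Entries with from <= key < to, formatted as "key=value". */
    List<String> range(String from, String to) {
        List<String> out = new ArrayList<>();
        for (int pos = seek(from);
             pos < keys.size() && keys.get(pos).compareTo(to) < 0;
             pos++) {
            out.add(keys.get(pos) + "=" + values.get(pos));
        }
        return out;
    }
}
```

A larger interval means a smaller index in memory but a longer scan per lookup; a smaller interval shifts the cost the other way. Handling the partitioned output of a job would additionally require choosing the right partition file for a key before doing the lookup, which this sketch leaves out.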