Bwolen Yang wrote:
Given that map/reduce produces a partitioned set of sorted output
files, I was wondering if a map implementation exists for doing
lookups or iterating through subranges of these files.

This would be similar to the Java SortedMap, except
 - the map is read-only,
 - it works with data on disk (instead of in memory),
 - for lookups, it should know how to trade off seeks (with binary
   search) against sequential disk reads.

http://lucene.apache.org/hadoop/api/org/apache/hadoop/io/MapFile.html
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/MapFileOutputFormat.html
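
For readers who want to see the seek-versus-scan tradeoff concretely, here is a minimal sketch of the general idea: keep every Nth key of a sorted file in memory, binary-search that sparse index, then seek once and scan forward. It is not the MapFile implementation and does not use the Hadoop API. The class name SparseIndexedFile, the tab-separated "key<TAB>value" line format, and the interval parameter are all assumptions made up for this illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical read-only map over a file of sorted "key<TAB>value" lines.
 * Assumes ASCII data, well-formed lines, and keys sorted by String.compareTo.
 */
class SparseIndexedFile {
    private final String path;
    // Sparse in-memory index: every Nth key and the byte offset of its line.
    private final List<String> indexKeys = new ArrayList<>();
    private final List<Long> indexOffsets = new ArrayList<>();

    /** Scans the file once and remembers every interval-th key. */
    SparseIndexedFile(String path, int interval) throws IOException {
        this.path = path;
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            long offset = in.getFilePointer();
            String line;
            for (int n = 0; (line = in.readLine()) != null; n++) {
                if (n % interval == 0) {
                    indexKeys.add(line.substring(0, line.indexOf('\t')));
                    indexOffsets.add(offset);
                }
                offset = in.getFilePointer();
            }
        }
    }

    /** Returns the value stored for key, or null if the key is absent. */
    String get(String key) throws IOException {
        // Binary search in memory; no disk access yet.
        int pos = Collections.binarySearch(indexKeys, key);
        if (pos < 0) {
            pos = -pos - 2; // last indexed key that sorts before the target
        }
        if (pos < 0) {
            return null; // target sorts before the first key in the file
        }
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            in.seek(indexOffsets.get(pos)); // one seek
            String line;
            // Sequential scan of at most 'interval' entries.
            while ((line = in.readLine()) != null) {
                int tab = line.indexOf('\t');
                int cmp = line.substring(0, tab).compareTo(key);
                if (cmp == 0) {
                    return line.substring(tab + 1);
                }
                if (cmp > 0) {
                    return null; // passed the spot where the key would be
                }
            }
        }
        return null;
    }
}
```

A larger interval means a smaller in-memory index but a longer sequential scan after each seek; a smaller interval trades memory for shorter scans. A subrange iteration would work the same way: locate the start key as above, then keep reading until the end key is passed.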

Doug
