It depends. What data is going into the table, and what keys will drive the lookup?
Let's suppose that you have a single JSON file with a reasonable number of key/value tuples. You could easily load a Hashtable that associates the integer keys with the values (which appear to be lists of integers). Each task in your MapReduce job could then process each input tuple, look up its key, and append the matching values to the output records. That is a perfectly fine thing to do in MapReduce.

In this model, the JSON file is effectively a constant singleton table for the entire MapReduce job. You can just load it from HDFS or any other file system. Specifying it as a cached file may improve performance somewhat.

If you explain your intent, we might be able to help better.

john

From: jamal sasha [mailto:[email protected]]
Sent: Monday, January 07, 2013 4:21 PM
To: [email protected]
Subject: Binary Search in map reduce

Hi,
I have data in json format like:
{key:[values.....]}
key, values are longints.
Now, I want to do a fast lookup of a key. How would I implement a binary search in map reduce abstraction? Or am I not thinking about this correctly?
Any suggestions/advice?
Thanks
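
For reference, the lookup-table approach suggested in the reply could be sketched roughly as follows. This is only an illustration in plain Python, not Hadoop code: the function names (load_table, map_record, binary_search) are made up for the example, and it assumes the JSON file is small enough to fit in each task's memory. In a real job the table would presumably be loaded once per task rather than once per record. A binary search over the sorted keys is included only for comparison with the original question; a hash lookup is usually enough.

```python
import json
from bisect import bisect_left


def load_table(json_text):
    """Parse JSON of the form {"key": [values...]} into a dict keyed by int.

    JSON object keys are always strings, so they are converted to int here.
    """
    raw = json.loads(json_text)
    return {int(k): v for k, v in raw.items()}


def map_record(table, key, payload):
    """Hypothetical map step: look up the key and attach its values.

    Returns a list of output records, empty when the key is not in the table.
    """
    values = table.get(key)
    if values is None:
        return []
    return [(key, payload, values)]


def binary_search(sorted_keys, key):
    """Return the index of key in sorted_keys, or -1 if it is absent."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return -1
```

The dict gives constant-time lookups on average, so an explicit binary search only matters if the keys are kept in a sorted array, for example to save memory.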
