There is a special class for storing key/value maps in HDFS: take a look at MapFile.
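A minimal sketch of what that looks like, assuming the classic MapFile.Writer/Reader API and using the local filesystem and a made-up path for illustration (in practice you would point the FileSystem at HDFS):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf); // local FS for the sketch; use HDFS in a real job
        String dir = "/tmp/side-data.map";         // hypothetical location for the side data

        // Keys must be appended in sorted order; MapFile keeps an index
        // alongside the data so lookups by key are cheap.
        MapFile.Writer writer = new MapFile.Writer(conf, fs, dir, Text.class, Text.class);
        writer.append(new Text("alpha"), new Text("extra info for alpha"));
        writer.append(new Text("beta"),  new Text("extra info for beta"));
        writer.close();

        // In a mapper you would open the reader once (e.g. in configure())
        // and do a get() per key/value pair you process.
        MapFile.Reader reader = new MapFile.Reader(fs, dir, conf);
        Text value = new Text();
        reader.get(new Text("beta"), value);
        System.out.println(value);
        reader.close();
    }
}
```

The point of MapFile over a plain SequenceFile is the index: the mapper can look up just the entries it needs instead of scanning the whole file.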

Also, any mapper can contact an outside resource such as a database.  This
is, however, very bad practice, since the load on the outside resource can
skyrocket as your cluster grows.  If the cost of the request is small
relative to the work done in the map, this might be tolerable, but maps are
often very cheap to do, which means that any dependence on a single
external resource can be very bad for throughput.
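If you do have to touch an external resource, the usual way to keep the load down is to cache lookups per mapper.  A runnable sketch of that pattern in plain Java, with a hypothetical remoteLookup() standing in for the database call:

```java
import java.util.HashMap;
import java.util.Map;

public class SideDataCache {
    // Hypothetical stand-in for a call to the external resource (e.g. a database).
    static String remoteLookup(String key) {
        return "value-for-" + key;
    }

    // Cache kept for the lifetime of the mapper, so N map() calls cost one
    // remote request per distinct key rather than one per record.
    static final Map<String, String> cache = new HashMap<String, String>();

    static String lookup(String key) {
        String v = cache.get(key);
        if (v == null) {
            v = remoteLookup(key);  // external resource is hit only on a cache miss
            cache.put(v == null ? key : key, v);
        }
        return v;
    }

    public static void main(String[] args) {
        // Six records but only two distinct keys: the external resource
        // is contacted twice, not six times.
        String[] records = {"a", "b", "a", "a", "b", "a"};
        for (String r : records) {
            System.out.println(r + " -> " + lookup(r));
        }
        System.out.println("cache size: " + cache.size());
    }
}
```

This only helps when keys repeat within a mapper's input split; if every record carries a distinct key, you are back to one remote request per record and should pre-stage the data in HDFS instead.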


On 7/16/07 6:19 AM, "novice user" <[EMAIL PROTECTED]> wrote:

> 
> Hi All,
>  I am new to hadoop and learning how to use it. I have a problem which can
> be solvable using map-reduce technique. But, in my map step, I need to
> consider some extra information which depends on  the input key,value pair.
> Can some one please help me what is the good way of taking this data? I am
> thinking of storing it in some HDFS and map code try to load it whenever it
> is processing a particular key, value pair. any other better approaches,
> please let me know.
> 
> 
> Thanks in advance
