Hi Rakhi,

I had the same requirement once, where I had to share data (read/write)
among different tasks (map/reduce).
DistributedCache and the JobConf object only hold read-only data,
so I used two approaches:

1. I used Memcached with Hadoop, so that I could store (read/write) data on
a memcached server. I can refer to this data
    from any task just by connecting to the memcached server (see the first
    sketch after this list).

2. I used Tokyo Cabinet (a BDB-like, file-based database) with Hadoop. I could
store and fetch data from it at any time
    and from any task (second sketch below).
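Here is a rough sketch of approach 1, assuming the spymemcached client and a
memcached server on localhost:11211 (the host, port, and key names are just
placeholders for your setup):

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    public class MemcachedSharedState {
        public static void main(String[] args) throws Exception {
            // Connect to the memcached server (host/port are assumptions).
            MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));

            // Any task can write; expiry 0 means the entry never expires.
            client.set("shared/progress", 0, "mapper-3:done");

            // ...and any other task can read it back.
            String value = (String) client.get("shared/progress");
            System.out.println(value);

            client.shutdown();
        }
    }

In a real job you would open the client in the task's configure()/setup()
method and shut it down in close()/cleanup(), rather than once per record.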
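And a minimal sketch of approach 2, assuming the Tokyo Cabinet Java binding
(tokyocabinet.HDB) and that the database file sits on storage every task can
reach (the path below is hypothetical):

    import tokyocabinet.HDB;

    public class TokyoCabinetSharedState {
        public static void main(String[] args) {
            HDB hdb = new HDB();
            // Open (or create) the hash database file in writer mode.
            if (!hdb.open("/shared/state.tch", HDB.OWRITER | HDB.OCREAT)) {
                System.err.println("open error: " + hdb.errmsg(hdb.ecode()));
                return;
            }
            // Store a record from one task...
            hdb.put("lines-processed", "42");
            // ...and fetch it from another.
            System.out.println(hdb.get("lines-processed"));
            hdb.close();
        }
    }

If tasks on different nodes need concurrent access, fronting the file with
Tokyo Tyrant (the network server for Tokyo Cabinet) is the usual route.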

On Thu, Oct 1, 2009 at 5:42 AM, Jakob Homan <jho...@yahoo-inc.com> wrote:

> Raakhi-
>   Guilherme is correct. Each mapper (and reducer) runs independently and
> communication between them is neither provided for nor encouraged. You may wish
> to look into the DistributedCache (http://wiki.apache.org/hadoop/FAQ#A8,
> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#DistributedCache)
> for providing data that are available to all the tasks.
>
> Jakob
> Hadoop at Yahoo!
>
>
> Rakhi Khatwani wrote:
>
>> Hi,
>>        I am writing a map-reduce program which reads a file from HDFS and
>> stores the contents in a static map (declared and initialized before
>> executing the map-reduce job). However, after running the job, my map
>> returns 0 elements. Is there any way I can make the data in the map
>> persist?
>> Regards,
>> Raakhi Khatwani
>>
>>
>


-- 
Thanks & Regards,
Chandra Prakash Bhagtani,
