Re: What are the methods to share dynamic data among mappers/reducers?

Vinod Kumar Vavilapalli Thu, 02 Jan 2014 10:23:09 -0800

The framework does not natively support that, but you can do it yourself by 
using a shared service (e.g., HDFS files or ZooKeeper nodes) that all mappers 
and reducers have access to.
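
As a rough illustration of the shared-file option, here is a minimal Python 
sketch that uses the local filesystem as a stand-in for HDFS. The names 
`publish` and `SharedFileReader` and the JSON file layout are assumptions made 
up for this example, not part of Hadoop. The writer replaces the file in one 
step, and each task re-reads it only when it has changed.

```python
import json
import os
import tempfile


def publish(path, data):
    """Writer side (hypothetical helper): write a temp file, then rename it
    over the target so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
    # os.replace swaps the new file in as a single step on POSIX filesystems.
    os.replace(tmp_path, path)


class SharedFileReader:
    """Task side (hypothetical helper): cache the shared data and reload it
    only when the file's metadata shows that it was replaced."""

    def __init__(self, path):
        self.path = path
        self._stamp = None
        self._data = None

    def get(self):
        st = os.stat(self.path)
        # Modification time, size and inode together act as a cheap
        # "has this file been replaced?" check.
        stamp = (st.st_mtime_ns, st.st_size, st.st_ino)
        if stamp != self._stamp:
            with open(self.path) as f:
                self._data = json.load(f)
            self._stamp = stamp
        return self._data
```

A mapper or reducer would call `get()` whenever it needs the current value. A 
real job would go through the HDFS client (or a ZooKeeper watch) instead of 
`os` and `open`.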

Can you share more details on your use case? In any case, once you make 
mappers and reducers depend on externally changing state or on each other, you 
may be breaking fundamental assumptions of MapReduce: embarrassingly parallel 
computation (breaking it limits scalability) and/or idempotency (breaking it 
affects retries during failures).
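
To make the idempotency concern concrete, here is a small Python sketch (not 
Hadoop code; `shared`, `map_live` and `make_map_pinned` are hypothetical 
names). It simulates a task attempt and a retry of the same input split while 
the shared value changes in between.

```python
# Stand-in for externally changing state (e.g. a value kept in a shared file).
shared = {"rate": 2}


def map_live(record):
    # Reads whatever the shared value is at call time.
    return record * shared["rate"]


def make_map_pinned(snapshot):
    # Captures the value once, so every attempt sees the same rate.
    rate = snapshot["rate"]

    def map_pinned(record):
        return record * rate

    return map_pinned


records = [1, 2, 3]
map_pinned = make_map_pinned(dict(shared))  # snapshot taken before the job

first_live = [map_live(r) for r in records]
first_pinned = [map_pinned(r) for r in records]

shared["rate"] = 5  # the shared state changes before the task is retried

retry_live = [map_live(r) for r in records]
retry_pinned = [map_pinned(r) for r in records]

print(first_live == retry_live)      # False: the retry produced other output
print(first_pinned == retry_pinned)  # True: the retry is repeatable
```

Pinning a snapshot keeps retries repeatable, but the tasks then no longer see 
updates made while the job runs, so this is a trade-off rather than a fix.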

Thanks,
+Vinod

On Jan 2, 2014, at 1:42 AM, sam liu <[email protected]> wrote:

> Hi,
> 
> As far as I know, the Distributed Cache copies the shared data to the 
> slaves before the job starts, and does not update it after that. 
> 
> So are there any solutions to share dynamic data among mappers/reducers?
> 
> Thanks!

