There isn't anything natively supported for that in the framework, but you can do it yourself with a shared service (e.g. HDFS files or ZooKeeper nodes) that all mappers and reducers can access.
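As a rough sketch of the shared-file variant, something like the following could work. It is only an illustration under some assumptions: SharedStateFile is a made-up helper, not a Hadoop class, and it uses the local filesystem through java.nio as a stand-in for HDFS. On a real cluster you would go through Hadoop's FileSystem API instead. The idea is to write the new value to a temporary file and then rename it into place, so tasks never read a half-written value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of Hadoop): publishes and reads a small piece
// of shared state through a file that every task can reach. The local
// filesystem is an assumption standing in for HDFS.
final class SharedStateFile {
    private final Path file;

    SharedStateFile(Path file) {
        this.file = file;
    }

    // Write the new value to a temporary sibling file, then rename it over
    // the live file so that readers never observe a partial write.
    void publish(String value) {
        try {
            Path dir = file.toAbsolutePath().getParent();
            Files.createDirectories(dir);
            Path tmp = Files.createTempFile(dir, "state", ".tmp");
            Files.write(tmp, value.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Return the latest published value, or the fallback if nothing has
    // been published yet.
    String read(String fallback) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            return fallback;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path path = Paths.get(System.getProperty("java.io.tmpdir"),
                "shared-state-demo", "threshold.txt");
        SharedStateFile state = new SharedStateFile(path);
        state.publish("42");                      // e.g. done by an external driver
        System.out.println(state.read("unset"));  // e.g. done inside a map() call
    }
}
```

Note that two tasks (or two attempts of the same task) calling read() at different times can see different values, so the output of a task may depend on when it ran.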
Can you share more details on your use case? In any case, once you make mappers and reducers depend on externally changing state or on one another, you may be breaking fundamental assumptions of MapReduce: that the computation is embarrassingly parallel (losing this limits scalability) and/or that tasks are idempotent (losing this affects retries after failures).

Thanks,
+Vinod

On Jan 2, 2014, at 1:42 AM, sam liu <[email protected]> wrote:

> Hi,
>
> As I know, the Distributed Cache will copy the shared data to the slaves
> before starting job, and won't change the shared data after that.
>
> So are there any solutions to share dynamic data among mappers/reducers?
>
> Thanks!
