On Mon, 06 Jun 2011 16:14:14 -0500, Shi Yu wrote:
> I still don't understand, in a cluster you have a shared directory to 
> all the nodes, right? Just put the configuration file in that directory 
> and load it in all the mappers, isn't that simple?
> So I still don't understand why bother DistributedCache, the only reason
> might be the shared directory is costly for network and usually has 
> storage limit.

That's exactly the problem the DistributedCache is designed for.  It
guarantees that the file is copied to any given node's local filesystem
only once.  With the approach you suggest, if there are a hundred mappers
on a given node, each of them would have to pull its own copy from the
shared directory, instead of the node making one local copy that every
mapper then reads from local disk.
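
To put rough numbers on that, here is a toy model in plain Java.  It is
not Hadoop code and does not use the DistributedCache API; the class name,
method names, and the cluster size (10 nodes, 100 mappers per node) are
all made up for illustration.  It only counts how many times the file
would cross the network under each approach.

```java
import java.util.HashSet;
import java.util.Set;

public class CopyCountSketch {

    // Shared-directory approach: every mapper reads the file from the
    // shared directory itself, so each mapper costs one network transfer.
    static int transfersWithSharedDir(int nodes, int mappersPerNode) {
        int transfers = 0;
        for (int node = 0; node < nodes; node++) {
            for (int mapper = 0; mapper < mappersPerNode; mapper++) {
                transfers++;
            }
        }
        return transfers;
    }

    // Copy-once-per-node approach (the idea behind the DistributedCache):
    // the first mapper on a node triggers the transfer, later mappers on
    // the same node reuse the copy already on local disk.
    static int transfersWithNodeCache(int nodes, int mappersPerNode) {
        Set<Integer> nodesWithLocalCopy = new HashSet<>();
        int transfers = 0;
        for (int node = 0; node < nodes; node++) {
            for (int mapper = 0; mapper < mappersPerNode; mapper++) {
                // add() returns true only the first time this node is seen.
                if (nodesWithLocalCopy.add(node)) {
                    transfers++;
                }
            }
        }
        return transfers;
    }

    public static void main(String[] args) {
        int nodes = 10;           // hypothetical cluster size
        int mappersPerNode = 100; // hypothetical mappers per node
        System.out.println("shared dir: "
                + transfersWithSharedDir(nodes, mappersPerNode));
        System.out.println("node cache: "
                + transfersWithNodeCache(nodes, mappersPerNode));
    }
}
```

With those made-up numbers the shared directory serves the file 1000
times, while the copy-once-per-node scheme transfers it 10 times.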
