Owen O'Malley wrote:

On Sep 9, 2007, at 11:18 PM, Jeff Hammerbacher wrote:

What's the DistributedCache for, in words?

It is for distributing large read-only files that need to be available to each task in the job. I've added an entry for it at the bottom of

http://wiki.apache.org/lucene-hadoop/FAQ

The answer needs more meat about how to set it up, but at least I started the entry.

We should really improve the javadoc for this and link to it. The javadoc should be good reference documentation, but currently it is not. The wiki and website should provide "user guide" style documentation, but we should rely primarily on javadoc for reference. Thus the class-level documentation in DistributedCache.java should describe how it's configured, and link to other relevant javadocs (e.g., command line programs that add files to the cache).

Doug
