On Sun, Sep 21, 2008 at 9:35 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>
> On Sep 21, 2008, at 2:05 PM, David Hall wrote:
>
>> (New to this list)
>>
>> Hi,
>>
>> My research group is setting up a small (20-node) cluster. All of
>> these machines are linked by NFS. We have a fairly entrenched
>> codebase/development cycle, and in particular we'd like to be able to
>> access user $CLASSPATHs in the forked JVMs run by the Map and Reduce
>> tasks. However, TaskRunner.java (http://tinyurl.com/4enkg4) seems to
>> disallow this by specifying its own.
>>
>
> Using jars on NFS for too many tasks might hurt if you have thousands of
> tasks, causing too much load.
>
> The better solution might be to use the DistributedCache:
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache
>
> Specifically:
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html#addArchiveToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html#addFileToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)
>
> Arun

Good point. I hadn't thought of that, but at the moment we're dealing with
barrier-to-adoption rather than efficiency. We'll have to go back to PBS if
we can't get users (read: picky PhD students) on board. I'd rather avoid
that scenario...

In the meantime, I think I figured out a hack that I'm going to try.

Thanks!
-- David

>
>> Is there any easy way to "trick" Hadoop into making these visible? If
>> not, if I were to submit a patch that would (optionally) add
>> $CLASSPATH to the forked JVMs' classpath, would it be considered?
>>
>> Thanks,
>> David Hall
>
>
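
For illustration only, the optional behaviour asked about in this thread (appending the user's $CLASSPATH to the classpath of the forked JVMs) could be sketched roughly as follows. This is not the code in TaskRunner.java and not the hack mentioned above: the class ChildClasspath, the method build, the includeUser flag, and the framework entries "job.jar" and "conf" are all hypothetical names chosen for the sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: builds the classpath string
// for a forked child JVM, optionally appending the user's CLASSPATH.
public class ChildClasspath {

    // frameworkEntries: entries the framework always puts first.
    // userClasspath:    raw CLASSPATH value, may be null.
    // includeUser:      the proposed opt-in switch (assumed name).
    public static String build(List<String> frameworkEntries,
                               String userClasspath,
                               boolean includeUser) {
        List<String> entries = new ArrayList<>(frameworkEntries);
        if (includeUser && userClasspath != null) {
            String sep = Pattern.quote(File.pathSeparator);
            for (String entry : userClasspath.split(sep)) {
                // Skip empty segments and entries already present, so the
                // framework's own entries keep their precedence.
                if (!entry.isEmpty() && !entries.contains(entry)) {
                    entries.add(entry);
                }
            }
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // Placeholder framework entries, made up for the example.
        List<String> framework = Arrays.asList("job.jar", "conf");
        String cp = build(framework, System.getenv("CLASSPATH"), true);
        // This string is what would follow "-classpath" on the child
        // JVM's command line.
        System.out.println(cp);
    }
}
```

Putting the user entries last keeps the framework's entries first on the classpath. The NFS load concern raised earlier in the thread still applies to any jars picked up this way.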
