Maybe you can use

bin/hadoop jar -libjars ${your-depends-jars} your.mapred.jar args

See details: http://hadoop.apache.org/core/docs/r0.18.1/api/org/apache/hadoop/mapred/JobShell.html

On Thu, Sep 25, 2008 at 12:26 PM, David Hall <[EMAIL PROTECTED]> wrote:

> On Sun, Sep 21, 2008 at 9:41 PM, David Hall <[EMAIL PROTECTED]> wrote:
> > On Sun, Sep 21, 2008 at 9:35 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> >>
> >> On Sep 21, 2008, at 2:05 PM, David Hall wrote:
> >>
> >>> (New to this list)
> >>>
> >>> Hi,
> >>>
> >>> My research group is setting up a small (20-node) cluster. All of
> >>> these machines are linked by NFS. We have a fairly entrenched
> >>> codebase/development cycle, and in particular we'd like to be able to
> >>> access user $CLASSPATHs in the forked jvms run by the Map and Reduce
> >>> tasks. However, TaskRunner.java (http://tinyurl.com/4enkg4) seems to
> >>> disallow this by specifying its own.
> >>>
> >>
> >> Using jars on NFS for too many tasks might hurt if you have thousands of
> >> tasks, causing too much load.
> >>
> >> The better solution might be to use the DistributedCache:
> >> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache
> >>
> >> Specifically:
> >> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html#addArchiveToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)
> >> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html#addFileToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)
> >>
> >> Arun
> >
> > Good point. I hadn't thought of that, but at the moment we're dealing
> > with barrier-to-adoption rather than efficiency. We'll have to go back
> > to PBS if we can't get users (read: picky PhD students) on board. I'd
> > rather avoid that scenario...
> >
> > In the meantime, I think I figured out a hack that I'm going to try.
>
> In case anyone's curious, the hack is to create a jar file with a
> manifest that has the Class-Path field set to all the directories and
> jars you want, and to put that in the lib/ folder of another jar, and
> pass that final jar in as the User Jar to a job.
>
> Works like a charm. :-)
>
> -- David
>
> >
> > Thanks!
> >
> > -- David
> >
> >>
> >>> Is there any easy way to "trick" hadoop into making these visible? If
> >>> not, if I were to submit a patch that would (optionally) add
> >>> $CLASSPATH to the forked jvms' classpath, would it be considered?
> >>>
> >>> Thanks,
> >>> David Hall
> >>
> >>
> >
>
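
For anyone who wants to try the manifest hack from the quoted thread, the first step (a jar that carries nothing but a Class-Path manifest) might look roughly like the sketch below. It uses only java.util.jar from the JDK. The class name ClassPathJar and the /nfs/shared/... paths are made up for illustration, and I have not checked how a given Hadoop or JDK version treats absolute paths in Class-Path, so take this as a starting point rather than a tested recipe.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClassPathJar {

    /** Writes a jar holding only a manifest whose Class-Path lists the given entries. */
    static void write(Path jar, List<String> entries) throws IOException {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the manifest is written out empty.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        // Class-Path entries are separated by spaces; directory entries end with '/'.
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", entries));
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            // Nothing else is added: the jar exists only to carry the manifest.
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical NFS locations, not taken from the thread.
        write(Paths.get("classpath.jar"),
              Arrays.asList("/nfs/shared/lib/deps.jar", "/nfs/shared/classes/"));
    }
}
```

As described in the thread, the resulting classpath.jar would then go into the lib/ folder of the jar that is submitted as the user jar for the job.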
