That's correct. I don't really want the client to have to package up every
Accumulo and ZooKeeper jar I need into the distributed cache or a fat jar
just to run jobs from a remote client, when the jars are already there on
the cluster.
I did try yarn.application.classpath, but I didn't spell out the whole
thing. On my next try I will put all of those jars in explicitly instead
of using the dir wildcards (roughly along the lines of the yarn-site.xml
sketch below the quoted thread). I will update with how it goes.

On Fri, Jan 23, 2015 at 5:19 PM, Billie Rinaldi <[email protected]> wrote:

> You have all the jars your app needs on both the servers and the client
> (as opposed to wanting Yarn to distribute them)? Then
> yarn.application.classpath should be what you need. It looks like
> /etc/hadoop/conf,/some/lib/dir/*,/some/other/lib/dir/* etc. Is that what
> you're trying?
>
> On Fri, Jan 23, 2015 at 1:56 PM, Marc Reichman <
> [email protected]> wrote:
>
>> My apologies if this is covered somewhere; I've done a lot of searching
>> and come up dry.
>>
>> I am migrating a set of applications from Hadoop 1.0.3/Accumulo 1.4.1
>> to Hadoop 2.6.0/Accumulo 1.6.1. The applications are launched by my
>> custom Java apps, using the Hadoop Tool/Configured interface setup, not
>> a big deal.
>>
>> To run MR jobs with AccumuloInputFormat/OutputFormat, in 1.0 I could
>> use tool.sh to launch the programs, which worked great for local
>> on-cluster launching. However, I also needed to launch from remote
>> hosts (maybe even Windows ones), so I would bundle a large lib dir with
>> everything I needed on the client side and fill out HADOOP_CLASSPATH in
>> hadoop-env.sh with everything I needed (basically the output of
>> accumulo classpath). This worked for remote submissions, and even local
>> ones, specifically using my Java mains to launch them without any
>> Accumulo or Hadoop wrapper scripts.
>>
>> In YARN MR 2.6 this doesn't seem to work. No matter what I do, I can't
>> get a plain Java app to have the 2.x MR Application Master pick up the
>> Accumulo items in the classpath, and my jobs fail with ClassNotFound
>> exceptions. tool.sh works just fine, but again, I need to be able to
>> submit without that environment.
>>
>> I have tried (on the cluster):
>> HADOOP_CLASSPATH in hadoop-env.sh
>> HADOOP_CLASSPATH from .bashrc
>> yarn.application.classpath in yarn-site.xml
>>
>> I don't mind using tool.sh locally, it's quite nice, but I need a
>> strategy to have the cluster "set up" so I can just launch java, set
>> the appropriate Hadoop configs for the remote fs and yarn hosts, get my
>> Accumulo connections and in/out formats set up for mapreduce, and
>> launch jobs which have Accumulo awareness.
>>
>> Any ideas?
>>
>> Thanks,
>> Marc
>>
>
>
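For reference, this is roughly the kind of entry I'm planning to try in
yarn-site.xml on the cluster nodes. It's only a sketch: the Accumulo and
ZooKeeper jar paths are placeholders for whatever "accumulo classpath"
prints on my install, and the real value would also keep the usual Hadoop
entries.

<property>
  <name>yarn.application.classpath</name>
  <!-- One comma-separated list; explicit jars instead of the default
       directory wildcards. The paths below are placeholders for my
       layout, not a verified list. -->
  <value>/etc/hadoop/conf,/usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar,/opt/accumulo-1.6.1/lib/accumulo-core.jar,/opt/accumulo-1.6.1/lib/accumulo-fate.jar,/opt/accumulo-1.6.1/lib/accumulo-trace.jar,/opt/zookeeper-3.4.6/zookeeper-3.4.6.jar</value>
</property>

If that still doesn't get picked up by the MR Application Master, I may
also try mirroring the same list in mapreduce.application.classpath in
mapred-site.xml, since as far as I can tell that property feeds the MR
AM/task classpath as well.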
