Hi Harsh,

2011/5/3 Harsh J <[email protected]>:
> Am moving this to hbase-user, since it's more relevant to HBase here
> than MR's typical job submissions.

I figured this is a generic problem in getting additional libraries
pushed along towards the task trackers.
That is why I posted it to the mr-user list.

> My reply below:
>
> On Tue, May 3, 2011 at 7:12 PM, Niels Basjes <[email protected]> wrote:
>> I've written my first very simple job that does something with hbase.
>>
>> Now when I try to submit my jar in my cluster I get this:
>>
>> [nbasjes@master ~/src/catalogloader/run]$ hadoop jar
>> catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader
>> /user/nbasjes/Minicatalog.xml
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/hadoop/hbase/HBaseConfiguration
...
> The best way to write a Job Driver for HBase would be to use its
> TableMapReduceUtil class to make it add dependent jars, prepare jobs
> with a Scan, etc. [1].
>
> Once your driver reflects the use of TableMapReduceUtil, simply do
> (assuming HBase's bin/ is on PATH as well):
> $ HADOOP_CLASSPATH=`hbase classpath` hadoop jar
> nl.basjes.catalogloader.Loader /user/nbasjes/Minicatalog.xml

Sounds good, but it also sounds like HBase has a utility to work around
an omission in the base Hadoop MR platform.
I'll give it a try.

> If you would still like to use -libjars to add in aux jars, make your
> Driver use the GenericOptionsParser class [2]. Something like:
>
> main(args) {
> parser = new GenericOptionsParser(args);
> conf = parser.getConfiguration();
> rem_args = parser.getRemainingArgs();
> // Do extra args processing if any..
> // use 'conf' for your Job, not a new instance.
> }

As far as I understand, implementing "Tool" is the way to go with Hadoop
0.20 and newer.
So my current boilerplate looks like this (snipped useless parts):

===============
public class Loader extends Configured implements Tool {

    ... SNIP: my ImportMapper class ...

    @Override
    public int run(String[] args) throws Exception {
        Configuration config = getConf();
        config.set(TableOutputFormat.OUTPUT_TABLE, "products");
        Job job = new Job(config, "Import product catalog");
        job.setJarByClass(this.getClass());

        String input = args[0];
        TextInputFormat.setInputPaths(job, new Path(input));
        job.setInputFormatClass(TextInputFormat.class);
        job.setMapperClass(ImportMapper.class);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(TableOutputFormat.class);

        job.waitForCompletion(true);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        Configuration config = HBaseConfiguration.create();
        int result = ToolRunner.run(config, new Loader(), args);
        System.exit(result);
    }
}
===============

Where did I go wrong?

-- 
Best regards,

Niels Basjes
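
The exception above is reported in thread "main", which suggests the class is missing in the JVM that runs the driver itself, before anything reaches the task trackers. One way to check that is a small stand-alone probe, launched the same way as the job, that asks the class loader whether the HBase class can be resolved. The following is only a minimal sketch under that assumption: the class name `ClasspathCheck` and the method `isLoadable` are hypothetical names introduced for illustration and are not part of Hadoop or HBase. It uses nothing beyond the Java standard library.

```java
// Hypothetical diagnostic helper (not part of Hadoop or HBase): reports whether
// a given class can be resolved on the classpath of the JVM it runs in.
final class ClasspathCheck {

    // Returns true if the named class can be found by this class's class loader.
    // The class is looked up without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for missing dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace; another fully
        // qualified class name can be passed as the first argument.
        String target = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hbase.HBaseConfiguration";

        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(target + (isLoadable(target) ? " is" : " is NOT")
                + " loadable in this JVM");
    }
}
```

If the probe reports the class as not loadable with a plain launch, but as loadable once HADOOP_CLASSPATH is set as in the quoted command, that would point at the client-side classpath rather than at the jars shipped to the tasks.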
