Hi Harsh,

2011/5/3 Harsh J <[email protected]>:
> Am moving this to hbase-user, since it's more relevant to HBase here
> than MR's typical job submissions.

I figured this was a generic problem of getting additional libraries
shipped to the task trackers, which is why I posted it to the
mr-user list.

> My reply below:

> On Tue, May 3, 2011 at 7:12 PM, Niels Basjes <[email protected]> wrote:
>> I've written my first very simple job that does something with hbase.
>>
>> Now when I try to submit my jar in my cluster I get this:
>>
>> [nbasjes@master ~/src/catalogloader/run]$ hadoop jar
>> catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader
>> /user/nbasjes/Minicatalog.xml
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/hadoop/hbase/HBaseConfiguration
...

> The best way to write a Job Driver for HBase would be to use its
> TableMapReduceUtil class to make it add dependent jars, prepare jobs
> with a Scan, etc. [1].
>
> Once your driver reflects the use of TableMapReduceUtil, simply do
> (assuming HBase's bin/ is on PATH as well):
> $ HADOOP_CLASSPATH=`hbase classpath` hadoop jar
> catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader
> /user/nbasjes/Minicatalog.xml

Sounds good, but it also sounds like HBase has a utility to work
around an omission in the base Hadoop MR platform.
I'll give it a try.
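
The NoClassDefFoundError quoted above is reported in thread "main", which
suggests that the JVM started by "hadoop jar" itself cannot see the HBase
classes. A minimal sketch for checking that, using only the JDK
(ClasspathCheck and isLoadable are hypothetical names, not part of Hadoop
or HBase):

```java
// Hypothetical helper for illustration; not part of Hadoop or HBase.
final class ClasspathCheck {

    /** Returns true if the named class can be located by this JVM's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace; pass another name to override.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hbase.HBaseConfiguration";
        System.out.println(name + (isLoadable(name) ? " is" : " is NOT")
                + " on the classpath");
    }
}
```

Running this with the same classpath as the driver, once with and once
without HADOOP_CLASSPATH set, should show whether the client side is the
part that is missing the HBase jars.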

> If you would still like to use -libjars to add in aux jars, make your
> Driver use the GenericOptionsParser class [2]. Something like:
>
> public static void main(String[] args) throws Exception {
>     GenericOptionsParser parser = new GenericOptionsParser(args);
>     Configuration conf = parser.getConfiguration();
>     String[] remainingArgs = parser.getRemainingArgs();
>     // Do extra args processing if any..
>     // Use 'conf' for your Job, not a new instance.
> }
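
To make the split between generic options and remaining arguments concrete,
here is a simplified stand-in. It is NOT Hadoop's GenericOptionsParser: it
only recognises -libjars with a comma-separated value, and the class name
LibJarsSketch and the jar names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for illustration; NOT Hadoop's GenericOptionsParser.
final class LibJarsSketch {
    final List<String> libJars = new ArrayList<>();
    final List<String> remainingArgs = new ArrayList<>();

    LibJarsSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                // The option value is a comma-separated list of jar paths.
                libJars.addAll(Arrays.asList(args[++i].split(",")));
            } else {
                // Everything else is left for the driver to interpret.
                remainingArgs.add(args[i]);
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder jar names; the input path is the one from the original command.
        LibJarsSketch parsed = new LibJarsSketch(new String[] {
            "-libjars", "hbase.jar,zookeeper.jar", "/user/nbasjes/Minicatalog.xml"
        });
        System.out.println("jars: " + parsed.libJars);
        System.out.println("remaining: " + parsed.remainingArgs);
    }
}
```

The point of the sketch is only that the driver sees the remaining
arguments, while the jar list is recorded separately for the job.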

As far as I understand, implementing "Tool" is the way to go with
Hadoop 0.20 and newer, so my current boilerplate looks like this
(irrelevant parts snipped):

===============
public class Loader extends Configured implements Tool {
... SNIP: my ImportMapper class ...

    @Override
    public int run(String[] args) throws Exception {
        Configuration config = getConf();
        config.set(TableOutputFormat.OUTPUT_TABLE, "products");
        Job job = new Job(config, "Import product catalog");
        job.setJarByClass(this.getClass());

        String input = args[0];

        TextInputFormat.setInputPaths(job, new Path(input));
        job.setInputFormatClass(TextInputFormat.class);
        job.setMapperClass(ImportMapper.class);
        job.setNumReduceTasks(0);

        job.setOutputFormatClass(TableOutputFormat.class);

        // Propagate job failure through the exit code instead of always returning 0.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration config = HBaseConfiguration.create();
        int result = ToolRunner.run(config, new Loader(), args);
        System.exit(result);
    }
}
===============

Where did I go wrong?

-- 
Kind regards,

Niels Basjes
