Hello David,
Thanks for your suggestions. I fail to see how your approach differs
from the one used in the tutorial. The -libjars option is a command-line
option of the Hadoop executable, and I do not want to call that
executable. Maybe I am missing the point. My implementation is basically
the same as your template, and using the Hadoop executable with my main
jar and the additional jars loaded via -libjars works fine.
Regards,
Martin
On 24.09.2010 17:29, David Rosenstrauch wrote:
On 09/24/2010 11:12 AM, Martin Becker wrote:
Hi James,
I am trying to avoid calling any command-line command. I want to submit
a job from within a Java application, if possible without packing any
jar file at all, though I guess that will be necessary to allow Hadoop
to load the specific classes. The tutorial definitely does not contain
any explicit Java code showing how to do this. Sorry for not stating my
problem clearly:
Right now I want to use Eclipse to submit my job using the "Run as..."
dialog. Later I want to embed that part in a Java application that
submits configured jobs to a remote Hadoop system/cluster.
Regards,
Martin
This is very do-able. (I do this now.)
Here is a skeleton for how it can be done:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class JobSubmitter implements Tool {

    // Holds the configuration handed in by ToolRunner via setConf().
    private Configuration appConf;

    public static void main(String[] args) throws Exception {
        ToolRunner.run(new Configuration(), new JobSubmitter(), args);
    }

    public JobSubmitter() {
        <your code here>
    }

    public Configuration getConf() {
        return appConf;
    }

    public void setConf(Configuration conf) {
        this.appConf = conf;
    }

    public int run(String[] args) throws Exception {
        Job job = new Job(appConf);
        Configuration jobConf = job.getConfiguration();
        jobConf.set(<your code here>);
        <your code here>
        job.submit();
        return 0;
    }
}
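For illustration, here is a hedged sketch of how the placeholder parts of
run() might be filled in. The mapper, reducer, key/value types, and the
use of args[0]/args[1] as input/output paths are hypothetical choices,
not part of the original skeleton, and need the extra imports noted in
the comments:

    // Additional imports needed at the top of JobSubmitter.java:
    // import org.apache.hadoop.fs.Path;
    // import org.apache.hadoop.io.IntWritable;
    // import org.apache.hadoop.io.Text;
    // import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    // import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public int run(String[] args) throws Exception {
        Job job = new Job(appConf);
        job.setJarByClass(JobSubmitter.class);
        job.setMapperClass(MyMapper.class);      // hypothetical mapper
        job.setReducerClass(MyReducer.class);    // hypothetical reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.submit();                            // submit without blocking
        return 0;
    }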
re: "without packing any jar file at all":
If you use Tool/ToolRunner (as we are doing above), that lets your
Hadoop app automatically handle some key command-line args. One of them
that you will use here is the -libjars argument. If you use -libjars
and specify a list of jars that contain your code, then ToolRunner
will automatically take those jars and put them in the Distributed
Cache on each task node, where they will get added to the classpath of
every map/reduce task.
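As a hedged sketch of that point: because ToolRunner passes the args
through GenericOptionsParser, -libjars can also be supplied
programmatically from plain Java (e.g. from Eclipse), without invoking
the hadoop executable. The class name, jar paths, and HDFS URIs below
are hypothetical placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.ToolRunner;

    public class SubmitFromIde {
        public static void main(String[] args) throws Exception {
            // Hypothetical local jar paths and HDFS input/output URIs.
            String[] jobArgs = {
                "-libjars", "/local/path/dep1.jar,/local/path/dep2.jar",
                "hdfs://namenode:9000/input", "hdfs://namenode:9000/output"
            };
            // ToolRunner/GenericOptionsParser consume -libjars and arrange
            // for the listed jars to go to the Distributed Cache at submit
            // time; the remaining args are handed to JobSubmitter.run().
            int exitCode = ToolRunner.run(new Configuration(),
                    new JobSubmitter(), jobArgs);
            System.exit(exitCode);
        }
    }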
HTH,
DR