On 09/25/2010 10:24 AM, Martin Becker wrote:
Hello David,

thanks a lot. Yet I want Java code to submit my application. I do not
want to mess with any kind of command line arguments or an executable,
neither Java nor Hadoop. I want to write a method that can set up and
submit a job to an arbitrary cluster, something like calling
CustomJob.submitJob(ip:port). This would be used by a GUI or another
Java application to process data.

Also easy to do.

All that the -jt and -fs parms do is eventually set appropriate values in a Configuration object. You can just as easily do this programmatically.

i.e., for -jt:

conf.set("mapred.job.tracker", "<hostname_of_job_tracker_in_remote_cluster>:<job_tracker_port_number>");

and for -fs:

conf.set("fs.default.name", "hdfs://<hostname_of_name_node_in_remote_cluster>:<name_node_port_number>");


On a side note: as for your requirement "I do not want to mess with any kind of command line arguments or an executable, neither Java nor Hadoop", I'm not sure how feasible that is because of the -libjars command line parm.

It's easy to write code to handle what the -fs and -jt command line parms do (see above). But -libjars is much more complicated: it takes the list of jars you give it, sends each one to the Hadoop DistributedCache so that it gets distributed to every node in the cluster, and then adds each one to the classpath of each map/reduce task.

Yes, I suppose you could try to write code to do that yourself, but I can't see why you would want to reinvent the wheel here. So this makes me question whether that requirement really makes sense.
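
Just to give a sense of what reimplementing it would involve, a stripped-down sketch of that logic might look roughly like this; the local jar path and the HDFS staging directory are made-up placeholders, and it skips the bookkeeping the real -libjars handling does for you:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LibJarsByHand {

    // Roughly what -libjars arranges for each jar you pass it.
    public static void shipJar(Configuration conf, String localJarPath)
            throws java.io.IOException {
        FileSystem fs = FileSystem.get(conf);
        Path src = new Path(localJarPath);
        Path dst = new Path("/tmp/libjars", src.getName());  // staging dir is an assumption

        // Copy the jar into HDFS so it can be shipped to the task nodes ...
        fs.copyFromLocalFile(src, dst);

        // ... then register it with the DistributedCache so it ends up on the
        // classpath of every map/reduce task.
        DistributedCache.addFileToClassPath(dst, conf);
    }
}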


The problem is the missing job.jar, as also described earlier. I will
start a new thread describing my problem under a more accurate subject.

Thank you,
Martin

I'm not sure what you're referring to re: job.jar.

DR
