On 09/14/2010 10:10 PM, Pete Tyler wrote:
I'm trying to figure out how to achieve the following from a Java client:
1. My app (which is a web server) starts up.
2. As part of startup, my jar file, which includes my MapReduce classes, is 
distributed to the Hadoop nodes.
3. My web app uses MapReduce to extract data without the per-job performance 
overhead of deploying a jar file via setJar() or setJarByClass().

It looks like DistributedCache has potential, but the need for commands like 
'hadoop fs -copyFromLocal ...' and for API methods like 
'.getLocalCacheArchives()' seems at odds with my scenario. Any thoughts?

-Peter

For step 2, you have two options on how to implement it:
a) call DistributedCache.addFileToClassPath(jarFileURI, conf);
b) have your app implement Tool, use ToolRunner to launch it, and specify a -libjars command-line param, which achieves the same effect as (a). See http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/util/Tool.html and http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/util/GenericOptionsParser.html#GenericOptions for details.
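For option (a), a minimal sketch might look like the following. It uploads the job jar to HDFS once at web-app startup (the programmatic equivalent of 'hadoop fs -copyFromLocal'), then adds it to the job classpath. The class name and both file paths are hypothetical, and this assumes the Hadoop 0.20 jars are on the classpath:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class JarDeployer {
    // Call once during web-app startup; reuse the same conf for subsequent jobs.
    public static void deployJobJar(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path local = new Path("/opt/webapp/lib/myjob.jar");  // hypothetical local path
        Path remote = new Path("/apps/myapp/myjob.jar");     // hypothetical HDFS path
        // Copy the jar into HDFS (delSrc=false, overwrite=true) --
        // the programmatic equivalent of 'hadoop fs -copyFromLocal'.
        fs.copyFromLocalFile(false, true, local, remote);
        // Ship the cached jar to each task node and put it on the task classpath.
        DistributedCache.addFileToClassPath(remote, conf);
    }
}
```

With the jar already in HDFS, each job only references the cached copy, avoiding a fresh jar upload per job.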
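For option (b), the Tool pattern might be sketched as below. ToolRunner runs GenericOptionsParser over the arguments, so -libjars is consumed before run() sees them; the class name and job name are made up for illustration (Hadoop 0.20 API assumed):

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ExtractJob extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already reflects -libjars and other generic options.
        Job job = new Job(getConf(), "extract");
        // ... set mapper/reducer classes and input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // e.g. invoked as: hadoop jar app.jar ExtractJob -libjars myjob.jar <in> <out>
        System.exit(ToolRunner.run(new ExtractJob(), args));
    }
}
```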

HTH,

DR
