Hi Praveen, if you look at the Job configuration you will find properties like user.name and other values that are created by substituting template values in core-default.xml and mapred-default.xml (both shipped inside the Hadoop jars). I suppose one of these (if not user.name itself) defines the submitting user. But I haven't tried it, and I am sure others know better.
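As an aside on the "which user submits" question: on later Hadoop releases (the secured 0.20.20x line and onward) the UserGroupInformation API lets a client explicitly act as a named user. The sketch below is only an illustration of that idea, not something from this thread; the user name and path are made up, and it only applies to clusters without Kerberos.

//----
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class SubmitAsHadoopUser {
    public static void main(String[] args) throws Exception {
        // Act as the "hadoop" user regardless of which OS user the app server
        // runs as. createRemoteUser() exists only on later Hadoop releases; on
        // plain 0.20.x the rough (and insecure) equivalent would presumably be
        // conf.set("hadoop.job.ugi", "hadoop,supergroup") -- untested here.
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hadoop");
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {
                Configuration conf = new Configuration(); // core-site.xml on classpath
                FileSystem fs = FileSystem.get(conf);
                // everything in here (HDFS writes, job submission) runs as "hadoop"
                fs.mkdirs(new Path("/user/hadoop/example-output"));
                return null;
            }
        });
    }
}
//----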
Why is that actually important? Why not submit as the user you are?

About submitting multiple jars: AFAIK the standard way is to submit everything in one jar.

Henning

P.S.: We are developing something based on www.z2-environment.eu that will complement Hadoop with automatic on-demand updates on the task nodes. But it's not public yet.

On Wed, 2010-11-24 at 00:10 +0100, praveen.pe...@nokia.com wrote:
> Hi Henning,
> Putting core-site.xml on the classpath worked. Thanks for the help. I still need to figure out how to submit a job as a different user than the one Hadoop is configured for.
>
> I have one more question related to job submission. Did anyone face problems running a job that involves multiple jar files? I am running a map-reduce job that references multiple jar files, and when I run the job I always get a ClassNotFoundException on any class that is not in the same jar file as the job class.
>
> I am starting the jobs from a Java application and am getting this ClassNotFoundException:
>
> java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nokia.relevancy.util.hadoop.ValueOnlyTextOutputFormat
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
>     at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
>     at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.ClassNotFoundException: com.nokia.relevancy.util.hadoop.ValueOnlyTextOutputFormat
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:247)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
>     ... 4 more
>
> Praveen
>
> ______________________________________________________________________
> From: ext Henning Blohm [mailto:henning.bl...@zfabrik.de]
> Sent: Tuesday, November 23, 2010 11:37 AM
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: Starting a Hadoop job programmatically
>
> Hi Praveen,
>
> On Tue, 2010-11-23 at 17:18 +0100, praveen.pe...@nokia.com wrote:
>
> > Hi Henning,
> > Adding Hadoop's conf folder didn't fix the issue, but when I added the two properties below I was able to access the file system. However, I cannot write anything because of the different user. I have the following questions based on these experiments:
>
> Exactly. I didn't mean to add the whole folder, just the one file with those properties.
>
> > 1. How can I access HDFS or submit jobs as a different user than the one my Java app runs as? For example, the Hadoop cluster is set up for the "hadoop" user while my Java app runs as a different user. In order to run the job correctly, I have to submit it as the "hadoop" user, correct? How do I achieve that programmatically?
>
> We always run everything as the same user (now that you mention it), so I didn't know there would be a problem otherwise. I would have suspected that the submitting user doesn't matter (setting the corresponding system property would probably override it anyway).
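Regarding the ClassNotFoundException with multiple jars quoted above: besides packing everything into one jar (dependency jars can also go into a lib/ directory inside the job jar, which Hadoop puts on the task classpath), extra jars can be shipped per job through the DistributedCache. A rough sketch of that approach; the helper method name is made up, and the jar has to be copied into HDFS first.

//----
import java.io.IOException;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class JarShipping {
    // Call on the driver side before job.submit() / job.waitForCompletion().
    static void addJarToTaskClasspath(Job job, String localJar, String hdfsJar)
            throws IOException {
        FileSystem fs = FileSystem.get(job.getConfiguration());
        // the jar has to sit in HDFS before it can be distributed to the tasks
        fs.copyFromLocalFile(new Path(localJar), new Path(hdfsJar));
        DistributedCache.addFileToClassPath(new Path(hdfsJar), job.getConfiguration());
    }
}
//----

For example (paths are hypothetical): addJarToTaskClasspath(job, "/home/ppeddi/lib/relevancy-util.jar", "/tmp/job-libs/relevancy-util.jar").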
>
> > 2. A few of the jobs I am calling are provided by a library, which means I cannot add these two config properties myself. Is there any way around this other than replicating the library's job submission code locally?
>
> Yes, I think creating a core-site.xml file as below, putting it into <folder> (any folder you like will do) and adding <folder> to your classpath when submitting should do the trick (as I tried to explain before, and if I am not mistaken).
>
> > Thanks
> > Praveen
>
> Good luck,
> Henning
>
> > ____________________________________________________________________
> >
> > From: ext Henning Blohm [mailto:henning.bl...@zfabrik.de]
> > Sent: Tuesday, November 23, 2010 3:24 AM
> > To: mapreduce-user@hadoop.apache.org
> > Subject: RE: Starting a Hadoop job programmatically
> >
> > Hi Praveen,
> >
> > In order to submit it to the cluster, you just need to have a core-site.xml on your classpath (or load it explicitly into your configuration object) that looks (at least) like this:
> >
> > <configuration>
> >   <property>
> >     <name>fs.default.name</name>
> >     <value>hdfs://${name:port of namenode}</value>
> >   </property>
> >
> >   <property>
> >     <name>mapred.job.tracker</name>
> >     <value>${name:port of jobtracker}</value>
> >   </property>
> > </configuration>
> >
> > If you want to wait for each job's completion, you can use job.waitForCompletion(true) rather than job.submit().
> >
> > Good luck,
> > Henning
> >
> > On Mon, 2010-11-22 at 23:40 +0100, praveen.pe...@nokia.com wrote:
> >
> > > Hi, thanks for your reply. In my case I have a driver that calls multiple jobs one after the other. I am using the following code to submit each job, but it uses the local Hadoop jar files on the classpath; it is not submitting the job to the Hadoop cluster. I thought I would need to specify where the Hadoop master is located on the remote machine. An example command I use from the command line is below, but I need to do the same thing from my Java program.
> > >
> > > $ hadoop-0.20.2/bin/hadoop jar /home/ppeddi/dev/Merchandising/RelevancyEngine/relevancy-core/dist/Relevancy4.jar -i raw-downloads-input-10K -o reco-patterns-output-10K-1S -k 100 -method mapreduce -g 500 -regex '[\ ]' -s 5
> > >
> > > I hope I made the question clear now.
> > > Praveen
> > >
> > > __________________________________________________________________
> > >
> > > From: ext Henning Blohm [mailto:henning.bl...@zfabrik.de]
> > > Sent: Monday, November 22, 2010 5:07 PM
> > > To: mapreduce-user@hadoop.apache.org
> > > Subject: Re: Starting a Hadoop job programmatically
> > >
> > > Hi Praveen,
> > >
> > > we do. We are using the "new" org.apache.hadoop.mapreduce.* API in Hadoop 0.20.2.
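As a side note on the core-site.xml snippet quoted above: if putting the file on the classpath is not an option, the same two properties can be set directly on the Configuration object instead. A minimal fragment, with placeholder host names and ports:

//----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// host:port values are placeholders for the actual namenode / jobtracker
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://namenode-host:9000");
conf.set("mapred.job.tracker", "jobtracker-host:9001");
Job job = new Job(conf, "some job name");
//----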
> > >
> > > Essentially the flow is:
> > >
> > > //----
> > > // assuming all config is on the class path
> > > Configuration config = new Configuration();
> > > Job job = new Job(config, "some job name");
> > >
> > > // set in/out types
> > > job.setInputFormatClass(...);
> > > job.setOutputFormatClass(...);
> > > job.setMapOutputKeyClass(...);
> > > job.setMapOutputValueClass(...);
> > > job.setOutputKeyClass(...);
> > > job.setOutputValueClass(...);
> > >
> > > // set implementations as required
> > > job.setMapperClass(<your mapper implementation class object>);
> > > job.setCombinerClass(<your combiner implementation class object>);
> > > job.setReducerClass(<your reducer implementation class object>);
> > >
> > > // set the jar... this is often the tricky part!
> > > job.setJarByClass(<some class that is in the job jar and not elsewhere higher up on the class path>);
> > >
> > > job.submit();
> > > //----
> > >
> > > Hope I didn't forget anything.
> > >
> > > Note: You need to give Hadoop something it can launch in a JVM that has no more than the Hadoop jars and whatever else you configured statically in your hadoop-env.sh script.
> > >
> > > Can you describe your scenario in more detail?
> > >
> > > Henning
> > >
> > > On Monday, 2010-11-22 at 22:39 +0100, praveen.pe...@nokia.com wrote:
> > >
> > > > Hi all,
> > > > I am trying to figure out how I can start a Hadoop job programmatically from my Java application running in an app server. I was able to run my map-reduce job using the hadoop command from the Hadoop master machine, but my goal is to run the same job from my Java program (running on a different machine than the master). I googled and could not find a solution for this. All the examples I have seen so far use Hadoop from the command line to start a job.
> > > > 1. Has anyone invoked a Hadoop job from a Java application?
> > > > 2. If so, could someone provide some sample code?
> > > >
> > > > Thanks
> > > > Praveen
> > >
> > > Henning Blohm
> > >
> > > ZFabrik Software KG
> > >
> > > henning.bl...@zfabrik.de
> > > www.z2-environment.eu
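Pulling the pieces of this thread together, a self-contained driver might look roughly like the sketch below. It simply follows the flow Henning outlined; the mapper, reducer, host names and input/output paths are placeholders rather than anything from Praveen's actual job.

//----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class RemoteSubmitDriver {

    // Trivial mapper/reducer so the example compiles and runs; replace with
    // real implementations.
    public static class PassThroughMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        protected void map(LongWritable key, Text value, Context ctx)
                throws java.io.IOException, InterruptedException {
            ctx.write(new Text(Long.toString(key.get())), value);
        }
    }

    public static class PassThroughReducer
            extends Reducer<Text, Text, Text, Text> {
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws java.io.IOException, InterruptedException {
            for (Text v : values) {
                ctx.write(key, v);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Cluster addresses are placeholders; alternatively keep them in a
        // core-site.xml on the classpath as described earlier in the thread.
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode-host:9000");
        conf.set("mapred.job.tracker", "jobtracker-host:9001");

        Job job = new Job(conf, "remote submit example");

        // in/out types
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // implementations
        job.setMapperClass(PassThroughMapper.class);
        job.setReducerClass(PassThroughReducer.class);

        // the jar containing this class is what gets shipped to the cluster
        job.setJarByClass(RemoteSubmitDriver.class);

        FileInputFormat.addInputPath(job, new Path("example-input"));
        FileOutputFormat.setOutputPath(job, new Path("example-output"));

        // blocks until the job finishes; use job.submit() to return immediately
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
//----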