On Thu, Aug 12, 2010 at 7:57 AM, David Rosenstrauch <dar...@darose.net> wrote:
> On 08/11/2010 08:08 PM, Aaron Kimball wrote:
>>
>> On Wed, Aug 11, 2010 at 3:13 PM, David Rosenstrauch <dar...@darose.net> wrote:
>>
>>> What's the preferred way to submit a job these days?
>>> org.apache.hadoop.mapreduce.Job.submit()? Or
>>> org.apache.hadoop.mapred.JobClient.runJob()? Or does it even matter?
>>> (i.e., is there any difference between them?)
>>>
>> If you're using the old API (e.g., you're filling out o.a.h.mapred.JobConf
>> and implementing o.a.h.mapred.Mapper), then you use JobClient.runJob(). If
>> you're using the new API (o.a.h.mapreduce.Job, o.a.h.mapreduce.Mapper),
>> then you use Job.waitForCompletion().
>>
>> You can't mix'n'match; your job has to be entirely "old style" or entirely
>> "new style." Some programs use one, some use the other.
>
> OK, so I'm not insane then. :-) That's how I thought it worked.
>
>>> On a related note, if there's actually no difference between the two
>>> methods, would anybody have any idea what could make the
>>> "mapred.job.tracker" setting on a job Configuration get ignored? (I
>>> currently have it set to "hdfs://<hadoop_job_tracker_host_name>:9001".)
>>>
>> There's a reason that's being ignored :) That is not a jobtracker address.
>> Assuming you've configured your namenode and your jobtracker on the same
>> machine, then your fs.default.name should be hdfs://hdfs.host.name:port,
>> and mapred.job.tracker should just be jt.host.name:port
>>
>> The port numbers in these two cases will be different.
>
> Hmmm ... OK. Not sure I understand why the syntax is different for those
> two settings, but I'll give that a shot and see if it fixes the problem.

It's probably because the JT has nothing to do with the HDFS protocols.
Giving an hdfs:// scheme in its URI wouldn't make sense :)

> Thanks much for the help!
>
> DR
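To make Aaron's point concrete, a minimal sketch of the two settings side by side (hostnames and ports below are placeholders, not values from this thread):

```xml
<!-- core-site.xml: fs.default.name is a full filesystem URI,
     so it carries the hdfs:// scheme. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://nn.example.com:9000</value>
</property>

<!-- mapred-site.xml: mapred.job.tracker is a bare host:port pair.
     The jobtracker speaks its own RPC protocol, not HDFS, so no
     hdfs:// scheme belongs here. -->
<property>
  <name>mapred.job.tracker</name>
  <value>jt.example.com:9001</value>
</property>
```

Note the two ports differ: one is the namenode's filesystem port, the other the jobtracker's RPC port.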
-- Harsh J www.harshj.com