Re: Adding entries to classpath

2010-08-11 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-u...@. Why do you need to create a single top-level jar? Just register each of your jars and put each in the distributed cache... however, you have 150 jars, which is a lot. Is there a way you can decrease that? I'm not sure how you do this in Pig, but in MR

Re: Adding entries to classpath

2010-08-11 Thread Ashutosh Chauhan
Adding pig-user@. Sanjay, you can do this in Pig by setting the following -D switch on the command line you use to invoke Pig: -Dpig.streaming.ship.files=myTopLevel.jar In the 0.8 release you will be able to do this from within a Pig script, like: set pig.streaming.ship.files myTopLevel.jar; Note tha

Preferred way to submit a job?

2010-08-11 Thread David Rosenstrauch
What's the preferred way to submit a job these days? org.apache.hadoop.mapreduce.Job.submit() ? Or org.apache.hadoop.mapred.JobClient.runJob()? Or does it even matter? (i.e., is there any difference between them?) I've been trying to run a job using org.apache.hadoop.mapreduce.Job.submit()

Re: Preferred way to submit a job?

2010-08-11 Thread Aaron Kimball
On Wed, Aug 11, 2010 at 3:13 PM, David Rosenstrauch wrote: > What's the preferred way to submit a job these days? > org.apache.hadoop.mapreduce.Job.submit() ? Or > org.apache.hadoop.mapred.JobClient.runJob()? Or does it even matter? (i.e., > is there any difference between them?) > > If you're u

Re: mrunit question

2010-08-11 Thread Aaron Kimball
David, Since you are directly instantiating the Mapper and Reducer (not using ReflectionUtils), you are free to call setConf() yourself before you run the test. If you're using the old API (o.a.h.mrunit): Mapper m = new Mapper(); MapDriver d = new MapDriver(m); Configuration conf = new Configura

Re: mrunit question

2010-08-11 Thread David Rosenstrauch
On 08/11/2010 08:14 PM, Aaron Kimball wrote: David, Since you are directly instantiating the Mapper and Reducer (not using ReflectionUtils), you are free to call setConf() yourself before you run the test. Sort of. What would wind up happening is that setConf would get called twice: once by

Re: Preferred way to submit a job?

2010-08-11 Thread David Rosenstrauch
On 08/11/2010 08:08 PM, Aaron Kimball wrote: On Wed, Aug 11, 2010 at 3:13 PM, David Rosenstrauch wrote: What's the preferred way to submit a job these days? org.apache.hadoop.mapreduce.Job.submit() ? Or org.apache.hadoop.mapred.JobClient.runJob()? Or does it even matter? (i.e., is there any di

Re: Preferred way to submit a job?

2010-08-11 Thread Harsh J
On Thu, Aug 12, 2010 at 7:57 AM, David Rosenstrauch wrote: > On 08/11/2010 08:08 PM, Aaron Kimball wrote: >> >> On Wed, Aug 11, 2010 at 3:13 PM, David >> Rosenstrauch wrote: >> >>> What's the preferred way to submit a job these days? >>> org.apache.hadoop.mapreduce.Job.submit() ? Or >>> org.apache

Re: Listing Hadoop Job History Statistics

2010-08-11 Thread Arun C Murthy
Moving to mapreduce-user@, bcc gene...@. There isn't a direct way. One possible option is just to use the per-job job-history file, which is on HDFS (see http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Job+Submission+and+Monitoring for info on job-history). Hope that helps. A