On 12/6/08 10:43 PM, "tim robertson" <[EMAIL PROTECTED]> wrote:

> I don't agree that this would be considered unconventional, as I have
> scenarios where this makes sense too - one file with a summary view,
> and others that are very detailed and a pass over the first one
> determines which ones to analyse properly in a second job.
> 
If you're running the first job to do just the first pass (the output of
which is the list of documents that you want to analyze properly in the
second job), then yes, this is okay (and this is what I hinted at in my
earlier mail). However, if you want to launch the second job from within the
first job itself, that would be unconventional IMO. Things may not be
deterministic - for example, take the case where a map task from the first
job launches the second job, and then that map task dies for whatever
reason. The second execution of the same task (Hadoop would launch a second
attempt) would launch the second job again, and this may not be what you
want...
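To make the safe pattern concrete, here is a minimal sketch of the driver-side approach described above. All class and method names below are hypothetical stand-ins for illustration - this is not real Hadoop API code; the two `run...Job` methods merely simulate what the first (summary) and second (detailed) jobs would do. The key point is that the decision to submit the second job lives in the client driver, never inside a map() task:

```java
import java.util.ArrayList;
import java.util.List;

public class ChainedJobsDriver {

    // Hypothetical stand-in for the first (summary) job: it returns the
    // list of records that look worth analyzing properly.
    static List<String> runSummaryJob(List<String> input) {
        List<String> selected = new ArrayList<>();
        for (String record : input) {
            if (record.startsWith("IMPORTANT")) {
                selected.add(record);
            }
        }
        return selected;
    }

    // Hypothetical stand-in for the second (detailed) job, run only on
    // the records the first job selected.
    static List<String> runDetailedJob(List<String> selected) {
        List<String> results = new ArrayList<>();
        for (String record : selected) {
            results.add("analyzed:" + record);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> input =
            List.of("IMPORTANT a", "noise b", "IMPORTANT c");

        // The driver (client side) waits for job 1, inspects its output,
        // and only then decides whether - and on what - to launch job 2.
        // Because no job is ever submitted from inside map(), a retried
        // task attempt cannot trigger a duplicate job submission.
        List<String> selected = runSummaryJob(input);
        if (!selected.isEmpty()) {
            List<String> results = runDetailedJob(selected);
            for (String r : results) {
                System.out.println(r);
            }
        }
    }
}
```

In a real Hadoop driver the same shape applies: submit the first job, block until it completes, read its output directory from HDFS, and conditionally submit the second job with that output as its input.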

> I am a novice, but it looks like the slaves know about the master
> NameNode and JobTracker (in the masters file), so I think it is
> worth trying.
> 
> Cheers,
> Tim
> 
> 
> On Sat, Dec 6, 2008 at 5:17 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
>> 
>> 
>> 
>> On 12/6/08 2:42 PM, "deng chao" <[EMAIL PROTECTED]> wrote:
>> 
>>> Hi,
>>> we have met a case and need your help.
>>> The case: in the Mapper class, named MapperA, we define a map() function,
>>> and in this map() function we want to submit another new job, named jobB.
>>> Does Hadoop support this case?
>> 
>> Although you can, the design of your application would be unconventional.
>> Please see if you can redesign your application so that it doesn't have to
>> do this. Couldn't you run some algorithm on the client side and, depending
>> on its output, submit a job? The other option you might want to consider is
>> a series of jobs where the output of one job is the input of another...
>> 
>> 
>> 
