Of course, you are quite correct.
On job submission from a map you would need to check whether the job
has already been submitted from elsewhere (e.g. by a previously killed
task attempt).

The only time I think this might be useful is to prioritise the
secondary jobs above the primary first pass; otherwise there is
probably no benefit over simply waiting for the first pass to finish.
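For what it's worth, that duplicate-submission check could be done with an
atomic marker file: the first attempt to create it wins and submits the job,
and a re-executed attempt sees it already exists and skips. Here is a minimal
sketch in plain Java, using a local file as a stand-in for an HDFS marker
path (on a real cluster you'd presumably use something like
FileSystem.createNewFile on a path both attempts agree on; the names here are
made up for illustration):

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SubmitGuard {

    // Submit jobB at most once, even if the calling task is re-executed.
    // Files.createFile is atomic create-if-absent: exactly one caller
    // creates the marker; everyone else gets FileAlreadyExistsException
    // and skips the submission.
    public static boolean submitOnce(Path marker, Runnable submitJob)
            throws IOException {
        try {
            Files.createFile(marker);
        } catch (FileAlreadyExistsException e) {
            return false; // another attempt already submitted jobB
        }
        submitJob.run();
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path marker = Files.createTempDirectory("guard").resolve("jobB.submitted");
        // First attempt submits; a re-executed attempt is a no-op.
        System.out.println(submitOnce(marker, () -> {})); // true
        System.out.println(submitOnce(marker, () -> {})); // false
    }
}
```

The same idea should carry over to HDFS since its file creation is atomic,
but I haven't tried it there.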



On Sat, Dec 6, 2008 at 6:41 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
>
>
>
> On 12/6/08 10:43 PM, "tim robertson" <[EMAIL PROTECTED]> wrote:
>
>> I don't agree that this would be considered unconventional, as I have
>> scenarios where it makes sense too - one file with a summary view,
>> and others that are very detailed, where a pass over the first one
>> determines which ones to analyse properly in a second job.
>>
> If you're running the first job to do just the first pass (the output of
> which is the list of documents that you want to analyze properly in the
> second job), then yes, this is okay (and this is what I hinted at in my
> earlier mail). However, if you want the first job itself to launch the
> second job, that would be unconventional IMO. Things may not be
> deterministic - for example, take a case where a map from the first job
> launches the second job, and then the map dies for whatever reason. The
> second execution of the same task (Hadoop would launch a second attempt)
> would launch the second job again, and this may not be what you want...
>
>> I am a novice, but it looks like the slaves know about the master
>> NameNode and JobTracker (in the masters file), so I think it is
>> worth trying.
>>
>> Cheers,
>> Tim
>>
>>
>> On Sat, Dec 6, 2008 at 5:17 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>>
>>> On 12/6/08 2:42 PM, "deng chao" <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>> we have run into a case where we need your help.
>>>> The case: in a Mapper class, named MapperA, we define a map() function,
>>>> and in this map() function we want to submit another new job, named jobB.
>>>> Does Hadoop support this case?
>>>
>>> Although you can, the design of your application would be unconventional.
>>> Please see if you can redesign your application so that it doesn't have to
>>> do this. Couldn't you run some algorithm on the client side and, depending
>>> on its output, submit a job? The other option you might want to consider is
>>> a series of jobs where the output of one job is the input of another.
>>>
>>>
>>>
>
>
>
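P.S. Devaraj's "series of jobs" suggestion is the pattern I'll probably try
first: a driver that runs the first pass to completion, then feeds its output
directory to the second job. An untested sketch against the old
org.apache.hadoop.mapred API (the driver class name and paths are made up,
and in practice you'd also set the mapper, reducer, and key/value classes):

```
// Untested sketch - runs only on a cluster, not standalone.
JobConf first = new JobConf(MyDriver.class);
first.setJobName("first-pass");
FileInputFormat.setInputPaths(first, new Path("/data/summary"));
FileOutputFormat.setOutputPath(first, new Path("/tmp/first-out"));
JobClient.runJob(first); // blocks until the first pass finishes

JobConf second = new JobConf(MyDriver.class);
second.setJobName("second-pass");
// the first job's output tells us what to analyse properly
FileInputFormat.setInputPaths(second, new Path("/tmp/first-out"));
FileOutputFormat.setOutputPath(second, new Path("/out/detailed"));
JobClient.runJob(second);
```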
