If you modify the JobConf for a running job within the context of a mapper, the changes will not propagate back to the other machines. JobConfs are serialized to XML and then distributed to the mapping nodes where they are read back into the running Java tasks. There is no "refresh" function that I am aware of.
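
For illustration, a minimal sketch assuming the old org.apache.hadoop.mapred API (class and property names here are only placeholders): each task works on its own deserialized copy of the configuration, so setting a value inside map() changes nothing anywhere else.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class CatalogMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private JobConf conf;

      // Each task reads its own copy of the JobConf, deserialized from the
      // job.xml shipped to the node when the task was launched.
      public void configure(JobConf job) {
        this.conf = job;
      }

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        // This only mutates the local in-memory copy; neither the job
        // controller nor the other tasks will ever see the change.
        conf.setBoolean("stop.processing", true);
        output.collect(new Text("seen"), value);
      }
    }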

- Aaron

Jim the Standing Bear wrote:
Thanks, Stu...  Maybe my mind is way off track - but I still sense a
problem with the mapper sending feedback to the job controller.  That
is, when a mapper has reached the terminal condition, how can it tell
the job controller to stop?

If I keep a JobConf object in the mapper, and set a property
"stop.processing" to true when a mapping task has reached the terminal
condition, will that cause synchronization problems?  There could be
other mapping tasks that still want to keep going.

I tried to find a way for the job controller to open the file in the
output path at the end of each loop iteration and read its contents, but
so far I haven't seen a way to achieve this.
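
For reference, a rough sketch of one way the control code could read an output file back through the FileSystem API (the path name and the terminal condition are placeholders, not anything from this thread):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class OutputChecker {

      // Called from the job controller after a job finishes; returns true
      // when the output says the terminal condition has been reached.
      public static boolean shouldStop(JobConf conf, Path part) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(part)));
        try {
          String line;
          while ((line = in.readLine()) != null) {
            if (line.startsWith("STOP")) {   // placeholder terminal condition
              return true;
            }
          }
        } finally {
          in.close();
        }
        return false;
      }
    }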

Does this mean I have hit a dead-end?

-- Jim



On 10/29/07, Stu Hood <[EMAIL PROTECTED]> wrote:
The iteration would take place in your control code (your 'main' method, as 
shown in the examples).

In order to prevent records from looping infinitely, each iteration would need 
to use a separate output/input directory.
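
A rough sketch of such a control loop, assuming the old JobClient/JobConf API (the directory names and the stopping test are placeholders):

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class IterativeDriver {

      public static void main(String[] args) throws Exception {
        Path input = new Path("catalogs/seed");   // placeholder seed input
        int iteration = 0;
        boolean done = false;

        while (!done) {
          JobConf conf = new JobConf(IterativeDriver.class);
          conf.setJobName("catalog-walk-" + iteration);
          // ... set mapper, reducer, key/value classes, formats, etc. ...

          Path output = new Path("catalogs/iter-" + iteration);
          FileInputFormat.setInputPaths(conf, input);
          FileOutputFormat.setOutputPath(conf, output);   // fresh dir each pass

          JobClient.runJob(conf);   // blocks until this iteration finishes

          // Placeholder stopping test: quit when the pass produced no output.
          FileSystem fs = FileSystem.get(conf);
          done = fs.getFileStatus(new Path(output, "part-00000")).getLen() == 0;

          input = output;   // this iteration's output feeds the next one
          iteration++;
        }
      }
    }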

Thanks,
Stu


-----Original Message-----
From: Jim the Standing Bear <[EMAIL PROTECTED]>
Sent: Monday, October 29, 2007 5:45pm
To: [email protected]
Subject: Re: can jobs be launched recursively within a mapper ?

thanks, Owen and David,

I also thought of making a queue so that I can push catalog names onto
the end of it, while the job control loop keeps removing items off the
queue until none are left.

However, the problem is that I don't see how I can do this within the
map/reduce context.  All the code examples are one-shot deals with no
iteration involved.

Furthermore, what David said made sense, but to avoid an infinite loop,
the code must remove the record it just read from the input file.  How
do I do that using Hadoop's fs?  Or does Hadoop take care of it
automatically?
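
One data point, hedged: HDFS files are write-once, so records can't be edited out of an existing input file in place; the usual options are to write each pass into a fresh directory or to delete a consumed path outright, roughly like this (the path is a placeholder):

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class Cleanup {

      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        FileSystem fs = FileSystem.get(conf);
        // Drop a directory a previous iteration has already consumed
        // (placeholder path; the boolean asks for a recursive delete).
        fs.delete(new Path("catalogs/iter-0"), true);
      }
    }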

-- Jim



On 10/29/07, David Balatero <[EMAIL PROTECTED]> wrote:
Aren't these questions a little advanced for a bear to be asking?
I'll be here all night...

But seriously, if your job is inherently recursive, one possible way
to do it would be to make sure that you output in the same format
that you read as input. Then you can keep feeding the output file back
into a new map/reduce job, until you hit some base case and terminate.
I've had a main method before that would kick off a bunch of jobs in a
row -- but I wouldn't really recommend starting another map/reduce job
in the scope of a running map() or reduce() method.
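
To illustrate "output in the same format that you read as input", a minimal sketch with the old mapred API (the format choice is just one option): TextOutputFormat writes key<TAB>value lines and KeyValueTextInputFormat reads them back the same way, so one job's output directory can be handed to the next job as its input.

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.KeyValueTextInputFormat;
    import org.apache.hadoop.mapred.TextOutputFormat;

    public class SameFormatJob {

      // Configure a job so its output can be re-read as the next job's input.
      static JobConf buildConf() {
        JobConf conf = new JobConf(SameFormatJob.class);
        conf.setInputFormat(KeyValueTextInputFormat.class);   // reads key<TAB>value
        conf.setOutputFormat(TextOutputFormat.class);         // writes key<TAB>value
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        return conf;
      }
    }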

- David


On Oct 29, 2007, at 2:17 PM, Jim the Standing Bear wrote:

then
--
--------------------------------------
Standing Bear Has Spoken
--------------------------------------




