Thank you for all the help. I think I am beginning to get a clearer picture of Hadoop. I will try the file solution.
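Just to check that I understood Stu and David correctly, here is roughly the control loop I have in mind. This is an untested sketch against the plain org.apache.hadoop.mapred API (exact method names may differ a bit depending on the Hadoop release); CatalogCrawl, CatalogMapper, and the catalogs/... paths are only placeholders for my own code, and the stub mapper emits nothing so the sketch stops after one pass.

import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class CatalogCrawl {

  // Stub mapper: the real version would read one catalog entry per line,
  // list its children, and emit any sub-catalogs that still need a visit.
  public static class CatalogMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> collector, Reporter reporter)
        throws IOException {
      // catalog-expansion logic goes here (placeholder)
    }
  }

  public static void main(String[] args) throws IOException {
    Path input = new Path("catalogs/seed");   // starting list of catalogs
    int pass = 0;

    while (true) {
      Path output = new Path("catalogs/pass-" + pass);  // fresh dir per pass

      JobConf conf = new JobConf(CatalogCrawl.class);
      conf.setJobName("catalog-pass-" + pass);
      conf.setMapperClass(CatalogMapper.class);
      conf.setOutputKeyClass(Text.class);
      conf.setOutputValueClass(Text.class);
      FileInputFormat.setInputPaths(conf, input);
      FileOutputFormat.setOutputPath(conf, output);

      JobClient.runJob(conf);   // blocks until this pass finishes

      // Base case: stop as soon as a pass writes no new records.
      if (outputBytes(conf, output) == 0) {
        break;
      }

      input = output;   // the next pass reads what this pass wrote
      pass++;
    }
  }

  // Total size of the part-* files a pass produced.
  private static long outputBytes(JobConf conf, Path dir) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    long bytes = 0;
    for (FileStatus stat : fs.listStatus(dir)) {
      if (stat.getPath().getName().startsWith("part-")) {
        bytes += stat.getLen();
      }
    }
    return bytes;
  }
}

The idea is that each pass reads the previous pass's output directory and writes to its own new directory, and the loop stops once a pass produces nothing, which I think matches the "base case" David mentioned. Does that look about right?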
On 10/29/07, Aaron Kimball <[EMAIL PROTECTED]> wrote:
> If you modify the JobConf for a running job within the context of a
> mapper, the changes will not propagate back to the other machines.
> JobConfs are serialized to XML and then distributed to the mapping nodes
> where they are read back into the running Java tasks. There is no
> "refresh" function that I am aware of.
>
> - Aaron
>
> Jim the Standing Bear wrote:
> > Thanks, Stu... Maybe my mind is way off track - but I still sense a
> > problem with the mapper sending feedbacks to the job controller. That
> > is, when a mapper has reached the terminal condition, how can it tell
> > the job controller to stop?
> >
> > If I keep a JobConf object in the mapper, and set a property
> > "stop.processing" to true when a mapping task has reached the terminal
> > condition, will it cause synchronization problems? There could be
> > other mapping tasks that still wish to go on?
> >
> > I tried to find a way so that the job controller can open the file in
> > the output path at the end of the loop to read the contents; but thus
> > far, I haven't seen a way to achieve this.
> >
> > Does this mean I have hit a dead-end?
> >
> > -- Jim
> >
> > On 10/29/07, Stu Hood <[EMAIL PROTECTED]> wrote:
> >
> >> The iteration would take place in your control code (your 'main' method,
> >> as shown in the examples).
> >>
> >> In order to prevent records from looping infinitely, each iteration would
> >> need to use a separate output/input directory.
> >>
> >> Thanks,
> >> Stu
> >>
> >> -----Original Message-----
> >> From: Jim the Standing Bear <[EMAIL PROTECTED]>
> >> Sent: Monday, October 29, 2007 5:45pm
> >> To: [email protected]
> >> Subject: Re: can jobs be launched recursively within a mapper ?
> >>
> >> thanks, Owen and David,
> >>
> >> I also thought of making a queue so that I can push catalog names to
> >> the end of it, while the job control loop keeps removing items off the
> >> queue until there is no more left.
> >>
> >> However, the problem is I don't see how I can do so within the
> >> map/reduce context. All the code examples are one-shot deals and
> >> there is no iteration involved.
> >>
> >> Furthermore, what David said made sense, but to avoid infinite loop,
> >> the code must remove the record it just read from the input file. How
> >> do I do that using hadoop's fs? or does hadoop take care of it
> >> automatically?
> >>
> >> -- Jim
> >>
> >> On 10/29/07, David Balatero <[EMAIL PROTECTED]> wrote:
> >>
> >>> Aren't these questions a little advanced for a bear to be asking?
> >>> I'll be here all night...
> >>>
> >>> But seriously, if your job is inherently recursive, one possible way
> >>> to do it would be to make sure that you output in the same format
> >>> that you input. Then you can keep re-reading the outputted file back
> >>> into a new map/reduce job, until you hit some base case and you
> >>> terminate. I've had a main method before that would kick off a bunch
> >>> of jobs in a row -- but I wouldn't really recommend starting another
> >>> map/reduce job in the scope of a running map() or reduce() method.
> >>>
> >>> - David
> >>>
> >>> On Oct 29, 2007, at 2:17 PM, Jim the Standing Bear wrote:
> >>>
> >>>> then
> >>>
> >>
> >> --
> >> --------------------------------------
> >> Standing Bear Has Spoken
> >> --------------------------------------
> >
> >

--
--------------------------------------
Standing Bear Has Spoken
--------------------------------------
