Ah ok.. Then I think you'll have to fire separate jobs. But they can all be fired from inside one parent job - the method I explained earlier. Try that out...
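
Roughly, the parent class ends up looking something like the sketch below. Just an illustration, not code from this thread - it assumes the new org.apache.hadoop.mapreduce API in 0.20, and the class names (IterativeDriver, PageRankMapper, PageRankReducer) and the ranks/iterN paths are made up, so swap in your own:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {

    // Placeholder mapper - the real PageRank logic would go here.
    public static class PageRankMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text("line"), value);   // pass-through stub
        }
    }

    // Placeholder reducer - the real rank aggregation would go here.
    public static class PageRankReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text v : values) {
                context.write(key, v);                // pass-through stub
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String input = "ranks/iter0";                 // hypothetical starting input directory
        int iterations = 10;

        for (int i = 1; i <= iterations; i++) {
            String output = "ranks/iter" + i;         // each pass writes a fresh directory

            Job job = new Job(new Configuration(), "pagerank-iteration-" + i);
            job.setJarByClass(IterativeDriver.class);
            job.setMapperClass(PageRankMapper.class);
            job.setReducerClass(PageRankReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            FileInputFormat.addInputPath(job, new Path(input));
            FileOutputFormat.setOutputPath(job, new Path(output));

            // Block until this pass finishes; its output becomes the next pass's input.
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
            input = output;
        }
    }
}

The waitForCompletion(true) call is what keeps the iterations sequential - each job finishes writing its output before the next one reads it. You still pay the per-job startup and cleanup overhead for every iteration; the parent class just drives the 10 submissions for you instead of you launching them by hand.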
On Fri, Sep 4, 2009 at 12:22 PM, Boyu Zhang <[email protected]> wrote:

> Yes, the output of the first iteration is the input of the second iteration.
> Actually, I am trying the PageRank problem. In the algorithm, you have to
> run several iterations, each using the output of the previous iteration as
> input and producing the output for the next.
>
> It is not a real-life application, I just want to try some applications
> with iterations. Thanks a lot!
>
> Boyu
>
> On Fri, Sep 4, 2009 at 2:51 PM, Amandeep Khurana <[email protected]> wrote:
>
> > Wait.. Why are you using the same mapper and reducer and calling it 10
> > times? Is the output of the first iteration being input into the second
> > one? What are these jobs doing? Tell a bit more about that. There might
> > be a way by which you can club some jobs together into one job and
> > reduce the overheads...
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> > On Fri, Sep 4, 2009 at 11:48 AM, Boyu Zhang <[email protected]> wrote:
> >
> > > Dear Amandeep,
> > >
> > > Thanks for the fast reply. I will try the method you mentioned.
> > >
> > > In my understanding, when a job is submitted, there will be a separate
> > > Java process in the jobtracker responsible for that job, and there will
> > > be an initialization and cleanup cost for each job. If every iteration
> > > is a new job, they will be created sequentially by the jobtracker. Say
> > > there are 10 iterations in my code, then there will be 10 jobs
> > > submitted to the jobtracker. I am just wondering whether there is a way
> > > to submit 1 job but run 10 iterations, since they are using the same
> > > mapper and reducer classes. That is basically why I think they are
> > > costly; maybe there is something that I misunderstood. I hope you could
> > > share it with me if I am wrong.
> > >
> > > Again, thanks a lot for replying!
> > >
> > > Boyu
> > >
> > > On Fri, Sep 4, 2009 at 2:39 PM, Amandeep Khurana <[email protected]> wrote:
> > >
> > > > You can create different mapper and reducer classes and create
> > > > separate job configs for them. You can pass these different configs
> > > > to the Tool object in the same parent class... But they will
> > > > essentially be different jobs being called together from inside the
> > > > same Java parent class.
> > > >
> > > > Why do you say it costs a lot? What's the issue?
> > > >
> > > > Amandeep Khurana
> > > > Computer Science Graduate Student
> > > > University of California, Santa Cruz
> > > >
> > > > On Fri, Sep 4, 2009 at 11:36 AM, Boyu Zhang <[email protected]> wrote:
> > > >
> > > > > Dear All,
> > > > >
> > > > > I am using Hadoop 0.20.0. I have an application that needs to run
> > > > > map-reduce functions iteratively. Right now, the way I am doing
> > > > > this is to create a new Job for each pass of the map-reduce. That
> > > > > seems to cost a lot. Is there any way to run map-reduce functions
> > > > > iteratively in one Job?
> > > > >
> > > > > Thanks a lot for your time!
> > > > >
> > > > > Boyu Zhang (Emma)
