Your application logic is likely stuck in a loop.
On Sat, Apr 13, 2013 at 12:47 PM, Chris Hokamp <[email protected]> wrote:
>> When you say "never progresses", do you see the MR framework kill it
>> automatically after 10 minutes of inactivity or does it never ever
>> exit?
>
> The latter -- it never exits. Killing it manually seems like a good option
> for now. We already have mapred.max.map.failures.percent set to a
> non-zero value, but because the task never fails, this never comes into
> effect.
>
> Thanks for the help,
> Chris
>
>
> On Sat, Apr 13, 2013 at 5:00 PM, Harsh J <[email protected]> wrote:
>
>> When you say "never progresses", do you see the MR framework kill it
>> automatically after 10 minutes of inactivity or does it never ever
>> exit?
>>
>> You can lower the timeout period on tasks via mapred.task.timeout set
>> in msec. You could also set mapred.max.map.failures.percent to a
>> non-zero value to allow that much percentage of tasks to fail without
>> also marking the whole job as a failure.
>>
>> If the task itself does not get killed by the framework due to
>> inactiveness, try doing a hadoop job -fail-task on its attempt ID
>> manually.
>>
>> On Sat, Apr 13, 2013 at 8:45 PM, Chris Hokamp <[email protected]> wrote:
>> > Hello,
>> >
>> > We have a job where all mappers finish except for one, which always
>> > hangs at the same spot (i.e. reaches 49%, then never progresses).
>> >
>> > This is likely due to a bug in the wiki parser in our Pig UDF. We can
>> > afford to lose the data this mapper is working on if it would allow the
>> > job to finish. Question: is there a hadoop configuration parameter
>> > similar to mapred.skip.map.max.skip.records that would let us skip a map
>> > that doesn't progress after X amount of time? Any other possible
>> > workarounds for this case would also be useful.
>> >
>> > We are currently using hadoop 1.1.0 and Pig 0.10.1.
>> >
>> > Thanks,
>> > Chris
>>
>> --
>> Harsh J
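
For reference, a minimal sketch of the workarounds discussed above, assuming a
Hadoop 1.x command line where -D properties are passed before the script
argument (they could equally go in mapred-site.xml or via "set" inside the Pig
script). The timeout value, failure percentage, script name, and attempt ID
below are illustrative placeholders, not values from this thread:

    # Lower the task inactivity timeout to 5 minutes and allow up to 1% of
    # map tasks to fail without failing the whole job (example values only).
    pig -Dmapred.task.timeout=300000 \
        -Dmapred.max.map.failures.percent=1 \
        myscript.pig

    # If the framework never kills the hung attempt, fail it by hand using
    # its attempt ID from the JobTracker UI (placeholder ID shown).
    hadoop job -fail-task attempt_201304131247_0001_m_000042_0

With mapred.max.map.failures.percent set, the manually failed attempt counts
against that budget instead of failing the job outright, which is what lets
the rest of the job complete.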
