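In a FairScheduler environment, especially one where max-running-job
limits are configured, it is recommended to put the Oozie launcher
jobs in a different pool from the one that does the actual work (this
applies to actions that launch other MR jobs). Otherwise the launchers
can occupy the pool's running-job slots while the MR jobs they submit
wait behind them.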
If your scheduler is configured to pick up ${user.name} automatically
as the pool name, then your Oozie launcher config must set the
explicit pool name property, which overrides that default:

  oozie.launcher.mapred.fairscheduler.pool=launcherpoolname
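
As a rough sketch of where that property might go, an action's
configuration block in workflow.xml could look like the following. The
action name and the "end"/"fail" transition targets are hypothetical,
and ${jobTracker}/${nameNode} are assumed to come from your
job.properties. Only the launcher pool is set here, on the assumption
that the MR job itself should keep landing in the user-based pool.

```xml
<!-- Sketch of a map-reduce action in workflow.xml. "my-mr-action",
     "end" and "fail" are hypothetical names used for illustration. -->
<action name="my-mr-action">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- The oozie.launcher. prefix targets the launcher job -->
      <property>
        <name>oozie.launcher.mapred.fairscheduler.pool</name>
        <value>launcherpoolname</value>
      </property>
      <!-- No pool is set for the MR job itself, so (assuming the
           scheduler maps ${user.name} to a pool) it stays in the
           submitting user's pool -->
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="fail"/>
</action>
```
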
Your target pool for launchers can still carry limits, but those
limits should no longer deadlock your actual MR execution (and once
that finishes, the launcher exits anyway).
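
For illustration, a launcher pool with its own limit might be declared
in the Fair Scheduler allocation file roughly like this. The value 10
is an arbitrary assumption, not a recommendation; size it to your
cluster.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Dedicated pool for Oozie launcher jobs; the limit of 10 is an
       arbitrary illustration -->
  <pool name="launcherpoolname">
    <maxRunningJobs>10</maxRunningJobs>
  </pool>
</allocations>
```
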
Does this help, Matt?
On Wed, Nov 7, 2012 at 2:06 AM, Matt Goeke <[email protected]> wrote:
> All,
>
> I sent an email about this a while ago (deadlock in Oozie due to launcher
> oversubscription) and we were able to avoid the situation temporarily by
> staggering our coordinators in groups. We are now at a point where the
> overhead of staggering the pools / the cost of maintaining that scheduling
> structure is too high. I know I could avoid this situation if we had a
> larger mapper pool but this is not possible at the moment with the
> available hardware.
>
> After finding a blog post that references submitting the launcher jobs to a
> separate queue (
> http://downright-amazed.blogspot.com/2012/02/configure-oozies-launcher-job.html)
> I became curious whether this could alleviate our problems even if we are
> using user-based pools in the fair scheduler.
>
> Does anyone have any experience with this or know if this will work? What
> is the practical differentiation of specifying a queue for Oozie when I am
> being directed to a pool already?
>
> --
> Matt
--
Harsh J