[ https://issues.apache.org/jira/browse/OOZIE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682423#comment-14682423 ]
Rohini Palaniswamy commented on OOZIE-2329:
-------------------------------------------
A few comments:
- Change oozie.action.yarn.am.restart.kill.oldjob to
oozie.action.launcher.am.restart.kill.childjobs
- I think we can skip the log message, as LauncherMainHadoopUtils already
logs "Could not find Yarn tags property".
- Description:
Multiple instances of launcher jobs can occur due to non-work-preserving
RM recovery on RM restart, or AM recovery after crashes or loss of AM
network connectivity. This can also leave orphaned child jobs from old AM
attempts, leading to conflicting runs. This setting kills child jobs of
previous attempts using YARN application tags.
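
As a rough illustration of the rename and description suggested above, the
entry might look something like this in Hadoop-style configuration XML. The
property name and description text come from this comment; the value shown
is only an assumed placeholder, not a default agreed on in this issue.

```xml
<!-- Sketch only: the value below is an assumed placeholder, not a decided default. -->
<property>
    <name>oozie.action.launcher.am.restart.kill.childjobs</name>
    <value>false</value>
    <description>
        Multiple instances of launcher jobs can occur due to non-work-preserving
        RM recovery on RM restart, or AM recovery after crashes or loss of AM
        network connectivity. This can also leave orphaned child jobs from old AM
        attempts, leading to conflicting runs. This setting kills child jobs of
        previous attempts using YARN application tags.
    </description>
</property>
```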
> Make handling yarn restarts configurable
> ----------------------------------------
>
> Key: OOZIE-2329
> URL: https://issues.apache.org/jira/browse/OOZIE-2329
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-2329-V1.patch
>
>
> For a cluster with a lot of RUNNING jobs, this overwhelms the RM, causing a
> lot of blocked threads, because it does an ACL check on each application
> before applying the tag filter, and does so inside a synchronized block. The
> Hadoop team is looking at doing the filtering first, outside the synchronized
> block, to limit the number of applications for which the ACL check needs to
> be done, reducing the time spent in the synchronized block.