[jira] [Updated] (OOZIE-2536) Hadoop's cleanup of local directory in uber mode causing failures
[ https://issues.apache.org/jira/browse/OOZIE-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated OOZIE-2536:
--------------------------------------
    Fix Version/s: (was: 5.0.0)
                   4.3.0

Committed this to 4.3.0 as well, as it fixes hung launcher jobs in uber mode caused by a race condition.

> Hadoop's cleanup of local directory in uber mode causing failures
> -----------------------------------------------------------------
>
>                 Key: OOZIE-2536
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2536
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Blocker
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2536-1.patch
>
>
> In our environment, we faced an issue where an uberized Shell action was getting
> stuck even though the shell action had completed with status 0. Please refer to
> the attached syslog and stdout of the launcher job; partial excerpts follow.
> stdout:
> {quote}
> >>> Invoking Shell command line now >>
> Stdoutput myshellType=qmyshellUpdate
> Exit code of the Shell command 0
> <<< Invocation of Shell command completed <<<
> <<< Invocation of Main class completed <<<
> {quote}
> syslog:
> {quote}
> 2016-05-23 11:15:52,587 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Unable to delete unexpected local file/dir .action.xml.crc: insufficient permissions?
> 2016-05-23 11:15:52,588 FATAL [AsyncDispatcher event handler] org.apache.hadoop.conf.Configuration: error parsing conf propagation-conf.xml
> java.io.FileNotFoundException: /tmp/yarn-local/usercache/saley/appcache/application_1234_123/container_e01_1234_123_01_01/propagation-conf.xml (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method)
>   at java.io.FileInputStream.open(FileInputStream.java:195)
>   at java.io.FileInputStream.<init>(FileInputStream.java:138)
>   at java.io.FileInputStream.<init>(FileInputStream.java:93)
>   at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
>   at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
>   at java.net.URL.openStream(URL.java:1038)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
>   at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
>   at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
>   at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
>   at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
>   at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1031)
>   at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1251)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.getMemoryRequired(TaskAttemptImpl.java:568)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.updateMillisCounters(TaskAttemptImpl.java:1295)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createJobCounterUpdateEventTASucceeded(TaskAttemptImpl.java:1323)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.access$3500(TaskAttemptImpl.java:147)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1710)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1701)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1085)
>   at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:146)
>   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1394)
>   at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1386)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-23 11:15:52,590 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.RuntimeException: java.io.FileNotFoundException: /grid/5/tmp/yarn-local/usercache/saley/appcache/application_1234_123/container_e01_1234_123_01_01/propagation-conf.xml (No such file or

[jira] [Updated] (OOZIE-2536) Hadoop's cleanup of local directory in uber mode causing failures
[ https://issues.apache.org/jira/browse/OOZIE-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Satish Subhashrao Saley updated OOZIE-2536:
-------------------------------------------
    Attachment: OOZIE-2536-1.patch

> Hadoop's cleanup of local directory in uber mode causing failures
> -----------------------------------------------------------------
>
>                 Key: OOZIE-2536
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2536
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Blocker
>             Fix For: 5.0.0
>
>         Attachments: OOZIE-2536-1.patch

[jira] [Updated] (OOZIE-2536) Hadoop's cleanup of local directory in uber mode causing failures
[ https://issues.apache.org/jira/browse/OOZIE-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Bafna updated OOZIE-2536:
----------------------------------
    Fix Version/s: (was: 4.3.0)
                   5.0.0

> Hadoop's cleanup of local directory in uber mode causing failures
> -----------------------------------------------------------------
>
>                 Key: OOZIE-2536
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2536
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Blocker
>             Fix For: 5.0.0