[jira] [Created] (YARN-1155) RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
yeshavora created YARN-1155:
--------------------------------

Summary: RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
Key: YARN-1155
URL: https://issues.apache.org/jira/browse/YARN-1155
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
[jira] [Created] (YARN-1154) RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
yeshavora created YARN-1154:
--------------------------------

Summary: RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
Key: YARN-1154
URL: https://issues.apache.org/jira/browse/YARN-1154
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
[jira] [Updated] (YARN-1154) RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
[ https://issues.apache.org/jira/browse/YARN-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated YARN-1154:
--------------------------------
Description: RM should be able to do a reverse lookup when an NM tries to register. NM registration should fail if the lookup fails.

RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
-----------------------------------------------------------------------------------------------------

Key: YARN-1154
URL: https://issues.apache.org/jira/browse/YARN-1154
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Assignee: Xuan Gong

RM should be able to do a reverse lookup when an NM tries to register. NM registration should fail if the lookup fails.
[jira] [Updated] (YARN-1155) RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
[ https://issues.apache.org/jira/browse/YARN-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated YARN-1155:
--------------------------------
Description: RM should be able to support both hostnames and ips

RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
-------------------------------------------------------------------------------------------------------------

Key: YARN-1155
URL: https://issues.apache.org/jira/browse/YARN-1155
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Assignee: Xuan Gong

RM should be able to support both hostnames and ips
[jira] [Updated] (YARN-1155) RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
[ https://issues.apache.org/jira/browse/YARN-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated YARN-1155:
--------------------------------
Description: RM should be able to resolve both ips and host names from include and exclude files.
(was: RM should be able to support both hostnames and ips)

RM should resolve hostnames/ips in include/exclude files to support matching against both hostnames and ips
-------------------------------------------------------------------------------------------------------------

Key: YARN-1155
URL: https://issues.apache.org/jira/browse/YARN-1155
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Assignee: Xuan Gong

RM should be able to resolve both ips and host names from include and exclude files.
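A minimal sketch of the kind of resolution being asked for here, using only java.net. The class and method names are hypothetical, not the actual RM code; it simply normalizes both the include/exclude entries and the node being checked through DNS, so a host listed by IP still matches a node known by hostname and vice versa.

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: normalize include/exclude entries so hostnames and IPs match each other. */
public class HostListMatcher {

  private final Set<String> normalized = new HashSet<String>();

  /** Add an include/exclude entry, keeping both its hostname and IP forms when resolvable. */
  public void addEntry(String entry) {
    normalized.add(entry.toLowerCase());
    try {
      InetAddress addr = InetAddress.getByName(entry);
      normalized.add(addr.getHostAddress());                     // dotted-quad IP
      normalized.add(addr.getCanonicalHostName().toLowerCase()); // FQDN via reverse lookup
    } catch (UnknownHostException e) {
      // Entry does not resolve; keep only the literal string (see YARN-1057 for stricter handling).
    }
  }

  /** True if the given node (hostname or IP) matches any entry in the list. */
  public boolean contains(String node) {
    if (normalized.contains(node.toLowerCase())) {
      return true;
    }
    try {
      InetAddress addr = InetAddress.getByName(node);
      return normalized.contains(addr.getHostAddress())
          || normalized.contains(addr.getCanonicalHostName().toLowerCase());
    } catch (UnknownHostException e) {
      return false;
    }
  }
}

With this kind of normalization, an exclude entry such as 10.0.0.5 (a made-up example) would still match a NodeManager that registers as node5.example.com, provided forward and reverse DNS agree.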
[jira] [Updated] (YARN-1154) RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
[ https://issues.apache.org/jira/browse/YARN-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated YARN-1154:
--------------------------------
Description: While a new NodeManager is being added (adding the node name in the YARN include file), RM should be able to do a reverse lookup when the NM tries to register. NM registration should fail if the lookup fails.
(was: While a new NodeManager is being added by adding it in the include file, RM should be able to do a reverse lookup when the NM tries to register. NM registration should fail if the lookup fails.)

RM should do reverse lookup of NM hostname on registration and disallow registration if lookup fails
-----------------------------------------------------------------------------------------------------

Key: YARN-1154
URL: https://issues.apache.org/jira/browse/YARN-1154
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Assignee: Xuan Gong

While a new NodeManager is being added (adding the node name in the YARN include file), RM should be able to do a reverse lookup when the NM tries to register. NM registration should fail if the lookup fails.
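A rough sketch of the reverse-lookup gate proposed above, again using only java.net; the class name and the way the RM would wire this into NM registration are assumptions, not taken from this issue.

import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative check: refuse NM registration when the host cannot be reverse-resolved. */
public class NodeRegistrationCheck {

  /**
   * Returns true only if the NM's address resolves and the reverse lookup
   * yields a real hostname rather than echoing the raw IP back.
   */
  public static boolean reverseLookupSucceeds(String nmHost) {
    try {
      InetAddress addr = InetAddress.getByName(nmHost);
      String canonical = addr.getCanonicalHostName();
      // getCanonicalHostName() falls back to the textual IP when reverse DNS fails.
      return !canonical.equals(addr.getHostAddress());
    } catch (UnknownHostException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String nmHost = args.length > 0 ? args[0] : "localhost";
    if (!reverseLookupSucceeds(nmHost)) {
      // In the RM this would translate into rejecting the registration request.
      System.err.println("Rejecting registration: reverse lookup failed for " + nmHost);
    } else {
      System.out.println("Registration allowed for " + nmHost);
    }
  }
}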
[jira] [Created] (YARN-1129) Job hangs when any node is blacklisted after RM restart
yeshavora created YARN-1129:
--------------------------------

Summary: Job hangs when any node is blacklisted after RM restart
Key: YARN-1129
URL: https://issues.apache.org/jira/browse/YARN-1129
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora

When the RM restarted, one NM went bad during the restart (bad disk). The NM got blacklisted by the AM, but the RM keeps giving out containers on the same node even though the AM doesn't want them there. The AM needs to be changed to specifically blacklist the node in its requests to the RM.
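The suggested fix is for the AM to propagate its blacklist to the RM instead of only filtering containers locally. A hedged sketch of what that could look like, assuming a YARN version whose AMRMClient exposes an updateBlacklist call; everything here is illustrative, not the MapReduce AM's actual allocator code.

import java.util.Collections;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

/** Illustrative sketch: tell the RM about AM-side blacklisted nodes instead of silently discarding containers. */
public class BlacklistingAllocator {

  private final AMRMClient<ContainerRequest> amRmClient;

  public BlacklistingAllocator(Configuration conf) {
    this.amRmClient = AMRMClient.createAMRMClient();
    this.amRmClient.init(conf);
    this.amRmClient.start();
  }

  /** Ask the RM to stop offering containers on a node the AM considers bad (e.g. failing disks). */
  public void blacklistNode(String nodeHost) {
    List<String> additions = Collections.singletonList(nodeHost);
    List<String> removals = Collections.emptyList();
    // Without propagating the blacklist, the RM keeps allocating on the node and the AM keeps
    // releasing the containers, which is the hang described in this issue.
    amRmClient.updateBlacklist(additions, removals);
  }
}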
[jira] [Commented] (YARN-1057) Add mechanism to check validity of a Node to be Added/Excluded
[ https://issues.apache.org/jira/browse/YARN-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753992#comment-13753992 ]

yeshavora commented on YARN-1057:
---------------------------------

Hitesh, the use case is: when invalid hosts are added to the include/exclude file, YARN should recognize them and not add/remove the node from the cluster. YARN should not print the message below.

INFO util.HostsFileReader (HostsFileReader.java:readFileToSet(68)) - Adding invalidhost.net to the list of included hosts from /tmp/yarn.include

Ideally it should say something like java.net.UnknownHostException: invalidhost.net. I believe an RM shutdown is not needed as long as it can verify the existence of a host.

Add mechanism to check validity of a Node to be Added/Excluded
---------------------------------------------------------------

Key: YARN-1057
URL: https://issues.apache.org/jira/browse/YARN-1057
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.1.0-beta
Reporter: yeshavora
Assignee: Xuan Gong
Attachments: YARN-1057.1.patch

Yarn does not complain while passing an invalid hostname like 'invalidhost.com' inside the include/exclude node file (specified by 'yarn.resourcemanager.nodes.include-path' or 'yarn.resourcemanager.nodes.exclude-path'). Need to add a mechanism to check the validity of the hostname before including it in or excluding it from the cluster. It should throw an error/exception while adding/removing an invalid node.
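A minimal sketch of the existence check this comment asks for, applied to the set of hosts read from the include/exclude files; the class name and the point at which the RM would invoke it are assumptions.

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

/** Illustrative only: fail fast on include/exclude entries that do not resolve. */
public class HostListValidator {

  /**
   * Throws UnknownHostException for the first entry that cannot be resolved,
   * which is the behaviour asked for above instead of the
   * "Adding invalidhost.net to the list of included hosts" INFO line.
   */
  public static void validateHosts(Set<String> hosts) throws UnknownHostException {
    for (String host : hosts) {
      InetAddress.getByName(host); // throws java.net.UnknownHostException if the host does not exist
    }
  }
}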
[jira] [Created] (YARN-1090) Job does not get into Pending State
yeshavora created YARN-1090:
--------------------------------

Summary: Job does not get into Pending State
Key: YARN-1090
URL: https://issues.apache.org/jira/browse/YARN-1090
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora

When there is no resource available to run a job, the next job should go into the pending state. The RM UI should show the next job as a pending app and the pending-app counter should be incremented. Currently, the next job stays in ACCEPTED state and no AM has been assigned to it, yet the pending app count is not incremented. Running 'mapred job -status' on the next job shows job state=PREP:

$ mapred job -status job_1377122233385_0002
13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at host1/ip1
Job: job_1377122233385_0002
Job File: /ABC/.staging/job_1377122233385_0002/job.xml
Job Tracking URL : http://host1:port1/application_1377122233385_0002/
Uber job : false
Number of maps: 0
Number of reduces: 0
map() completion: 0.0
reduce() completion: 0.0
Job state: PREP
retired: false
reason for failure:
[jira] [Created] (YARN-1086) Reducer of sort job restarts from scratch partway through after RM restart
yeshavora created YARN-1086:
--------------------------------

Summary: Reducer of sort job restarts from scratch partway through after RM restart
Key: YARN-1086
URL: https://issues.apache.org/jira/browse/YARN-1086
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Priority: Blocker

Steps followed:
1) Run a sort job. As soon as it finishes all the map tasks [100% map], restart the resource manager.
2) Analyse the progress of the sort job. It starts with:
100% map 0% reduce
100% map 32% reduce
100% map 0% reduce
The reducer stays at 30% reduce for around 5-10 minutes and then starts again from scratch.

Log from failed reducer attempt:
Error: java.io.IOException: Error while reading compressed data
    at org.apache.hadoop.io.IOUtils.wrappedReadForCompressedData(IOUtils.java:174)
    at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:383)
    at org.apache.hadoop.mapred.IFile$Reader.nextRawValue(IFile.java:444)
    at org.apache.hadoop.mapred.Merger$Segment.nextRawValue(Merger.java:327)
    at org.apache.hadoop.mapred.Merger$Segment.getValue(Merger.java:309)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:533)
    at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:619)
    at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:154)
    at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
    at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:297)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:645)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:405)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.fs.FSError: java.io.IOException: Input/output error
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:177)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:209)
    at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:152)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:127)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at org.apache.hadoop.io.IOUtils.wrappedReadForCompressedData(IOUtils.java:170)
    ... 17 more
Caused by: java.io.IOException: Input/output error
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:220)
    at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.read(RawLocalFileSystem.java:110)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:171)
    ... 26 more
[jira] [Created] (YARN-1087) Succeeded job tries to restart after RM restart
yeshavora created YARN-1087:
--------------------------------

Summary: Succeeded job tries to restart after RM restart
Key: YARN-1087
URL: https://issues.apache.org/jira/browse/YARN-1087
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora
Priority: Blocker

Run a job and restart the RM just as the job finishes. It should not restart the job once it has succeeded. After the RM restart, the AM of the restarted job fails with the error below.

AM log after RM restart:
2013-08-19 17:29:21,144 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0
2013-08-19 17:29:21,145 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2013-08-19 17:29:21,146 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://host1:port1/user/ABC/.staging/job_1376933101704_0001
2013-08-19 17:29:21,156 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://host1:port1/ABC/.staging/job_1376933101704_0001/job.splitmetainfo
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1469)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1324)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1291)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:922)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:131)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1184)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:995)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1394)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1390)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1323)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://host1:port1/ABC/.staging/job_1376933101704_0001/job.splitmetainfo
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1121)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1113)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:78)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1113)
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1464)
    ... 17 more
2013-08-19 17:29:21,158 INFO [Thread-2] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a signal. Signaling RMCommunicator and JobHistoryEventHandler.
2013-08-19 17:29:21,159 WARN [Thread-2] org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:805)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1344)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Created] (YARN-1083) ResourceManager should fail when yarn.nm.liveness-monitor.expiry-interval-ms is set less than heartbeat interval
yeshavora created YARN-1083:
--------------------------------

Summary: ResourceManager should fail when yarn.nm.liveness-monitor.expiry-interval-ms is set less than heartbeat interval
Key: YARN-1083
URL: https://issues.apache.org/jira/browse/YARN-1083
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora

If 'yarn.nm.liveness-monitor.expiry-interval-ms' is set to less than the heartbeat interval, all the node managers will be added to 'Lost Nodes'. Instead, the ResourceManager should validate these properties and fail to start if the combination of such properties is invalid.
[jira] [Updated] (YARN-1083) ResourceManager should fail when yarn.nm.liveness-monitor.expiry-interval-ms is set less than heartbeat interval
[ https://issues.apache.org/jira/browse/YARN-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated YARN-1083:
--------------------------------
Affects Version/s: 2.1.0-beta

ResourceManager should fail when yarn.nm.liveness-monitor.expiry-interval-ms is set less than heartbeat interval
------------------------------------------------------------------------------------------------------------------

Key: YARN-1083
URL: https://issues.apache.org/jira/browse/YARN-1083
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora

If 'yarn.nm.liveness-monitor.expiry-interval-ms' is set to less than the heartbeat interval, all the node managers will be added to 'Lost Nodes'. Instead, the ResourceManager should validate these properties and fail to start if the combination of such properties is invalid.
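A sketch of the startup validation being proposed, using the standard Hadoop Configuration API. The heartbeat property name and the default values below are assumptions about the deployment, not values taken from this report.

import org.apache.hadoop.conf.Configuration;

/** Illustrative startup check: refuse to start the RM when the liveness expiry is shorter than the NM heartbeat. */
public class LivenessConfigValidator {

  public static void validate(Configuration conf) {
    // The first property name comes from this issue; the heartbeat property name and
    // both defaults are assumptions about the cluster config.
    long expiryMs = conf.getLong("yarn.nm.liveness-monitor.expiry-interval-ms", 600000L);
    long heartbeatMs = conf.getLong("yarn.resourcemanager.nodemanagers.heartbeat-interval-ms", 1000L);

    if (expiryMs < heartbeatMs) {
      // Failing fast here avoids every NodeManager silently landing in 'Lost Nodes'.
      throw new IllegalArgumentException(
          "yarn.nm.liveness-monitor.expiry-interval-ms (" + expiryMs
              + " ms) must not be smaller than the NM heartbeat interval (" + heartbeatMs + " ms)");
    }
  }
}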
[jira] [Created] (YARN-1084) RM restart does not work for map only job
yeshavora created YARN-1084:
--------------------------------

Summary: RM restart does not work for map only job
Key: YARN-1084
URL: https://issues.apache.org/jira/browse/YARN-1084
Project: Hadoop YARN
Issue Type: Bug
Reporter: yeshavora

A map-only job (randomwriter, randomtextwriter) restarts from scratch [0% map 0% reduce] after RM restart. It should resume from the state it was in when the RM restarted.
[jira] [Created] (YARN-1057) Add mechanism to check validity of a Node to be Added/Excluded
yeshavora created YARN-1057:
--------------------------------

Summary: Add mechanism to check validity of a Node to be Added/Excluded
Key: YARN-1057
URL: https://issues.apache.org/jira/browse/YARN-1057
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora

Yarn does not complain while passing an invalid hostname like 'invalidhost.com' inside the include/exclude node file (specified by 'yarn.resourcemanager.nodes.include-path' or 'yarn.resourcemanager.nodes.exclude-path'). Need to add a mechanism to check the validity of the hostname before including it in or excluding it from the cluster. It should throw an error/exception while adding/removing an invalid node.
[jira] [Commented] (YARN-1057) Add mechanism to check validity of a Node to be Added/Excluded
[ https://issues.apache.org/jira/browse/YARN-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737030#comment-13737030 ]

yeshavora commented on YARN-1057:
---------------------------------

By 'invalid hostname/node', I mean the name of a host/node that does not exist. Currently, YARN just adds the invalid hostname without checking its existence, like below.

INFO util.HostsFileReader (HostsFileReader.java:readFileToSet(68)) - Adding invalidhost.net to the list of included hosts from Include_Yarn_File

YARN should first confirm the host's existence and only then include/exclude it.

Add mechanism to check validity of a Node to be Added/Excluded
---------------------------------------------------------------

Key: YARN-1057
URL: https://issues.apache.org/jira/browse/YARN-1057
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.1.0-beta
Reporter: yeshavora

Yarn does not complain while passing an invalid hostname like 'invalidhost.com' inside the include/exclude node file (specified by 'yarn.resourcemanager.nodes.include-path' or 'yarn.resourcemanager.nodes.exclude-path'). Need to add a mechanism to check the validity of the hostname before including it in or excluding it from the cluster. It should throw an error/exception while adding/removing an invalid node.
[jira] [Commented] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
[ https://issues.apache.org/jira/browse/YARN-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678174#comment-13678174 ]

yeshavora commented on YARN-775:
--------------------------------

Setting yarn.nodemanager.delete.debug-delay-sec solves the issue. Closing this JIRA.

stream jobs are not cleaning the Yarn local-dirs after container is released
------------------------------------------------------------------------------

Key: YARN-775
URL: https://issues.apache.org/jira/browse/YARN-775
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
Assignee: Omkar Vinit Joshi
Fix For: 2.1.0-beta

Run a stream job:
hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE

Container dirs are not being cleaned after the stream job is completed/killed/failed.
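For reference, yarn.nodemanager.delete.debug-delay-sec delays deletion of finished containers' local dirs for debugging, so a non-zero value makes the local-dirs appear to linger. A small hedged snippet for checking the effective value with the Hadoop Configuration API (the default of 0 shown here is an assumption, not confirmed in this thread):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class DeletionDelayCheck {
  public static void main(String[] args) {
    // YarnConfiguration pulls in yarn-default.xml / yarn-site.xml from the classpath.
    Configuration conf = new YarnConfiguration();
    // A non-zero value keeps container dirs around after the job for debugging;
    // 0 (assumed default) deletes them as soon as the container is released.
    long delaySec = conf.getLong("yarn.nodemanager.delete.debug-delay-sec", 0L);
    System.out.println("yarn.nodemanager.delete.debug-delay-sec = " + delaySec);
  }
}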
[jira] [Resolved] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
[ https://issues.apache.org/jira/browse/YARN-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora resolved YARN-775.
----------------------------
Resolution: Invalid

stream jobs are not cleaning the Yarn local-dirs after container is released
------------------------------------------------------------------------------

Key: YARN-775
URL: https://issues.apache.org/jira/browse/YARN-775
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
Assignee: Omkar Vinit Joshi
Fix For: 2.1.0-beta

Run a stream job:
hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE

Container dirs are not being cleaned after the stream job is completed/killed/failed.
[jira] [Created] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
yeshavora created YARN-775:
-------------------------------

Summary: stream jobs are not cleaning the Yarn local-dirs after container is released
Key: YARN-775
URL: https://issues.apache.org/jira/browse/YARN-775
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
Fix For: 2.1.0-beta

Run a stream job:
hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE

Container dirs are not being cleaned after the stream job is completed/killed/failed.
[jira] [Commented] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
[ https://issues.apache.org/jira/browse/YARN-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677636#comment-13677636 ]

yeshavora commented on YARN-775:
--------------------------------

I was not setting yarn.nodemanager.delete.debug-delay-sec to 0. I will retest with the above property set to 0, and will also submit NM and RM logs if it reproduces.

stream jobs are not cleaning the Yarn local-dirs after container is released
------------------------------------------------------------------------------

Key: YARN-775
URL: https://issues.apache.org/jira/browse/YARN-775
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
Assignee: Omkar Vinit Joshi
Fix For: 2.1.0-beta

Run a stream job:
hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE

Container dirs are not being cleaned after the stream job is completed/killed/failed.