[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620042#comment-17620042 ] Bilwa S T commented on HADOOP-17763: I will add a testcase for this. Meanwhile [~epayne] can you please help review this patch? Thanks > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch, > HADOOP-17763.003.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420115#comment-17420115 ] Ayush Saxena commented on HADOOP-17763: --- Can you extend a test as well for this case? [~epayne] can you have a look as well? If that is the correct thing to do from MapRed/Yarn perspective > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch, > HADOOP-17763.003.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418003#comment-17418003 ] Hadoop QA commented on HADOOP-17763: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 52s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 33m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 16s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 18m 48s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 51s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | |
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417942#comment-17417942 ] Bilwa S T commented on HADOOP-17763: cc [~ayushtkn] [~epayne][~smajeti] > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch, > HADOOP-17763.003.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417690#comment-17417690 ] Hadoop QA commented on HADOOP-17763: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 28s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 11s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 18m 0s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 52s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 16s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/239/artifact/out/diff-checkstyle-hadoop-tools_hadoop-distcp.txt{color} | {color:orange} hadoop-tools/hadoop-distcp: The patch generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 0s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | |
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403099#comment-17403099 ] Niitsh Khanna commented on HADOOP-17763: Hi Team, Any ETA on this fix so that we can share the Patch with customer as they are facing this issue. Regards Nitish > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403072#comment-17403072 ] Niitsh Khanna commented on HADOOP-17763: Hi Team, Any ETA for the PATCH of this issue as some of the customers are facing this and they would definitely ask for this fix. Regards Nitish > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391964#comment-17391964 ] Bilwa S T commented on HADOOP-17763: Hi [~smajeti] I will handle checkstyle and javadoc issues. There is nothing pending from code changes > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391644#comment-17391644 ] Srinivasu Majeti commented on HADOOP-17763: --- HI [~BilwaST] , Is there something pending for committing this fix ? > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389286#comment-17389286 ] Hadoop QA commented on HADOOP-17763: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 6s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 51s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 33s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 20s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 17s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/219/artifact/out/diff-checkstyle-hadoop-tools_hadoop-distcp.txt{color} | {color:orange} hadoop-tools/hadoop-distcp: The patch generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 49s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | |
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389217#comment-17389217 ] Bilwa S T commented on HADOOP-17763: Hi [~ayushtkn] [~epayne] can you please take a look at updated patch ? > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370203#comment-17370203 ] Bilwa S T commented on HADOOP-17763: sorry for the confusion. It doesn't delete entire staging directory. it deletes folder created inside staging directory as metafolder which gets deleted. The folder which is set to DistCpConstants.CONF_LABEL_META_FOLDER gets deleted. {code:java} private Path createMetaFolderPath() throws Exception { Configuration configuration = getConf(); Path stagingDir = JobSubmissionFiles.getStagingDir( new Cluster(configuration), configuration); Path metaFolderPath = new Path(stagingDir, PREFIX + String.valueOf(rand.nextInt())); if (LOG.isDebugEnabled()) LOG.debug("Meta folder location: " + metaFolderPath); configuration.set(DistCpConstants.CONF_LABEL_META_FOLDER, metaFolderPath.toString()); return metaFolderPath; } {code} This is not same for all mapreduce jobs. In case of other MR jobs only the output dir gets deleted on AM restart. I will attach a patch > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369833#comment-17369833 ] Ayush Saxena commented on HADOOP-17763: --- {quote}Tasks fails as we use staging directory to store split files and this same directory gets deleted whenever AM relaunches. So we should avoid storing split files in staging directory like other mapreduce applications. {quote} Not very sure how MapReduce works in case of AM failure. I thought you said the entire staging directory gets deleted in case the AM is aborted. So, in that case how 'a new folder inside staging directory itself' will be saved from deletion? But, if things works as you said, feel free to go ahead updating the patch. :) > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368816#comment-17368816 ] Bilwa S T commented on HADOOP-17763: [~ayushtkn] I think we can store it in a new folder inside staging directory itself and can delete it once job completes. This way we can avoid any other issues and also this file will be cleaned up on job completion. What do you think? > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364688#comment-17364688 ] Ayush Saxena commented on HADOOP-17763: --- {quote} String fileListPathStr = context.getTargetPath() + "/fileList.seq";{quote} You are storing here in the target path. In case the delete missing option is specified CopyCommitter#deleteMissing might delete this file as well? Moreover, we have to explicitly manage the clean up of this file in all cases. In case of snapshot based distcp. this would even modify the target directory, and can potentially lead to inconsistency. > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364335#comment-17364335 ] Bilwa S T commented on HADOOP-17763: Hi [~ayushtkn] Thanks for your review comments. by default behaviour you mean using staging directory? No it won't solve this issue as we use CopyCommitter for distcp and CopyCommitter cleans up the dir. {quote}Thoughts on having a custom option for MetaFolderPath, if this is specified it can be used else the default behaviour? Will that solve your purpose? {quote} I think we just need sequence files so instead of changing the metafolder path we can just use input path passed by user for fileListPath in DistCp#getFileListingPath. In this case we would not have any issue with permissions. What do you say? > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364292#comment-17364292 ] Hadoop QA commented on HADOOP-17763: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 46s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 47s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 56s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 23m 19s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 1m 5s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} hadoop-tools/hadoop-distcp: The patch generated 0 new + 15 unchanged - 1 fixed = 15 total (was 16) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 41s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364247#comment-17364247 ] Ayush Saxena commented on HADOOP-17763: --- Thanx [~BilwaST] for the report and fix. {code:java} -metaFolder = createMetaFolderPath(); {code} I think this removal from {{createAndSubmitJob}} is gonna fetch you a bunch of test failures due to NPE. Moreover that is a public method, we can let it stay there as is and we can only tackle the path stuff here. The new path: {code:java} +Path metaFolderPath = new Path(DistCp.class.getSimpleName() + "_" ++ System.currentTimeMillis() + "_" + rand.nextInt(Integer.MAX_VALUE)); {code} I am not sure if changing the path directly from the staging directory to something in user home directory will be a safe move from compatibility point of view. If the user has some quotas set or has some certain permissions, or in case of federation, which prevents creation of a file there, we would land up in a mess. Thoughts on having a custom option for {{MetaFolderPath}}, if this is specified it can be used else the default behaviour? Will that solve your purpose? > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: HADOOP-17763.001.patch > > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed
[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364072#comment-17364072 ] Bilwa S T commented on HADOOP-17763: Tasks fails as we use staging directory to store split files and this same directory gets deleted whenever AM relaunches. So we should avoid storing split files in staging directory like other mapreduce applications. > DistCp job fails when AM is killed > -- > > Key: HADOOP-17763 > URL: https://issues.apache.org/jira/browse/HADOOP-17763 > Project: Hadoop Common > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > Job fails as tasks fail with below exception > {code:java} > 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: > attempt_1623387358383_0006_m_00_1000 - exited : > java.io.FileNotFoundException: File does not exist: > hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1863) > at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1886) > at > org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177) > | TaskAttemptListenerImpl.java:304{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org