[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2022-10-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620042#comment-17620042
 ] 

Bilwa S T commented on HADOOP-17763:


I will add a test case for this. Meanwhile, [~epayne], can you please help
review this patch? Thanks.

> DistCp job fails when AM is killed
> --
>
> Key: HADOOP-17763
> URL: https://issues.apache.org/jira/browse/HADOOP-17763
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch, HADOOP-17763.003.patch
>
>
> The job fails because its tasks fail with the exception below:
> {code:java}
> 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: attempt_1623387358383_0006_m_00_1000 - exited : java.io.FileNotFoundException: File does not exist: hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630)
>  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1863)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1886)
>  at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>  at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)
>  | TaskAttemptListenerImpl.java:304{code}
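
The trace shows only the symptom: a map task cannot open the copy listing `fileList.seq` under the job's `.staging` directory. One plausible reading of the issue title (an assumption, not confirmed anywhere in this thread) is that cleanup tied to the killed AM attempt removes that listing before a later attempt's tasks read it. The plain-Java sketch below illustrates that failure mode and a guard that defers deletion to the final attempt. It uses local temp files rather than HDFS, and every name in it (`StagingCleanupSketch`, `cleanup`, `openListing`, the attempt counters) is hypothetical and is not taken from the attached patches.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StagingCleanupSketch {

    /** Mimics a task opening the copy listing; fails like the trace if it is gone. */
    static void openListing(Path listing) throws FileNotFoundException {
        if (!Files.exists(listing)) {
            throw new FileNotFoundException("File does not exist: " + listing);
        }
    }

    /** Hypothetical guard: delete the listing only when this is the final attempt. */
    static void cleanup(Path listing, int attempt, int maxAttempts) throws IOException {
        if (attempt >= maxAttempts) {
            Files.deleteIfExists(listing);
        }
    }

    /**
     * Simulates attempt 1 being killed and then a task of attempt 2 opening
     * the listing. Returns true if the listing is still readable.
     */
    static boolean secondAttemptSucceeds(boolean guarded) throws IOException {
        Path dir = Files.createTempDirectory("_distcp-sketch");
        Path listing = Files.createFile(dir.resolve("fileList.seq"));
        try {
            if (guarded) {
                cleanup(listing, 1, 2);          // attempt 1 of 2: listing is kept
            } else {
                Files.deleteIfExists(listing);   // unconditional cleanup (assumed bug)
            }
            try {
                openListing(listing);
                return true;
            } catch (FileNotFoundException e) {
                return false;
            }
        } finally {
            // Remove the temp files created by this sketch.
            Files.deleteIfExists(listing);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unguarded cleanup, second attempt ok: " + secondAttemptSucceeds(false));
        System.out.println("guarded cleanup, second attempt ok:   " + secondAttemptSucceeds(true));
    }
}
```

Whether the real fix takes this shape depends on what the MapReduce/YARN side expects of a job committer when an AM attempt is killed, which is the question put to the reviewers in this thread.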



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-09-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420115#comment-17420115
 ] 

Ayush Saxena commented on HADOOP-17763:
---

Can you add a test for this case as well?

[~epayne], can you also take a look and confirm whether this is the correct
thing to do from the MapReduce/YARN perspective?







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-09-21 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418003#comment-17418003
 ] 

Hadoop QA commented on HADOOP-17763:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 18m 52s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 33m 55s | | trunk passed |
| +1 | compile | 0m 30s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 26s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 22s | | trunk passed |
| +1 | mvnsite | 0m 31s | | trunk passed |
| +1 | shadedclient | 17m 16s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 25s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 23s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 18m 48s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 0m 45s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 24s | | the patch passed |
| +1 | compile | 0m 23s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 23s | | the patch passed |
| +1 | compile | 0m 20s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 20s | | the patch passed |
| +1 | checkstyle | 0m 14s | | the patch passed |
| +1 | mvnsite | 0m 23s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 51s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 21s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| 

[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-09-21 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417942#comment-17417942
 ] 

Bilwa S T commented on HADOOP-17763:


cc [~ayushtkn] [~epayne] [~smajeti]







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-09-20 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417690#comment-17417690
 ] 

Hadoop QA commented on HADOOP-17763:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 26m 28s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 32m 31s | | trunk passed |
| +1 | compile | 0m 35s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 31s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 27s | | trunk passed |
| +1 | mvnsite | 0m 36s | | trunk passed |
| +1 | shadedclient | 16m 11s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 30s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 18m 0s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 0m 52s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 27s | | the patch passed |
| +1 | compile | 0m 26s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 26s | | the patch passed |
| +1 | compile | 0m 24s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 24s | | the patch passed |
| -0 | checkstyle | 0m 16s | https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/239/artifact/out/diff-checkstyle-hadoop-tools_hadoop-distcp.txt | hadoop-tools/hadoop-distcp: The patch generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) |
| +1 | mvnsite | 0m 26s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 0s | | patch has no errors when building and testing our client artifacts. |
| 

[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-08-23 Thread Niitsh Khanna (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403099#comment-17403099
 ] 

Niitsh Khanna commented on HADOOP-17763:


Hi Team,

Is there an ETA for this fix? We would like to share the patch with a
customer who is facing this issue.

Regards

Nitish







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-08-23 Thread Niitsh Khanna (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403072#comment-17403072
 ] 

Niitsh Khanna commented on HADOOP-17763:


Hi Team,

Is there an ETA for the patch for this issue? Some customers are facing it
and will definitely ask for the fix.

Regards

Nitish







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-08-02 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391964#comment-17391964
 ] 

Bilwa S T commented on HADOOP-17763:


Hi [~smajeti],
I will handle the checkstyle and javadoc issues. Nothing is pending on the
code changes.







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-08-02 Thread Srinivasu Majeti (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391644#comment-17391644
 ] 

Srinivasu Majeti commented on HADOOP-17763:
---

Hi [~BilwaST], is anything pending before this fix can be committed?







[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389286#comment-17389286
 ] 

Hadoop QA commented on HADOOP-17763:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 19m 6s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 32m 51s | | trunk passed |
| +1 | compile | 0m 31s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 29s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 24s | | trunk passed |
| +1 | mvnsite | 0m 34s | | trunk passed |
| +1 | shadedclient | 15m 33s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 27s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 31s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 17m 20s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 0m 49s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 24s | | the patch passed |
| +1 | compile | 0m 23s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 23s | | the patch passed |
| +1 | compile | 0m 22s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 22s | | the patch passed |
| -0 | checkstyle | 0m 17s | https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/219/artifact/out/diff-checkstyle-hadoop-tools_hadoop-distcp.txt | hadoop-tools/hadoop-distcp: The patch generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) |
| +1 | mvnsite | 0m 25s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 49s | | patch has no errors when building and testing our client artifacts. |
| 

[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-07-28 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389217#comment-17389217
 ] 

Bilwa S T commented on HADOOP-17763:


Hi [~ayushtkn] [~epayne],
can you please take a look at the updated patch?

> DistCp job fails when AM is killed
> --
>
> Key: HADOOP-17763
> URL: https://issues.apache.org/jira/browse/HADOOP-17763
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Attachments: HADOOP-17763.001.patch, HADOOP-17763.002.patch
>
>
> Job fails as tasks fail with below exception
> {code:java}
> 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: attempt_1623387358383_0006_m_00_1000 - exited : java.io.FileNotFoundException: File does not exist: hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630)
>  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1863)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1886)
>  at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>  at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)
>  | TaskAttemptListenerImpl.java:304{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-27 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370203#comment-17370203
 ] 

Bilwa S T commented on HADOOP-17763:


Sorry for the confusion. It doesn't delete the entire staging directory. It 
deletes only the meta folder that DistCp creates inside the staging directory, 
that is, the folder whose path is set in DistCpConstants.CONF_LABEL_META_FOLDER.
{code:java}
private Path createMetaFolderPath() throws Exception {
  Configuration configuration = getConf();
  // The meta folder is created directly under the job staging directory.
  Path stagingDir = JobSubmissionFiles.getStagingDir(
      new Cluster(configuration), configuration);
  Path metaFolderPath = new Path(stagingDir, PREFIX +
      String.valueOf(rand.nextInt()));
  if (LOG.isDebugEnabled()) {
    LOG.debug("Meta folder location: " + metaFolderPath);
  }
  // This is the path that later gets deleted.
  configuration.set(DistCpConstants.CONF_LABEL_META_FOLDER,
      metaFolderPath.toString());
  return metaFolderPath;
}
{code}

This is not the same for all MapReduce jobs. For other MR jobs, only the 
output dir gets deleted on AM restart. I will attach a patch.
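
To make the sequence concrete, here is a minimal sketch of the failure using 
plain java.nio on the local filesystem rather than Hadoop APIs. The class and 
method names are hypothetical, and cleanupMetaFolder is only a stand-in for 
whatever removes the meta folder when the first AM attempt goes away; it is not 
the actual CopyCommitter code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class MetaFolderCleanupSketch {

  /** Creates stagingDir/metaFolderName/fileList.seq, similar to the path in the stack trace. */
  static Path createMetaFolder(Path stagingDir, String metaFolderName) throws IOException {
    Path metaFolder = stagingDir.resolve(metaFolderName);
    Files.createDirectories(metaFolder);
    Files.write(metaFolder.resolve("fileList.seq"), new byte[] {1, 2, 3});
    return metaFolder;
  }

  /** Stand-in for the cleanup: removes only the meta folder, not the staging directory. */
  static void cleanupMetaFolder(Path metaFolder) throws IOException {
    try (Stream<Path> walk = Files.walk(metaFolder)) {
      // Reverse order visits children before their parent directory.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    }
  }

  /** Returns true if a (re)launched task could still open the listing file. */
  static boolean taskCanReadListing(Path metaFolder) {
    return Files.exists(metaFolder.resolve("fileList.seq"));
  }

  public static void main(String[] args) throws IOException {
    Path stagingDir = Files.createTempDirectory("staging");
    Path metaFolder = createMetaFolder(stagingDir, "_distcp-646531269");
    System.out.println("first attempt can read listing: " + taskCanReadListing(metaFolder));
    cleanupMetaFolder(metaFolder); // simulated cleanup after the first AM attempt is killed
    System.out.println("staging dir still exists: " + Files.exists(stagingDir));
    System.out.println("second attempt can read listing: " + taskCanReadListing(metaFolder));
  }
}
```

In this sketch the staging directory survives, but the second attempt no longer 
finds fileList.seq, which matches the FileNotFoundException in the description.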





[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-26 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369833#comment-17369833
 ] 

Ayush Saxena commented on HADOOP-17763:
---

{quote}Tasks fails as we use staging directory to store split files and this 
same directory gets deleted whenever AM relaunches. So we should avoid storing 
split files in staging directory like other mapreduce applications. 
{quote}
I am not very sure how MapReduce works in case of AM failure. I thought you 
said the entire staging directory gets deleted when the AM is aborted. In that 
case, how will 'a new folder inside staging directory itself' be saved from 
deletion?

But if things work as you said, feel free to go ahead and update the patch. :) 




[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-24 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368816#comment-17368816
 ] 

Bilwa S T commented on HADOOP-17763:


[~ayushtkn] I think we can store it in a new folder inside the staging 
directory itself and delete it once the job completes. This way we avoid the 
other issues, and the file is still cleaned up on job completion. What do you 
think?




[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364688#comment-17364688
 ] 

Ayush Saxena commented on HADOOP-17763:
---

{quote} String fileListPathStr = context.getTargetPath() + 
"/fileList.seq";{quote}

You are storing the file in the target path here. If the delete missing option 
is specified, CopyCommitter#deleteMissing might delete this file as well?
Moreover, we would have to explicitly manage the cleanup of this file in all 
cases. In the case of snapshot-based distcp, this would even modify the target 
directory and could potentially lead to inconsistency.
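
A rough sketch of the first concern, under the simplifying assumption that a 
delete-missing pass removes every target entry that has no counterpart in the 
source listing. The class and method below are hypothetical and use plain 
collections; they do not reproduce the real CopyCommitter#deleteMissing logic.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class DeleteMissingSketch {

  /** Returns the target-relative paths that the simplified delete-missing pass would remove. */
  static Set<String> pathsToDelete(Set<String> sourceListing, Set<String> targetListing) {
    Set<String> doomed = new TreeSet<>(targetListing);
    doomed.removeAll(sourceListing); // keep only entries present in target but not in source
    return doomed;
  }

  public static void main(String[] args) {
    Set<String> source = new TreeSet<>(Arrays.asList("a.txt", "dir/b.txt"));
    // fileList.seq was written into the target path by the job itself.
    Set<String> target = new TreeSet<>(
        Arrays.asList("a.txt", "dir/b.txt", "stale.txt", "fileList.seq"));
    System.out.println("would delete: " + pathsToDelete(source, target));
  }
}
```

Under that assumption, fileList.seq is treated like any other stale target 
entry, because nothing in the source corresponds to it.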




[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-16 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364335#comment-17364335
 ] 

Bilwa S T commented on HADOOP-17763:


Hi [~ayushtkn], thanks for your review comments.

By default behaviour, do you mean using the staging directory? No, it won't 
solve this issue, because we use CopyCommitter for DistCp and CopyCommitter 
cleans up the directory.
{quote}Thoughts on having a custom option for MetaFolderPath, if this is 
specified it can be used else the default behaviour? Will that solve your 
purpose?
{quote}

I think we just need the sequence files, so instead of changing the metafolder 
path we can use the input path passed by the user for fileListPath in 
DistCp#getFileListingPath. In this case we would not have any issue with 
permissions. What do you say?





[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364292#comment-17364292
 ] 

Hadoop QA commented on HADOOP-17763:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
46s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to 
include any new or modified tests. Please justify why no new tests are needed 
for this patch. Also please list what manual steps were performed to verify 
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
47s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 56s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 23m 
19s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m  
5s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} hadoop-tools/hadoop-distcp: 
The patch generated 0 new + 15 unchanged - 1 fixed = 15 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 41s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green}{color} | {color:green} the patch 

[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364247#comment-17364247
 ] 

Ayush Saxena commented on HADOOP-17763:
---

Thanks [~BilwaST] for the report and fix.

{code:java}
-metaFolder = createMetaFolderPath();
{code}
I think removing this from {{createAndSubmitJob}} is going to cause a bunch of 
test failures due to NPE. Moreover, that is a public method; we can leave it 
as is and tackle only the path handling here.

The new path:

{code:java}
+    Path metaFolderPath = new Path(DistCp.class.getSimpleName() + "_"
+        + System.currentTimeMillis() + "_" + rand.nextInt(Integer.MAX_VALUE));
{code}

I am not sure whether changing the path from the staging directory to 
something in the user's home directory is a safe move from a compatibility 
point of view. If the user has quotas set, lacks certain permissions, or is 
running with federation, any of which could prevent creating a file there, we 
would land in a mess. 
Thoughts on having a custom option for {{MetaFolderPath}}: if it is specified 
it is used, otherwise the default behaviour applies? Will that solve your 
purpose?
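
As a rough illustration of that idea, the fallback could look like the sketch 
below. The configuration key is hypothetical (it is not an existing DistCp 
option), and java.util.Properties stands in for Hadoop's Configuration so the 
snippet stays self-contained.

```java
import java.util.Properties;

final class MetaFolderOptionSketch {

  // Hypothetical key used only for this illustration; not an existing DistCp option.
  static final String CUSTOM_META_FOLDER_KEY = "distcp.meta.folder.custom";

  /** Uses the custom parent directory if one is configured, else the staging directory. */
  static String resolveMetaFolder(Properties conf, String stagingDir, String folderName) {
    String custom = conf.getProperty(CUSTOM_META_FOLDER_KEY);
    String parent = (custom == null || custom.trim().isEmpty()) ? stagingDir : custom.trim();
    return parent.endsWith("/") ? parent + folderName : parent + "/" + folderName;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // No custom option set: falls back to the staging directory.
    System.out.println(resolveMetaFolder(conf, "/staging-dir/user/.staging", "_distcp1"));
    // Custom option set: the meta folder is placed under the configured parent.
    conf.setProperty(CUSTOM_META_FOLDER_KEY, "/tmp/distcp-meta/");
    System.out.println(resolveMetaFolder(conf, "/staging-dir/user/.staging", "_distcp1"));
  }
}
```

Existing users would see no change unless they set the option, which is the 
compatibility property being asked for.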






[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

2021-06-15 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364072#comment-17364072
 ] 

Bilwa S T commented on HADOOP-17763:


Tasks fail because we use the staging directory to store split files, and this 
same directory gets deleted whenever the AM relaunches. So we should avoid 
storing split files in the staging directory, like other MapReduce applications.
