[ https://issues.apache.org/jira/browse/HADOOP-16049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747906#comment-16747906 ]
Steve Loughran commented on HADOOP-16049:
-----------------------------------------

checkstyle
{code}
./hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:300: private static int readBytes(ThrottledInputStream inStream, byte buf[]):71: Array brackets at illegal position.
{code}
must be: {{byte[] buf}}

The ASF license warning is from a crashed JVM:
{code}
Lines that start with ????? in the ASF License report indicate files that do not have an Apache license header:
 !????? /testptch/hadoop/hadoop-distcp/hs_err_pid2744.log
{code}

I don't see anything in the tests resembling failures, though there was a timeout. There is a warning in the logs about azure storage versions:
{code}
[WARNING] Some problems were encountered while building the effective model for org.apache.hadoop:hadoop-distcp:jar:2.10.0-SNAPSHOT
[WARNING] 'dependencyManagement.dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: com.microsoft.azure:azure-storage:jar -> version 7.0.0 vs 5.4.0 @ org.apache.hadoop:hadoop-project:2.10.0-SNAPSHOT, /testptch/hadoop/hadoop-project/pom.xml, line 1151, column 19
{code}

How about you fix the checkstyle and resubmit it -- we can see whether that timeout was a transient error or not.

> DistCp result has data and checksum mismatch when blocks per chunk > 0
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16049
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16049
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 2.9.2
>            Reporter: Kai Xie
>            Assignee: Kai Xie
>            Priority: Major
>         Attachments: HADOOP-16049-branch-2-003.patch, HADOOP-16049-branch-2-003.patch, HADOOP-16049-branch-2-004.patch
>
>
> In 2.9.2 RetriableFileCopyCommand.copyBytes,
> {code:java}
> int bytesRead = readBytes(inStream, buf, sourceOffset);
> while (bytesRead >= 0) {
>   ...
>   if (action == FileAction.APPEND) {
>     sourceOffset += bytesRead;
>   }
>   ...
>   // write to dst
>   bytesRead = readBytes(inStream, buf, sourceOffset);
> }
> {code}
> it does a positioned read, but the position (`sourceOffset` here) is never updated when blocks per chunk is set to > 0 (which always disables the append action). So for a chunk with offset != 0, it keeps copying the first few bytes again and again, causing the result to have a data & checksum mismatch.
> To reproduce this issue, in branch-2, update BLOCK_SIZE to 10240 (> default copy buffer size) in class TestDistCpSystem and run it.
> HADOOP-15292 has resolved the issue reported in this ticket in trunk/branch-3.1/branch-3.2 by not using the positioned read, but it has not been backported to branch-2 yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
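The positioned-read bug described in the quoted ticket can be modelled in a few lines. The sketch below is NOT Hadoop code: {{readAt}} is a hypothetical stand-in for ThrottledInputStream's positioned read, and the {{advance}} flag contrasts the buggy branch-2 loop (offset only advanced in the APPEND branch) with a corrected loop that advances the offset after every read.

```java
import java.util.Arrays;

// Minimal model of the copy loop in RetriableFileCopyCommand.copyBytes.
// All names here are illustrative, not Hadoop APIs.
public class ChunkCopySketch {

    // Positioned read: copy up to buf.length bytes starting at pos;
    // returns the byte count, or -1 at EOF. Stand-in for a real
    // positioned read against the source stream.
    static int readAt(byte[] src, long pos, byte[] buf) {
        if (pos >= src.length) {
            return -1; // EOF
        }
        int n = Math.min(buf.length, src.length - (int) pos);
        System.arraycopy(src, (int) pos, buf, 0, n);
        return n;
    }

    // Copy chunkLen bytes starting at chunkOffset. When advance is
    // false, sourceOffset is never updated -- the branch-2 bug when
    // blocks per chunk > 0 disables the APPEND branch.
    static byte[] copyChunk(byte[] src, long chunkOffset, int chunkLen,
                            boolean advance) {
        byte[] out = new byte[chunkLen];
        byte[] buf = new byte[4]; // small buffer to force several reads
        long sourceOffset = chunkOffset;
        int written = 0;
        int bytesRead = readAt(src, sourceOffset, buf);
        while (bytesRead >= 0 && written < chunkLen) {
            int n = Math.min(bytesRead, chunkLen - written);
            System.arraycopy(buf, 0, out, written, n);
            written += n;
            if (advance) {
                sourceOffset += bytesRead; // the missing update
            }
            bytesRead = readAt(src, sourceOffset, buf);
        }
        return Arrays.copyOf(out, written);
    }

    public static void main(String[] args) {
        byte[] src = "0123456789abcdef".getBytes();
        // Corrected loop: the chunk at offset 8 is copied faithfully.
        System.out.println(new String(copyChunk(src, 8, 8, true)));  // 89abcdef
        // Buggy loop: the same first 4 bytes of the chunk repeat.
        System.out.println(new String(copyChunk(src, 8, 8, false))); // 89ab89ab
    }
}
```

Running the corrected loop yields the chunk's actual bytes, while the buggy loop produces repeated leading bytes for any chunk with offset != 0 -- exactly the data/checksum mismatch the ticket reports.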