[
https://issues.apache.org/jira/browse/HDFS-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301992#comment-15301992
]
Hadoop QA commented on HDFS-6937:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 33 new + 529 unchanged - 4 fixed = 562 total (was 533) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 25 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 9s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 56s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDatanodeDeath |
| | hadoop.hdfs.server.datanode.TestDiskError |
| | hadoop.hdfs.TestDFSClientExcludedNodes |
| | hadoop.hdfs.TestPipelineRecovery |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDecommissionWithStriped |
| | hadoop.hdfs.TestAsyncDFSRename |
| | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.hdfs.tools.TestDebugAdmin |
| | hadoop.hdfs.TestFileAppend |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806312/HDFS-6937.001.patch |
| JIRA Issue | HDFS-6937 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux cd955b145953 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 77202fa |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
> Another issue in handling checksum errors in write pipeline
> -----------------------------------------------------------
>
> Key: HDFS-6937
> URL: https://issues.apache.org/jira/browse/HDFS-6937
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs-client
> Affects Versions: 2.5.0
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Attachments: HDFS-6937.001.patch
>
>
> Given a write pipeline:
> DN1 -> DN2 -> DN3
> DN3 detected a checksum error and terminated; DN2 truncated its replica to
> the ACKed size. Then a new pipeline was attempted as
> DN1 -> DN2 -> DN4
> DN4 detected a checksum error again. Later, when DN4 was replaced with DN5
> (and so on), the write failed for the same reason. This led to the
> observation that DN2's data is corrupted.
> The software currently truncates DN2's replica to the ACKed size after DN3
> terminates, but it doesn't check the correctness of the data already written
> to disk.
> So intuitively, a solution would be: when the downstream DN (DN3 here) finds
> a checksum error, it propagates this info back to the upstream DN (DN2 here);
> DN2 then checks the correctness of the data already written to disk and
> truncates the replica to MIN(correctDataSize, ACKedSize).
> This issue is similar to what was reported in HDFS-3875, and the truncation
> at DN2 was actually introduced as part of the HDFS-3875 solution.
> Filing this jira for the issue reported here. HDFS-3875 was filed by
> [~tlipcon], who proposed something similar there:
> {quote}
> if the tail node in the pipeline detects a checksum error, then it returns a
> special error code back up the pipeline indicating this (rather than just
> disconnecting)
> if a non-tail node receives this error code, then it immediately scans its
> own block on disk (from the beginning up through the last acked length). If
> it detects a corruption on its local copy, then it should assume that it is
> the faulty one, rather than the downstream neighbor. If it detects no
> corruption, then the faulty node is either the downstream mirror or the
> network link between the two, and the current behavior is reasonable.
> {quote}
> Thanks.
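
The truncation rule proposed above, MIN(correctDataSize, ACKedSize), can be sketched as follows. This is only an illustration under stated assumptions, not the code in HDFS-6937.001.patch: the class and method names are hypothetical, the replica is held in memory as a byte array, and java.util.zip.CRC32 over fixed-size chunks stands in for the DataNode's on-disk checksum format.

```java
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names come from the HDFS code base.
public class ReplicaTruncationSketch {

    /** Computes one CRC32 per chunk; a stand-in for the replica's stored checksums. */
    static long[] checksums(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the number of leading bytes whose chunks all match the stored checksums. */
    static long correctDataSize(byte[] onDisk, long[] stored, int bytesPerChecksum) {
        long[] actual = checksums(onDisk, bytesPerChecksum);
        int n = Math.min(actual.length, stored.length);
        for (int i = 0; i < n; i++) {
            if (actual[i] != stored[i]) {
                // Everything before the first bad chunk is considered correct.
                return (long) i * bytesPerChecksum;
            }
        }
        return Math.min((long) onDisk.length, (long) n * bytesPerChecksum);
    }

    /** Length to truncate the replica to: MIN(correctDataSize, ACKedSize). */
    static long truncationLength(byte[] onDisk, long[] stored,
                                 int bytesPerChecksum, long ackedSize) {
        return Math.min(correctDataSize(onDisk, stored, bytesPerChecksum), ackedSize);
    }

    /** True if corruption lies within the ACKed range, i.e. the local copy is the faulty one. */
    static boolean isLocalReplicaFaulty(byte[] onDisk, long[] stored,
                                        int bytesPerChecksum, long ackedSize) {
        return correctDataSize(onDisk, stored, bytesPerChecksum) < ackedSize;
    }

    public static void main(String[] args) {
        int bytesPerChecksum = 512;
        byte[] replica = new byte[1024];
        for (int i = 0; i < replica.length; i++) {
            replica[i] = (byte) i;
        }
        long[] stored = checksums(replica, bytesPerChecksum);

        replica[700] ^= 0x01; // simulate corruption inside the second chunk on DN2's disk

        System.out.println("truncate to: "
            + truncationLength(replica, stored, bytesPerChecksum, 1024L));
        System.out.println("local replica faulty: "
            + isLocalReplicaFaulty(replica, stored, bytesPerChecksum, 1024L));
    }
}
```

In this sketch a node that finds corruption inside its own ACKed range would treat itself as the faulty one, matching the quoted HDFS-3875 proposal; if the scan is clean, the existing truncate-to-ACKed-size behavior is unchanged.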