[
https://issues.apache.org/jira/browse/HDFS-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876509#comment-14876509
]
Hadoop QA commented on HDFS-9106:
---------------------------------
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 18m 4s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 8m 4s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 21s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 23s | The applied patch generated 1 new checkstyle issues (total was 82, now 83). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 192m 12s | Tests failed in hadoop-hdfs. |
| | | 238m 24s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12761154/HDFS-9106-poc.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 88d89267 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12544/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12544/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12544/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12544/console |
This message was automatically generated.
> Transfer failure during pipeline recovery causes permanent write failures
> -------------------------------------------------------------------------
>
> Key: HDFS-9106
> URL: https://issues.apache.org/jira/browse/HDFS-9106
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Attachments: HDFS-9106-poc.patch
>
>
> When a new node is added to a write pipeline during flush/sync and the
> partial block transfer to that node fails, the write fails permanently
> without retrying the transfer or continuing with the datanodes already in
> the pipeline.
> The transfer often fails in busy clusters due to timeouts. There is no
> per-packet ACK between the client and the datanode, or between the source
> and target datanodes, so the whole transfer is judged against a single
> deadline: if the total transfer time exceeds the configured timeout plus 10
> seconds of slack (2 * 5 seconds), the transfer is considered failed.
> Naturally, the failure rate is higher with bigger block sizes.
> I propose the following changes (sketched below):
> - The transfer timeout needs to be separate from the per-packet timeout.
> - The transfer should be retried if it fails.
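To make the proposal concrete, here is a minimal, illustrative sketch of the two changes. It is not taken from HDFS-9106-poc.patch, and the names (BlockTransfer, TRANSFER_TIMEOUT_MS, MAX_TRANSFER_RETRIES, transferWithRetry) are hypothetical rather than actual HDFS identifiers.

{code:java}
import java.io.IOException;

// Illustrative sketch only: BlockTransfer, TRANSFER_TIMEOUT_MS,
// MAX_TRANSFER_RETRIES and transferWithRetry are hypothetical names,
// not identifiers from the HDFS codebase.
public class PipelineRecoveryRetrySketch {

  /** Stand-in for the partial block transfer to the newly added datanode. */
  interface BlockTransfer {
    void run(long timeoutMs) throws IOException;
  }

  // A transfer copies up to a full block, so it gets its own, longer deadline
  // instead of reusing the per-packet socket timeout plus the fixed slack.
  static final long TRANSFER_TIMEOUT_MS = 5 * 60_000L;
  static final int MAX_TRANSFER_RETRIES = 3;

  /** Retries the transfer a few times before giving up on the write. */
  static void transferWithRetry(BlockTransfer transfer) throws IOException {
    IOException lastFailure = null;
    for (int attempt = 1; attempt <= MAX_TRANSFER_RETRIES; attempt++) {
      try {
        transfer.run(TRANSFER_TIMEOUT_MS);
        return; // success: the write continues on the new pipeline
      } catch (IOException e) {
        lastFailure = e; // e.g. a timeout on a busy cluster; try again
      }
    }
    // Only after the retries are exhausted does the write fail; continuing
    // with the datanodes already in the pipeline would be the other option.
    throw lastFailure;
  }
}
{code}

In the actual client the retry would presumably live in the pipeline-recovery path of the output stream, where the transfer to the newly added datanode is initiated.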
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)