[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543529#comment-15543529 ] Yongjun Zhang commented on HDFS-10714: --

Hi [~vinayrpet],

Thanks for your work here, and sorry for the late reply. I agree with Kihwal's comment that "Guessing who is faulty is complicated". The idea of the patch described at https://issues.apache.org/jira/browse/HDFS-10714?focusedCommentId=15467559=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15467559 seems to be a reasonable approach, but I have questions about the following scenarios.

Initial pipeline: DN1 -> DN2 -> DN3

Scenario 1. Detecting a possible network issue with DN2's output:
1.1 DN3 reports a checksum error.
1.2 DN2 checks itself and sees it is good.
1.3 DN3 is treated as bad and replaced with DN4.
1.4 The pipeline becomes DN1 -> DN2 -> DN4.
1.5 DN4 reports a checksum error.
1.6 DN2 is treated as bad.
Question 1: do we allow DN3 and DN4 to be added back as available DNs for later recovery? In theory we should.

Scenario 2. Detecting data corruption at DN2 (this is like what's reported in HDFS-6937):
2.1 DN3 reports a checksum error.
2.2 DN2 checks itself, sees it is bad, and reports a checksum error.
2.3 DN1 checks itself and sees it is good.
2.4 DN2 is treated as bad.
Question 2: is this how it works? Do we add DN3 back as available for later recovery?

Scenario 3.
3.1 DN3 reports a checksum error.
3.2 DN2 checks itself and sees it is bad.
3.3 DN1 checks itself and sees it is bad.
3.4 DN1 is treated as bad.
Question 3: is this how it works? And do we keep DN2 and DN3 available for use by later recovery?

Thanks.
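Read one way, Scenarios 2 and 3 imply a simple blame rule: walk upstream from the reporting node while self-checks keep failing, and blame the most upstream node in that failing run; if the reporter's immediate upstream is clean, blame the reporter itself (the first strike of Scenario 1). A minimal, hypothetical sketch of that rule follows (the standalone class and names are illustrative, not the actual DataStreamer code):

```java
import java.util.List;

/** Hypothetical sketch of the blame rule implied by Scenarios 1-3 above. */
class BlameRule {
    /**
     * pipeline: ordered DN names; selfCheckBad[i]: whether pipeline node i
     * found its own replica corrupt. The reporter is the last node. Walk
     * upstream from the reporter: the most upstream node in a consecutive
     * run of bad self-checks is blamed; if the reporter's immediate upstream
     * is clean, the reporter itself is blamed.
     */
    static String blame(List<String> pipeline, boolean[] selfCheckBad) {
        int reporter = pipeline.size() - 1;
        int bad = reporter;
        for (int i = reporter - 1; i >= 0 && selfCheckBad[i]; i--) {
            bad = i; // corruption originated at or above this node
        }
        return pipeline.get(bad);
    }
}
```

Under this reading, Scenario 2 blames DN2 and Scenario 3 blames DN1, matching steps 2.4 and 3.4.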
> Issue in handling checksum errors in write pipeline when fault DN is
> LAST_IN_PIPELINE
> -------------------------------------------------------------------
>
>                 Key: HDFS-10714
>                 URL: https://issues.apache.org/jira/browse/HDFS-10714
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: HDFS-10714-01-draft.patch
>
> We had come across one issue where a write failed even though 7 DNs were
> available, due to a network fault at one datanode which is LAST_IN_PIPELINE.
> It is similar to HDFS-6937.
> Scenario: (DN3 has N/W fault and min repl = 2).
> Write pipeline:
> DN1->DN2->DN3 => DN3 gives ERROR_CHECKSUM ack, and so DN2 is marked as bad.
> DN1->DN4->DN3 => DN3 gives ERROR_CHECKSUM ack, and so DN4 is marked as bad.
> ....
> And so on (all the times DN3 is LAST_IN_PIPELINE)... continued till no more
> datanodes are left to construct the pipeline.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542658#comment-15542658 ] Kihwal Lee commented on HDFS-10714: ---

+1 on the DN remembering what it did during a recovery and being more adaptive. Unconditionally removing DN3 first might not be a good idea, since it is the only one that did checksum verification. The data on this node up to the ACKed bytes is very likely good (though it could still have wrong data on disk). In the majority of cases I have analyzed in the past, this would hurt more than help. Sure, it might be at fault, but it seems too harsh to remove it first.

Perhaps instead of statically removing one node, the DN should perform further diagnostics. The client could try different node orderings in the pipeline before removing any node. We could also add a feature to tell all DNs in the pipeline to do checksum verification in the middle of a block write (is a per-packet switch possible?). If the errors from these propagate properly to the client, it will be able to make a more informed decision and avoid blaming the wrong node.

Of course, this won't be perfect either. We also see checksum problems during dfs writes stemming from faulty clients; clients hitting OOMs are the most common ones. These are irrecoverable.

While we are on the subject of write pipelines, the transferBlock op during replication is even worse, since the ACK is practically turned off. A node with a faulty NIC can do some damage there. But that's outside the scope of this jira.
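The "try different node orderings" idea could look roughly like the sketch below: generate the rotations of the current pipeline and retry each; if the checksum error follows one node regardless of its position, that node is the likely culprit. This is a hedged illustration with hypothetical names, not an actual client API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: candidate pipeline orderings to try before removing any node. */
class PipelineOrderings {
    /**
     * Return every rotation of the pipeline. Each rotation puts a different
     * node last, so each node gets a turn receiving (and verifying) the data
     * from a different upstream neighbor.
     */
    static List<List<String>> rotations(List<String> pipeline) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>(pipeline);
        for (int i = 0; i < pipeline.size(); i++) {
            result.add(new ArrayList<>(current));
            Collections.rotate(current, 1); // shift every node one slot downstream
        }
        return result;
    }
}
```

Rotations keep the number of retries linear in the pipeline size, whereas trying all permutations would be factorial; for a 3-node pipeline either is cheap.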
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541527#comment-15541527 ] Vinayakumar B commented on HDFS-10714: --

Hi [~yzhangal], [~kihwal], does the selected approach look fine to you?
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494082#comment-15494082 ] Vinayakumar B commented on HDFS-10714: --

bq. In HDFS-6937 case, if DN3 gives ERROR_CHECKSUM error, DN3 will be replaced. But here DN2 got replaced. Would you please add some code snippet to explain how that happened? thanks.

At first, only DN3 will be marked bad and replaced, and a reference to DN2 as the sender during the previous checksum error will be kept. If a checksum error is found again on DN4 (which replaced DN3), then DN2 will be marked as BAD, provided DN2's local replica was found valid both times. Here is the code snippet.

{code}
+    int currentBad = badNodeIndex;
+    /*
+     * When a checksum error is found during the transfer of packets, finding
+     * out the actual faulty node is tricky. So follow the steps below.
+     * 1. First remove the node which reported the CHECKSUM error as bad,
+     *    and keep track of it.
+     * 2. If a CHECKSUM error is reported a second time and the sender is the
+     *    same as earlier, this time the sender will be removed instead of
+     *    the reporter.
+     */
+    if (checkSumError && badNodeIndex > 0) {
+      if (prevChecksumErrorSenderNode != null) {
+        // If the same node is involved in a second checksum error, then it is
+        // clear that the sender is the faulty node.
+        if (prevChecksumErrorSenderNode.equals(nodes[badNodeIndex - 1])) {
+          badNodeIndex = badNodeIndex - 1;
+          errorState.setBadNodeIndex(badNodeIndex);
+          prevChecksumErrorSenderNode = nodes[badNodeIndex - 1];
+          LOG.warn("Bad node is changed to " + nodes[badNodeIndex]
+              + " instead of " + nodes[currentBad]
+              + " as this node caused checksum error in previous pipeline");
+        }
+      } else {
+        prevChecksumErrorSenderNode = nodes[badNodeIndex - 1];
+        LOG.warn("Bad node is : " + nodes[badNodeIndex]);
+      }
+    }
{code}
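The "local replica found valid" precondition amounts to the DN re-reading its replica and recomputing per-chunk checksums. Below is a minimal sketch assuming plain CRC32 over fixed-size chunks; HDFS actually uses its DataChecksum machinery against the block's .meta file, and the class and method names here are hypothetical:

```java
import java.util.zip.CRC32;

/** Hypothetical sketch of per-chunk checksum verification of a local replica. */
class ReplicaVerifier {
    /**
     * Recompute a CRC32 for each bytesPerChecksum-sized chunk of the replica
     * and compare against the expected values (as a real DN compares against
     * its .meta file). Returns true only when every chunk matches.
     */
    static boolean verify(byte[] replica, long[] expected, int bytesPerChecksum) {
        for (int i = 0; i < expected.length; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, replica.length - off);
            CRC32 crc = new CRC32();
            crc.update(replica, off, len);
            if (crc.getValue() != expected[i]) {
                return false; // corrupt chunk: this DN should report itself bad
            }
        }
        return true;
    }
}
```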
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471097#comment-15471097 ] Yongjun Zhang commented on HDFS-10714: --

Hi [~brahmareddy] and [~vinayrpet],

Thanks for working on this. While visiting this issue (per the discussion in HDFS-6937), I noticed:
{quote}
DN1->DN2->DN3 => DN3 Gives ERROR_CHECKSUM ack. And so DN2 marked as bad
{quote}
In the HDFS-6937 case, if DN3 gives an ERROR_CHECKSUM error, DN3 will be replaced. But here DN2 got replaced. Would you please add a code snippet to explain how that happened? Thanks.
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469269#comment-15469269 ] Vinayakumar B commented on HDFS-10714: --

The current patch contains some parts of the HDFS-6937 patch.
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467787#comment-15467787 ] Hadoop QA commented on HDFS-10714: --

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 8s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 24s | trunk passed |
| +1 | compile | 1m 41s | trunk passed |
| +1 | checkstyle | 0m 33s | trunk passed |
| +1 | mvnsite | 1m 42s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| +1 | findbugs | 3m 52s | trunk passed |
| +1 | javadoc | 1m 25s | trunk passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 35s | the patch passed |
| +1 | compile | 1m 44s | the patch passed |
| +1 | javac | 1m 44s | the patch passed |
| -0 | checkstyle | 0m 33s | hadoop-hdfs-project: The patch generated 5 new + 167 unchanged - 2 fixed = 172 total (was 169) |
| +1 | mvnsite | 1m 32s | the patch passed |
| +1 | mvneclipse | 0m 23s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | findbugs | 4m 1s | the patch passed |
| +1 | javadoc | 1m 18s | the patch passed |
| +1 | unit | 1m 2s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 60m 6s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 92m 42s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.fs.viewfs.TestViewFsHdfs |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10714 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827198/HDFS-10714-01-draft.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b999d7a6cba6 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 62a9667 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/16645/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/16645/artifact/patchprocess/whitespace-eol.txt |
| unit |
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404103#comment-15404103 ] Kihwal Lee commented on HDFS-10714: ---

Guessing who is faulty is complicated. Whatever works for one case can break other cases. It is not safe to remove the last guy, since it might be the only one that has valid data up to the ACKed bytes in certain cases. Maybe we should have the client examine datanodes individually before recreating a pipeline:
- Have each datanode perform checksum validation up to the ACKed bytes (needs to be bytes-per-checksum aligned).
- For identifying a bad node, directly write to N nodes for some number of packets. Exclude failed nodes and rebuild a pipeline.
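The bytes-per-checksum alignment in the first step just rounds the ACKed length down to a chunk boundary, since a datanode can only verify whole checksum chunks. A small sketch (the helper name is hypothetical; 512 is the usual dfs.bytes-per-checksum default):

```java
/** Hypothetical helper: round the ACKed byte count down to a checksum-chunk boundary. */
class ChecksumAlign {
    /**
     * A datanode can only verify whole checksum chunks, so validation must
     * stop at the last chunk boundary at or below the ACKed length.
     */
    static long alignedVerifyLength(long ackedBytes, int bytesPerChecksum) {
        return (ackedBytes / bytesPerChecksum) * bytesPerChecksum;
    }
}
```

For example, with 1,000,000 ACKed bytes and 512-byte chunks, verification would cover 999,936 bytes; an already-aligned length is returned unchanged.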
[jira] [Commented] (HDFS-10714) Issue in handling checksum errors in write pipeline when fault DN is LAST_IN_PIPELINE
[ https://issues.apache.org/jira/browse/HDFS-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403322#comment-15403322 ] Brahma Reddy Battula commented on HDFS-10714: -

Thinking of solutions like these:
1) Remove both DNs in the checksum error case, i.e. DN2 and DN3.
2) Remove DN3 first and record DN2 as a suspect node. If the write still fails with a checksum error, then DN2 can be removed in the next pipeline, as it was already suspected.

I think the 2nd solution would be safe. Any thoughts on this? cc [~kanaka] / [~vinayrpet]
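The 2nd solution amounts to a tiny piece of suspect-tracking state, sketched below as a hypothetical standalone class (not the actual DataStreamer code):

```java
import java.util.Objects;

/** Hypothetical sketch of the "record a suspect, remove on second strike" rule. */
class SuspectTracker {
    private String prevChecksumErrorSender; // sender during the previous checksum error, if any

    /**
     * Decide which node to remove when `reporter` raises a checksum error and
     * `sender` (its upstream neighbor) verified its own replica as good.
     * First strike: remove the reporter and remember the sender as a suspect.
     * Second strike with the same sender: remove the sender instead.
     */
    String decideBadNode(String sender, String reporter) {
        if (Objects.equals(prevChecksumErrorSender, sender)) {
            return sender; // same sender implicated twice: it is the faulty node
        }
        prevChecksumErrorSender = sender;
        return reporter;
    }
}
```

In the scenario from the issue description (DN3 with the network fault last in the pipeline), the first error removes DN3 and remembers DN2; when the error repeats with DN4 in DN3's place, DN2's repeated involvement would instead correctly stop the cascade of good nodes being removed — which is exactly the behavior this rule is meant to fix relative to today's "always blame the sender" logic.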