[ 
https://issues.apache.org/jira/browse/HADOOP-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597279#action_12597279
 ] 

Raghu Angadi commented on HADOOP-3035:
--------------------------------------

+1. A few minor comments:

- Unit test: our usual approach is to wait in a loop, sleeping for a short 
time (500 millisec) in each iteration. That way the test normally finishes 
faster, yet can still tolerate unexpected (and unavoidable) platform-related 
delays.
- The test does not belong in TestDatanodeBlockScanner.
- You could log before invoking reportBadBlocks().
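
To illustrate the first point, here is a minimal sketch of the wait-in-a-loop 
pattern. It is not code from the patch: the class name PollingWait, the method 
waitFor, and its parameters are hypothetical, and it uses Java 8's 
BooleanSupplier for brevity. The test would pass a condition that checks 
whatever state it is waiting on.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop or of the attached patches.
public class PollingWait {

    /**
     * Re-checks the condition every intervalMs milliseconds until it holds
     * or timeoutMs milliseconds have elapsed.
     *
     * @return true if the condition became true before the timeout
     */
    public static boolean waitFor(BooleanSupplier condition,
                                  long timeoutMs,
                                  long intervalMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;               // fast path: stop as soon as it holds
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;              // overall timeout reached
            }
            // Sleep one short interval, but never past the deadline.
            Thread.sleep(Math.min(intervalMs, remainingMs));
        }
    }
}
```

A test could then call something like waitFor(check, 30000, 500) and fail only 
if it returns false: the short interval keeps the common case fast, while the 
longer overall timeout absorbs slow platforms.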


> Data nodes should inform the name-node about block crc errors.
> --------------------------------------------------------------
>
>                 Key: HADOOP-3035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: lohit vijayarenu
>         Attachments: HADOOP-3035-1.patch, HADOOP-3035-2.patch
>
>
> Currently, if a CRC error occurs when a data-node replicates a block to 
> another node, it throws an exception and continues.
> {code}
>     [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode 
> (DataNode.java:transferBlocks(811)) - 127.0.0.1:3730 Starting thread to 
> transfer block blk_-1962819020391742554 to 127.0.0.1:3740
>     [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode 
> (DataNode.java:writeBlock(1067)) - Receiving block blk_-1962819020391742554 
> src: /127.0.0.1:3791 dest: /127.0.0.1:3740
>     [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode 
> (DataNode.java:receiveBlock(2504)) - Exception in receiveBlock for block 
> blk_-1962819020391742554 java.io.IOException: Unexpected checksum mismatch 
> while writing blk_-1962819020391742554 from /127.0.0.1
>     [junit] 2008-03-17 19:46:11,871 INFO  dfs.DataNode 
> (DataNode.java:run(2626)) - 127.0.0.1:3730:Transmitted block 
> blk_-1962819020391742554 to /127.0.0.1:3740
>     [junit] 2008-03-17 19:46:11,871 INFO  dfs.DataNode 
> (DataNode.java:writeBlock(1192)) - writeBlock blk_-1962819020391742554 
> received exception java.io.IOException: Unexpected checksum mismatch while 
> writing blk_-1962819020391742554 from /127.0.0.1
>     [junit] 2008-03-17 19:46:11,871 ERROR dfs.DataNode 
> (DataNode.java:run(979)) - 127.0.0.1:3740:DataXceiver: java.io.IOException: 
> Unexpected checksum mismatch while writing blk_-1962819020391742554 from 
> /127.0.0.1
>     [junit]     at 
> org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveChunk(DataNode.java:2246)
>     [junit]     at 
> org.apache.hadoop.dfs.DataNode$BlockReceiver.receivePacket(DataNode.java:2416)
>     [junit]     at 
> org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:2474)
>     [junit]     at 
> org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1173)
>     [junit]     at 
> org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:956)
>     [junit]     at java.lang.Thread.run(Thread.java:595)
> {code}
> The data-node should report the error to the name-node so that the corrupted 
> replica can be removed and the block re-replicated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
