[ https://issues.apache.org/jira/browse/HADOOP-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576482#action_12576482 ]

Koji Noguchi commented on HADOOP-2976:
--------------------------------------

Dhruba mentioned:

"On lease expiry, the namenode should check each block of the file to
determine if any of them are below their intended replication. If so, those
blocks should be inserted into the neededReplication queue."
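
A minimal sketch of what that check could look like inside FSNamesystem
(helper names such as countContainingNodes and the exact
neededReplications.add signature are assumptions for illustration, not the
actual patch):

  // Hypothetical sketch: on lease expiry, re-queue any blocks of the
  // abandoned file that have fewer replicas than intended.
  void checkReplicationOnLeaseExpiry(INodeFile pendingFile) {
    short expected = pendingFile.getReplication();
    for (Block b : pendingFile.getBlocks()) {
      int live = countContainingNodes(b);  // replicas currently in the blockMap
      if (live < expected) {
        // Without this, a block whose write pipeline partially failed is
        // never re-examined: replication is normally checked only at file
        // close, and these files never get closed.
        neededReplications.add(b, live, expected);
      }
    }
  }

That would cover exactly the case below, where blk_-7848645760735416126 ended
up with a single replica on 11.1.111.111 and never entered the
neededReplication queue.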

> Blocks staying underreplicated (for unclosed file)
> --------------------------------------------------
>
>                 Key: HADOOP-2976
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2976
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.3
>            Reporter: Koji Noguchi
>            Priority: Minor
>             Fix For: 0.17.0
>
>
> We had two files that stayed under-replicated for over a day.
> I verified that the under-replicated blocks were not corrupted.
> (Both were task tmp files and most likely were never closed.)
> Taking one file, /aaa/_task_200803040823_0001_r_000421_0/part-00421, the
> namenode log showed:
> namenode.log.2008-03-04 2008-03-04 16:19:21,478 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /aaa/_task_200803040823_0001_r_000421_0/part-00421. blk_-7848645760735416126
> 2008-03-04 16:19:24,357 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 11.1.111.111:22222 is added to blk_-7848645760735416126
> On the first datanode, 11.1.111.111, the log showed:
> 2008-03-04 16:19:24,358 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-7848645760735416126 from /55.55.55.55 and operation failed at /22.2.222.22
> On the second datanode, 22.2.222.22, the log showed:
> 2008-03-04 16:19:21,578 INFO org.apache.hadoop.dfs.DataNode: Exception writing to mirror 33.3.33.33
> java.net.SocketException: Connection reset
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveChunk(DataNode.java:1333)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:1386)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:938)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:804)
>   at java.lang.Thread.run(Thread.java:619)
> 2008-03-04 16:19:24,358 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.net.SocketException: Broken pipe
>   at java.net.SocketOutputStream.socketWrite0(Native Method)
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>   at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:1394)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:938)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:804)
>   at java.lang.Thread.run(Thread.java:619)
