[jira] [Commented] (HDFS-9289) check genStamp when complete file

Chang Li (JIRA) Sat, 24 Oct 2015 08:40:07 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972655#comment-14972655
 ]


Chang Li commented on HDFS-9289:
--------------------------------

Hi [~zhz], here is the log,
{code}
INFO hdfs.StateChange: BLOCK* allocateBlock: 
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028.
 BP-1052427332-98.138.108.146-1350583571998 
blk_3773617405_1106111498065{blockUCState=UNDER_CONSTRUCTION, 
primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-0a28b82a-e3fb-4e42-b925-e76ebd98afb4:NORMAL:10.216.32.61:1004|RBW],
 
ReplicaUnderConstruction[[DISK]DS-236c19ee-0a39-4e53-9520-c32941ca1828:NORMAL:10.216.70.49:1004|RBW],
 
ReplicaUnderConstruction[[DISK]DS-fc7c2dab-9309-46be-b5c0-52be8e698591:NORMAL:10.216.70.43:1004|RBW]]}
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO 
namenode.FSNamesystem: 
updatePipeline(block=BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065,
 newGenerationStamp=1106111511603, newLength=107761275, 
newNodes=[10.216.70.49:1004, 10.216.70.43:1004], 
clientName=DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1)
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO 
namenode.FSNamesystem: 
updatePipeline(BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065)
 successfully to 
BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111511603
2015-10-20 19:49:20,400 [IPC Server handler 96 on 8020] INFO hdfs.StateChange: 
DIR* completeFile: 
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028
 is closed by DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1
{code}
You can see that the file is completed right after a pipeline update. The block 
changed its genStamp from blk_3773617405_1106111498065 to 
blk_3773617405_1106111511603, but the two nodes in the updated pipeline are then 
marked as corrupt. When I run fsck, it shows:
{code} 
hdfs fsck 
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028
Connecting to namenode via http://uraniumtan-nn1.tan.ygrid.yahoo.com:50070
FSCK started by hdfs (auth:KERBEROS_SSL) from /98.138.131.190 for path 
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028
 at Wed Oct 21 15:04:56 UTC 2015
.
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028:
 CORRUPT blockpool BP-1052427332-98.138.108.146-1350583571998 block 
blk_3773617405

/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028:
  Replica placement policy is violated for 
BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065. Block 
should be additionally replicated on 1 more rack(s).
{code}
fsck still reports the block with the old genStamp, 
blk_3773617405_1106111498065.
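
To make the proposed check concrete, here is a minimal sketch of comparing the 
genStamp at commit time. It is only an illustration under stated assumptions: 
the names GenStampCheck, BlockId, canCommit and commitBlock are hypothetical and 
are not the actual NameNode classes or the code in the attached patches. The 
values are taken from the log above.

```java
import java.io.IOException;

// Hypothetical, simplified model; NOT the real HDFS classes or the patch.
final class GenStampCheck {

    /** Minimal stand-in for a block identity: block ID plus generation stamp. */
    static final class BlockId {
        final long id;
        final long genStamp;

        BlockId(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }

        @Override
        public String toString() {
            return "blk_" + id + "_" + genStamp;
        }
    }

    /** True only if the reported block matches the stored one in ID and genStamp. */
    static boolean canCommit(BlockId stored, BlockId reported) {
        return stored.id == reported.id && stored.genStamp == reported.genStamp;
    }

    /** Rejects a commit whose genStamp differs from the one the NameNode holds. */
    static void commitBlock(BlockId stored, BlockId reported) throws IOException {
        if (!canCommit(stored, reported)) {
            throw new IOException("Commit of " + reported
                + " rejected: stored block is " + stored);
        }
        // A real implementation would mark the block as committed here.
    }
}
```

With the stored block at blk_3773617405_1106111511603 (after updatePipeline), a 
commit that still carries blk_3773617405_1106111498065 would be rejected in this 
sketch instead of leaving the updated replicas to be treated as corrupt.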

> check genStamp when complete file
> ---------------------------------
>
>                 Key: HDFS-9289
>                 URL: https://issues.apache.org/jira/browse/HDFS-9289
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Chang Li
>            Assignee: Chang Li
>            Priority: Critical
>         Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a corrupt block caused by a file being completed after a 
> pipelineUpdate, but with the old block genStamp. This caused the replicas 
> on the two datanodes in the updated pipeline to be viewed as corrupt. 
> Propose to check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
