[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972655#comment-14972655
]
Chang Li commented on HDFS-9289:
--------------------------------
Hi [~zhz], here is the log,
{code}
INFO hdfs.StateChange: BLOCK* allocateBlock:
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028.
BP-1052427332-98.138.108.146-1350583571998
blk_3773617405_1106111498065{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[[DISK]DS-0a28b82a-e3fb-4e42-b925-e76ebd98afb4:NORMAL:10.216.32.61:1004|RBW],
ReplicaUnderConstruction[[DISK]DS-236c19ee-0a39-4e53-9520-c32941ca1828:NORMAL:10.216.70.49:1004|RBW],
ReplicaUnderConstruction[[DISK]DS-fc7c2dab-9309-46be-b5c0-52be8e698591:NORMAL:10.216.70.43:1004|RBW]]}
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO
namenode.FSNamesystem:
updatePipeline(block=BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065,
newGenerationStamp=1106111511603, newLength=107761275,
newNodes=[10.216.70.49:1004, 10.216.70.43:1004],
clientName=DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1)
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO
namenode.FSNamesystem:
updatePipeline(BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065)
successfully to
BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111511603
2015-10-20 19:49:20,400 [IPC Server handler 96 on 8020] INFO hdfs.StateChange:
DIR* completeFile:
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028
is closed by DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1
{code}
As the log shows, the file is completed right after a pipeline update. The
update changed the block's genStamp, so blk_3773617405_1106111498065 became
blk_3773617405_1106111511603. However, the two nodes in the updated pipeline
were then marked as corrupt. When I run fsck, it shows
{code}
hdfs fsck
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028
Connecting to namenode via http://uraniumtan-nn1.tan.ygrid.yahoo.com:50070
FSCK started by hdfs (auth:KERBEROS_SSL) from /98.138.131.190 for path
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028
at Wed Oct 21 15:04:56 UTC 2015
.
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028:
CORRUPT blockpool BP-1052427332-98.138.108.146-1350583571998 block
blk_3773617405
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028:
Replica placement policy is violated for
BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065. Block
should be additionally replicated on 1 more rack(s).
{code}
Note that fsck still reports the block with the old genStamp,
blk_3773617405_1106111498065.
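To make the proposed check concrete, here is a minimal, self-contained sketch of comparing genStamps at commit time. It is not the code from the attached patches and does not use the real HDFS classes: BlockId, GenStampCheck and checkCommit are hypothetical names made up for illustration, and the sketch assumes the NameNode holds the post-updatePipeline genStamp while the commit request carries the one being reported.

```java
/** Hypothetical, simplified stand-in for a block identity (not the HDFS Block class). */
final class BlockId {
    final long blockId;
    final long genStamp;

    BlockId(long blockId, long genStamp) {
        this.blockId = blockId;
        this.genStamp = genStamp;
    }

    @Override
    public String toString() {
        return "blk_" + blockId + "_" + genStamp;
    }
}

/** Hypothetical helper illustrating a genStamp check when a block is committed. */
final class GenStampCheck {
    private GenStampCheck() {}

    /**
     * Rejects a commit whose reported block does not match the stored block.
     *
     * @param stored   the block as recorded after the last pipeline update
     * @param reported the block sent with the commit / completeFile request
     * @throws IllegalStateException if the block ID or genStamp differs
     */
    static void checkCommit(BlockId stored, BlockId reported) {
        if (stored.blockId != reported.blockId) {
            throw new IllegalStateException(
                "Block ID mismatch: stored " + stored + ", reported " + reported);
        }
        if (stored.genStamp != reported.genStamp) {
            // A stale genStamp here is the situation described in this issue.
            throw new IllegalStateException(
                "Generation stamp mismatch: stored " + stored + ", reported " + reported);
        }
    }
}
```

With the values from the log above, a commit reporting blk_3773617405_1106111498065 against a stored blk_3773617405_1106111511603 would be rejected instead of leaving the up-to-date replicas looking corrupt. Whether the real fix should throw, log, or retry is left open here.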
> check genStamp when complete file
> ---------------------------------
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a corrupt block caused by a file being completed after a
> pipelineUpdate, but with the old block genStamp. As a result, the replicas
> on the two datanodes in the updated pipeline were treated as corrupt. We
> propose checking the genStamp when committing the block.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)