[ 
https://issues.apache.org/jira/browse/HDFS-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133245#comment-17133245
 ] 

huhaiyang commented on HDFS-15391:
----------------------------------

{quote}
2020-06-04 18:32:11,561 ERROR 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
on operation CloseOp [length=0, inodeId=0, path=xxxxpath, replication=3, 
mtime=1591266620287, atime=1591264800229, blockSize=134217728, 
blocks=[blk_11382006007_10353346830, blk_11382023760_10353365201, 
blk_11382041307_10353383098, blk_11382049845_10353392031, 
blk_11382057341_10353399899, blk_11382071544_10353415171, 
blk_11382080753_10354157480], permissions=dw_water:rd:rw-r--r--, 
aclEntries=null, clientName=, clientMachine=, overwrite=false, 
storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_CLOSE, txid=126060943585]
 java.io.IOException: File is not under construction: hdfs://xxxxpath
{quote}
Related edit log transactions:

{noformat}
1. TXID=126060182153 OP_TRUNCATE time=1591266465492(2020-06-04 18:27:45)

NEWLENGTH=868460715
blocks: ... 
<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>103364934</NUM_BYTES><GENSTAMP>10354049310</GENSTAMP>

2. TXID=126060182170 OP_REASSIGN_LEASE

3. TXID=126060308267 OP_CLOSE
<MTIME>1591266492080</MTIME> 2020-06-04 18:28:12
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks: 
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>63154347</NUM_BYTES><GENSTAMP>10354049316</GENSTAMP>

4. TXID=126060311503 OP_APPEND

5. TXID=126060313001 OP_UPDATE_BLOCKS
blocks: 
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>63154347</NUM_BYTES><GENSTAMP>10354071495</GENSTAMP>

6. TXID=126060921401 OP_REASSIGN_LEASE

7. TXID=126060942290 OP_CLOSE
<MTIME>1591266619003</MTIME> 2020-06-04 18:30:19
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks: 
...<BLOCK_ID>{color:red}11382080753{color}</BLOCK_ID><NUM_BYTES>{color:red}63154347{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>

8. TXID=126060942548 OP_SET_GENSTAMP_V2

<GENSTAMPV2>{color:red}10354157480{color}</GENSTAMPV2>

9. TXID=126060942549 OP_TRUNCATE
<NEWLENGTH>868460715</NEWLENGTH>
<TIMESTAMP>1591266619207</TIMESTAMP> 2020-06-04 18:30:19
blocks: 
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>{color:red}108764672{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>

10. TXID={color:red}126060943585{color} OP_CLOSE
<MTIME>1591266620287</MTIME> 2020-06-04 18:30:20
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks: 
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>{color:red}63154347{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>
{noformat}
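
The values highlighted in the listing above can be cross-checked with a small script. This is only a rough sketch: the records are copied by hand from the listing for block 11382080753, the helper names are hypothetical, and the UTC+8 offset is an assumption inferred from the wall-clock annotations rather than something the edit log states. It converts the epoch-millisecond fields to wall-clock time and reports consecutive records that carry the same GENSTAMP but a different NUM_BYTES.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the wall-clock annotations in the listing are in UTC+8.
UTC8 = timezone(timedelta(hours=8))


def to_wall_clock(epoch_ms):
    """Convert an epoch-millisecond value (MTIME, ATIME, TIMESTAMP) to a UTC+8 string."""
    return datetime.fromtimestamp(epoch_ms // 1000, tz=UTC8).strftime("%Y-%m-%d %H:%M:%S")


# (txid, opcode, NUM_BYTES, GENSTAMP) for block 11382080753, copied from the listing.
LAST_BLOCK = [
    (126060182153, "OP_TRUNCATE", 103364934, 10354049310),
    (126060308267, "OP_CLOSE", 63154347, 10354049316),
    (126060313001, "OP_UPDATE_BLOCKS", 63154347, 10354071495),
    (126060942290, "OP_CLOSE", 63154347, 10354157480),
    (126060942549, "OP_TRUNCATE", 108764672, 10354157480),
    (126060943585, "OP_CLOSE", 63154347, 10354157480),
]


def same_genstamp_different_length(records):
    """Return txid pairs of consecutive records sharing a GENSTAMP but not NUM_BYTES."""
    return [
        (prev[0], curr[0])
        for prev, curr in zip(records, records[1:])
        if prev[3] == curr[3] and prev[2] != curr[2]
    ]


if __name__ == "__main__":
    print(to_wall_clock(1591266619003))  # MTIME of transaction 7
    for pair in same_genstamp_different_length(LAST_BLOCK):
        print("same GENSTAMP, different NUM_BYTES:", pair)
```

With the records above, the check flags the pairs (126060942290, 126060942549) and (126060942549, 126060943585), which are the transactions marked in red.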




> Standby NameNode due loads the corruption edit log, the service exits and 
> cannot be restarted
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15391
>                 URL: https://issues.apache.org/jira/browse/HDFS-15391
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.2.0
>            Reporter: huhaiyang
>            Priority: Critical
>
> In a production cluster running version 3.2.0, we found that edit log 
> corruption prevented the Standby NameNode from loading the edit log, which 
> caused the service to exit abnormally and fail to restart.
> {noformat}
> The specific scenario: Flink writes a replicated file to HDFS and, when the 
> write hits an exception, performs the following operations:
> 1. close file
> 2. open file
> 3. truncate file
> 4. append file
> {noformat}


