[ https://issues.apache.org/jira/browse/HDFS-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133245#comment-17133245 ]
huhaiyang commented on HDFS-15391:
----------------------------------
{quote}
2020-06-04 18:32:11,561 ERROR
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception
on operation CloseOp [length=0, inodeId=0, path=xxxxpath, replication=3,
mtime=1591266620287, atime=1591264800229, blockSize=134217728,
blocks=[blk_11382006007_10353346830, blk_11382023760_10353365201,
blk_11382041307_10353383098, blk_11382049845_10353392031,
blk_11382057341_10353399899, blk_11382071544_10353415171,
blk_11382080753_10354157480], permissions=dw_water:rd:rw-r--r--,
aclEntries=null, clientName=, clientMachine=, overwrite=false,
storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_CLOSE, txid=126060943585]
java.io.IOException: File is not under construction: hdfs://xxxxpath
{quote}
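The mtime and atime fields in the CloseOp above are epoch milliseconds. As a minimal sketch for reading them, the snippet below converts them to wall-clock time. The UTC+8 offset and the helper name ms_to_local are assumptions made for this illustration; they are not taken from the cluster configuration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster logs in UTC+8; adjust the offset if that is wrong.
ASSUMED_TZ = timezone(timedelta(hours=8))


def ms_to_local(ms, tz=ASSUMED_TZ):
    """Convert an edit-log timestamp (epoch milliseconds) to a readable string."""
    # Integer division drops the millisecond part before conversion.
    return datetime.fromtimestamp(ms // 1000, tz=tz).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    # Values copied from the failing CloseOp (txid=126060943585).
    print("mtime:", ms_to_local(1591266620287))
    print("atime:", ms_to_local(1591264800229))
```

Under the UTC+8 assumption, the mtime falls about two minutes before the ERROR line was logged.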
Related edit log transactions:
{noformat}
1. TXID=126060182153 OP_TRUNCATE time=1591266465492(2020-06-04 18:27:45)
NEWLENGTH=868460715
blocks: ...
<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>103364934</NUM_BYTES><GENSTAMP>10354049310</GENSTAMP>
2. TXID=126060182170 OP_REASSIGN_LEASE
3. TXID=126060308267 OP_CLOSE
<MTIME>1591266492080</MTIME> 2020-06-04 18:28:12
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks:
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>63154347</NUM_BYTES><GENSTAMP>10354049316</GENSTAMP>
4. TXID=126060311503 OP_APPEND
5. TXID=126060313001 OP_UPDATE_BLOCKS
blocks:
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>63154347</NUM_BYTES><GENSTAMP>10354071495</GENSTAMP>
6. TXID=126060921401 OP_REASSIGN_LEASE
7. TXID=126060942290 OP_CLOSE
<MTIME>1591266619003</MTIME> 2020-06-04 18:30:19
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks:
...<BLOCK_ID>{color:red}11382080753{color}</BLOCK_ID><NUM_BYTES>{color:red}63154347{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>
8. TXID=126060942548 OP_SET_GENSTAMP_V2
<GENSTAMPV2>{color:red}10354157480{color}</GENSTAMPV2>
9. TXID=126060942549 OP_TRUNCATE
<NEWLENGTH>868460715</NEWLENGTH>
<TIMESTAMP>1591266619207</TIMESTAMP> 2020-06-04 18:30:19
blocks:
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>{color:red}108764672{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>
10. TXID={color:red}126060943585{color} OP_CLOSE
<MTIME>1591266620287</MTIME> 2020-06-04 18:30:20
<ATIME>1591264800229</ATIME> 2020-06-04 18:00:00
blocks:
...<BLOCK_ID>11382080753</BLOCK_ID><NUM_BYTES>{color:red}63154347{color}</NUM_BYTES><GENSTAMP>{color:red}10354157480{color}</GENSTAMP>
{noformat}
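The NUM_BYTES values recorded for blk_11382080753 can be cross-checked against NEWLENGTH with simple arithmetic. The sketch below is an illustration only. It assumes that the six blocks before blk_11382080753 are full (blockSize=134217728 each), which the elided block lists do not confirm. The names FULL_BLOCKS and file_length are made up for this example.

```python
BLOCK_SIZE = 134217728   # blockSize from the failing CloseOp
FULL_BLOCKS = 6          # assumption: the six earlier blocks are all full
NEW_LENGTH = 868460715   # NEWLENGTH from both OP_TRUNCATE transactions


def file_length(last_block_bytes, full_blocks=FULL_BLOCKS, block_size=BLOCK_SIZE):
    """File length implied by a given size of the last block."""
    return full_blocks * block_size + last_block_bytes


# Last-block NUM_BYTES as recorded in the final three file operations.
last_block_by_txid = {
    126060942290: ("OP_CLOSE", 63154347),
    126060942549: ("OP_TRUNCATE", 108764672),
    126060943585: ("OP_CLOSE", 63154347),
}

if __name__ == "__main__":
    for txid, (op, num_bytes) in last_block_by_txid.items():
        length = file_length(num_bytes)
        # A delta of 0 means the implied file length already equals NEWLENGTH.
        print(txid, op, "length:", length, "delta vs NEWLENGTH:", length - NEW_LENGTH)
```

If the assumption holds, the OP_CLOSE at TXID=126060942290 already leaves the file at exactly NEWLENGTH, while the OP_TRUNCATE that follows records a last block 45610325 bytes larger. Whether this mismatch is why the final OP_CLOSE finds the file not under construction during replay is not established here.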
> Standby NameNode due loads the corruption edit log, the service exits and
> cannot be restarted
> ---------------------------------------------------------------------------------------------
>
> Key: HDFS-15391
> URL: https://issues.apache.org/jira/browse/HDFS-15391
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.2.0
> Reporter: huhaiyang
> Priority: Critical
>
> In a production cluster running version 3.2.0, we found that, due to edit
> log corruption, the Standby NameNode could not load the edit log properly,
> which caused the service to exit abnormally and fail to restart.
> {noformat}
> The specific scenario is that Flink writes to HDFS (a replicated file), and
> when a write to the file hits an exception, the following operations are
> performed:
> 1. close file
> 2. open file
> 3. truncate file
> 4. append file
> {noformat}