[
https://issues.apache.org/jira/browse/HBASE-22641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877590#comment-16877590
]
Brahma Reddy Battula commented on HBASE-22641:
----------------------------------------------
{quote}6. The Namenode detects the file that needs to be created already exists.
{quote}
IMO, this is not expected to happen, unless the retry cache is disabled
(dfs.namenode.enable.retrycache) *OR* the same request arrives after the cache
entry has expired (the expiry time is configurable; the default is 10 min).
We maintain a retry cache for *non-idempotent* operations, keyed by
callID+clientID, so a retried request gets its response from the cache. This
covers the following two cases:
1. A client makes a request. The operation completes on the namenode. The
client does not get the response and retries the request.
2. A client makes a request. The operation is still in progress on the
namenode. The client gets disconnected for some reason and retries the request.
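
To illustrate case 1 and the expiry caveat, here is a minimal toy sketch in Java. It is not HDFS code: the class RetryCacheSketch, its create() method, and the string key built from clientID and callID are all hypothetical names chosen for this example, and it does not model case 2 (a retry arriving while the first attempt is still in progress).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a retry cache in front of a non-idempotent CREATE. Not HDFS code. */
class RetryCacheSketch {
    /** Recorded outcome of a completed call and the time at which it expires. */
    static final class Entry {
        final boolean success;
        final long expiresAtMillis;

        Entry(boolean success, long expiresAtMillis) {
            this.success = success;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>(); // key: clientID + ":" + callID
    private final Set<String> files = new HashSet<>();        // stand-in for namespace metadata
    private final boolean enabled;
    private final long expiryMillis;

    RetryCacheSketch(boolean enabled, long expiryMillis) {
        this.enabled = enabled;
        this.expiryMillis = expiryMillis;
    }

    /** Returns true if the file was created (or a cached success is replayed), false for "already exists". */
    boolean create(String clientId, int callId, String path, long nowMillis) {
        String key = clientId + ":" + callId;
        if (enabled) {
            Entry cached = cache.get(key);
            if (cached != null && nowMillis < cached.expiresAtMillis) {
                return cached.success; // retry of a completed call: replay the recorded response
            }
        }
        boolean success = files.add(path); // false when the path already exists
        if (enabled) {
            cache.put(key, new Entry(success, nowMillis + expiryMillis));
        }
        return success;
    }

    public static void main(String[] args) {
        long tenMinutes = 10 * 60 * 1000L;
        RetryCacheSketch nn = new RetryCacheSketch(true, tenMinutes);
        System.out.println(nn.create("client-1", 7, "/wal/0001", 0L));                  // first attempt: true
        System.out.println(nn.create("client-1", 7, "/wal/0001", 1_000L));              // retry within expiry: true
        System.out.println(nn.create("client-1", 7, "/wal/0001", tenMinutes + 1_000L)); // retry after expiry: false
    }
}
```

In this sketch, the "file already exists" outcome for a retried call only shows up when the cache is disabled or the retry arrives after the entry has expired, which matches the two exceptions noted above.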
> When the Region Server switches the WAL log, the new WAL file created
> successfully but namenode returns message fails. Then the client retry, but
> namenode return 'file has an exception', the Region Server does not handle
> the exception, and abort itself.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-22641
> URL: https://issues.apache.org/jira/browse/HBASE-22641
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.4
> Reporter: chenwandong
> Priority: Major
> Attachments: image-2019-06-27-21-12-29-757.png
>
>
> !image-2019-06-27-21-12-29-757.png!
> Problem Description
> 1. When HBase's WAL file reaches 128M, the Region Server switches to a new
> WAL file and calls the HDFS client to create it.
> 2. The HDFS client sends a CREATE message to the HDFS namenode through the
> RPC channel.
> 3. The namenode checks and creates the file, and successfully records the
> metadata of the new file.
> 4. At this point, because of a brief network interruption on the namenode,
> the namenode fails to respond to the HDFS client.
> 5. Since the HDFS client does not receive a response, it waits for a while
> and sends the CREATE request again.
> 6. The namenode detects that the file to be created already exists.
> 7. The namenode returns a file-already-exists exception (IOException) to the
> HDFS client.
> 8. After HBase receives the returned exception, it does not handle it, and
> the Region Server aborts.