chenwandong created HBASE-22641:
-----------------------------------
Summary: When the Region Server rolls the WAL, the new WAL file is
created successfully but the namenode's response is lost. The client then
retries, the namenode returns a 'file already exists' exception, and the Region
Server does not handle the exception and aborts itself.
Key: HBASE-22641
URL: https://issues.apache.org/jira/browse/HBASE-22641
Project: HBase
Issue Type: Bug
Affects Versions: 1.3.4
Reporter: chenwandong
Attachments: image-2019-06-27-21-12-29-757.png
!image-2019-06-27-21-12-29-757.png!
Problem Description
1. When an HBase WAL file reaches its 128 MB limit, the Region Server rolls to
a new WAL file and calls the HDFS client to create it.
2. The HDFS client sends a CREATE message to the HDFS namenode through the RPC
channel.
3. The namenode checks the request, creates the file, and successfully records
the metadata of the new file.
4. At this point, because of a brief network interruption on the namenode, the
namenode fails to respond to the HDFS client.
5. Because the HDFS client receives no response, it waits for a while and then
sends the CREATE request again.
6. The namenode detects that the file to be created already exists.
7. The namenode returns a file-already-exists exception (IOException) to the
HDFS client.
8. When HBase receives the exception, it does not handle it and aborts the
Region Server.
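
The sequence above can be reproduced with a small, self-contained Java
simulation. This is only a sketch and uses no HBase or HDFS classes:
ToyNameNode, FileExistsException, ResponseLostException, createNaive and
createTolerant are all hypothetical names invented for illustration. The
"tolerant" variant shows one possible way to handle the exception, assuming the
caller can check that the existing file was created by its own earlier
attempt. It is not the actual fix adopted by HBase.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class WalCreateRetrySketch {

    /** Hypothetical stand-in for the "file already exists" IOException (step 7). */
    public static class FileExistsException extends IOException {
        public FileExistsException(String path) {
            super("file already exists: " + path);
        }
    }

    /** Hypothetical stand-in for a CREATE whose response never arrives (step 4). */
    public static class ResponseLostException extends IOException {
        public ResponseLostException(String path) {
            super("no response for create: " + path);
        }
    }

    /** Toy namenode: records metadata first, then may drop the response. */
    public static class ToyNameNode {
        private final Map<String, String> owners = new HashMap<>();
        private int responsesToDrop;

        public ToyNameNode(int responsesToDrop) {
            this.responsesToDrop = responsesToDrop;
        }

        public void create(String path, String clientName) throws IOException {
            if (owners.containsKey(path)) {
                throw new FileExistsException(path); // steps 6-7
            }
            owners.put(path, clientName);            // step 3: metadata recorded
            if (responsesToDrop > 0) {
                responsesToDrop--;
                throw new ResponseLostException(path); // step 4: response lost
            }
        }

        public String ownerOf(String path) {
            return owners.get(path);
        }
    }

    /** Retries once and lets the "exists" exception propagate (the reported behavior). */
    public static void createNaive(ToyNameNode nn, String path, String client)
            throws IOException {
        try {
            nn.create(path, client);
        } catch (ResponseLostException lost) {
            nn.create(path, client); // step 5: retry, which fails with FileExistsException
        }
    }

    /** Retries once and accepts an existing file if this client created it. */
    public static void createTolerant(ToyNameNode nn, String path, String client)
            throws IOException {
        try {
            nn.create(path, client);
        } catch (ResponseLostException lost) {
            try {
                nn.create(path, client);
            } catch (FileExistsException exists) {
                // Assumption: ownership can be verified. If another client
                // created the file, the exception is still a real error.
                if (!client.equals(nn.ownerOf(path))) {
                    throw exists;
                }
            }
        }
    }

    public static void main(String[] args) {
        String path = "/example/wal.2"; // hypothetical path
        try {
            createNaive(new ToyNameNode(1), path, "rs1");
            System.out.println("naive: created");
        } catch (IOException e) {
            System.out.println("naive: failed with " + e.getMessage());
        }
        try {
            createTolerant(new ToyNameNode(1), path, "rs1");
            System.out.println("tolerant: created");
        } catch (IOException e) {
            System.out.println("tolerant: failed with " + e.getMessage());
        }
    }
}
```

In the naive path the retry surfaces the exception even though the file was
created by the same caller, which corresponds to step 8. The tolerant path
treats that case as success and still fails when a different client owns the
file.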
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)