Dong0829 created HBASE-27614:
--------------------------------
Summary: Region Reopen failure when the openNum has issue
Key: HBASE-27614
URL: https://issues.apache.org/jira/browse/HBASE-27614
Project: HBase
Issue Type: Bug
Reporter: Dong0829
Assignee: Dong0829
We hit this issue after changing the TTL of an HBase table: a lot of regions
kept reopening and a large number of TRSPs (TransitRegionStateProcedure) were
created. After troubleshooting, we found a problem in the region reopen
procedure logic.
During a reopen, the procedure checks the seqNum to confirm whether the region
reopened successfully. If the recorded seqNum has accidentally become larger
than the one backed by the current HFiles and WAL (for example, because of data
loss), the check does not pass and the region goes through an unnecessary
close/open loop.
We should be able to optimize the logic. More details:
In regionOpenedWithoutPersistingToMeta, should we really update the OpenSeqNum
only when the new one is bigger than the old one?
Since the region has already opened, we should update the OpenSeqNum whether it
is bigger or smaller. Otherwise, we should fail the open instead of just
logging a WARN, right?
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/OpenRegionProcedure.java#L81]
The above matters because of
checkReopened([https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java#L312]):
if the seqNum is smaller, the region is returned and keeps reopening. So we
should update the logic in either regionOpenedWithoutPersistingToMeta or
checkReopened to make sure the region reopen works properly when the seqNum is
wrong.
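The interaction described above can be sketched with a small toy model. This is not the HBase code: RegionNode, regionOpened, isReopened and roundsUntilReopened are hypothetical names, and the model assumes that each reopen advances the region's real sequence number by 1. It only illustrates why ignoring a smaller reported openSeqNum can keep the reopen check from passing, and what changes if the reported value is always accepted.

```java
/** Toy model of the reopen check; an illustration, NOT the HBase implementation. */
class ReopenLoopSketch {

    /** Hypothetical stand-in for the master's in-memory state of one region. */
    static final class RegionNode {
        long openSeqNum;

        RegionNode(long openSeqNum) {
            this.openSeqNum = openSeqNum;
        }
    }

    /**
     * Models the open report handling as described in the issue: a reported
     * openSeqNum smaller than the stored one is ignored (only a WARN).
     * With alwaysUpdate = true it models the proposed alternative of
     * accepting the reported value unconditionally.
     */
    static void regionOpened(RegionNode node, long reported, boolean alwaysUpdate) {
        if (alwaysUpdate || reported >= node.openSeqNum) {
            node.openSeqNum = reported;
        }
        // else: the stale, too-large value stays in place
    }

    /** Models the check: reopened only if the seqNum moved past the old one. */
    static boolean isReopened(RegionNode node, long seqNumBeforeReopen) {
        return node.openSeqNum > seqNumBeforeReopen;
    }

    /**
     * Counts close/open rounds until the check passes.
     * Returns -1 if it still has not passed after maxRounds.
     */
    static int roundsUntilReopened(long storedSeqNum, long realSeqNum,
                                   boolean alwaysUpdate, int maxRounds) {
        RegionNode node = new RegionNode(storedSeqNum);
        long current = realSeqNum;
        for (int round = 1; round <= maxRounds; round++) {
            long before = node.openSeqNum;
            current += 1; // assumption: one reopen advances the real seqNum by 1
            regionOpened(node, current, alwaysUpdate);
            if (isReopened(node, before)) {
                return round;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Healthy region: stored and real seqNum agree.
        System.out.println(roundsUntilReopened(5L, 5L, false, 1000));
        // Stored seqNum far ahead of the real one, smaller reports ignored.
        System.out.println(roundsUntilReopened(1048581L, 5L, false, 1000));
        // Same corrupted state, but the reported value is always accepted.
        System.out.println(roundsUntilReopened(1048581L, 5L, true, 1000));
    }
}
```

In this model the healthy region passes on the first round, the corrupted one is still looping after 1000 rounds, and the always-update variant recovers on the second round. Whether the real fix belongs in regionOpenedWithoutPersistingToMeta or in checkReopened is the open question of this issue.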
Reproduce steps:
1. Create a test table, for example {{test}}, and put some data:
{{create 'test', 'info'}}
{{put 'test', 'fool', 'info:cat', 'test'}}
2. Manually update the hbase:meta row of one region of this test table, on the
info:seqnumDuringOpen column, for example:
{{put 'hbase:meta', 'test,,1673406566311.3eb4d3e0258bd06f4639a595920c7673.',
'info:seqnumDuringOpen', "\x00\x00\x00\x00\x00\x10\x00\x05"}}
3. Modify the table TTL:
{{alter 'test', {NAME => 'info', TTL => '63244800'}}}
You will see the region keep reopening.
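To see how large the seqNum written in step 2 is, the byte string can be decoded as a number. The sketch below assumes the info:seqnumDuringOpen cell holds an 8-byte big-endian long, which is the layout HBase's Bytes.toBytes(long) produces. The class and method names are made up for this example.

```java
import java.nio.ByteBuffer;

/** Decodes and encodes an 8-byte big-endian seqNum; names are illustrative only. */
class SeqNumBytesSketch {

    /** Interprets 8 bytes as a big-endian long (ByteBuffer's default byte order). */
    static long decode(byte[] raw) {
        if (raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    /** Produces the 8-byte big-endian form of a long. */
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        // "\x00\x00\x00\x00\x00\x10\x00\x05" from the put in step 2
        byte[] raw = {0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x05};
        System.out.println(decode(raw)); // 0x100005 = 1048581
    }
}
```

A freshly created table with a single put has a sequence number of only a few units, so a stored value of 1048581 is far ahead of anything the region can report after a reopen.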
--
This message was sent by Atlassian Jira
(v8.20.10#820010)