[ https://issues.apache.org/jira/browse/HBASE-30160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dev Hingu reassigned HBASE-30160:
---------------------------------
Assignee: Dev Hingu
> Prevent region creation if the encoded region names are the same
> ----------------------------------------------------------------
>
> Key: HBASE-30160
> URL: https://issues.apache.org/jira/browse/HBASE-30160
> Project: HBase
> Issue Type: Sub-task
> Reporter: Balazs Meszaros
> Assignee: Dev Hingu
> Priority: Major
> Labels: pull-request-available
>
> HBase derives encoded region names by hashing: MD5(tableName,startKey,...).
> With specially crafted start keys, collisions are easy to create:
> {noformat}
> hbase:001:0> create 'table1', 'f', SPLITS =>
> ["\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00^B\xb9\x99\xdb\xb7\x98W\xfa\xa1\xe0\xf1\xbc\x09h]1S[&u*\x93\xa1&RzF\x87\x9e\x970\x84\xe5\xb9\xe3ln*l\x07\x0c\xef\x03\x96Q\xbdC!\xb1\xdec-\xfb+\x11\x83h\xc1\xbe$\x1f\xae\x95\xaf\xd3W\x07\x8a\x01\xfa\xf1\xba\x83\x8c}\xa5A1\x83\xae\xae\xf8\xe6\xf9\xe5F\xa7\xc9\x1a\xfeM\xec\x07\xdem\x0em\x9e\x97\xf4\x16\x08\x94\xa8\x8a87\x07\xb5v\xac\xe7\x07\x10\x22\xfc\xb9\x1fm\xbd\x13V\xa9\xedX\xf0\xb1",
>
> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00^B\xb9\x99\xdb\xb7\x98W\xfa\xa1\xe0\xf1\xbc\x09h]1S[\xa6u*\x93\xa1&RzF\x87\x9e\x970\x84\xe5\xb9\xe3ln*l\x07\x0c\xef\x03\x96\xd1\xbcC!\xb1\xdec-\xfb+\x11\x83h\xc1>$\x1f\xae\x95\xaf\xd3W\x07\x8a\x01\xfa\xf1\xba\x83\x8c}\xa5A1\x83\xae\xae\xf8f\xf9\xe5F\xa7\xc9\x1a\xfeM\xec\x07\xdem\x0em\x9e\x97\xf4\x16\x08\x94\xa8\x8a87\x075w\xac\xe7\x07\x10\x22\xfc\xb9\x1fm\xbd\x13V)\xedX\xf0\xb1"]
> ERROR: The procedure 9 is still running
> For usage try 'help "create"'
> Took 608.8101 seconds
> {noformat}
> Table creation fails because both regions get the same encoded name:
> {noformat}
> 2026-05-13 09:34:23,762 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> [RegionOpenAndInit-table1-pool-2]: creating {ENCODED =>
> 647314dfe2b7e604e08fd7fd3fec44fc, NAME => 'table1,...
> 2026-05-13 09:34:23,764 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> [RegionOpenAndInit-table1-pool-1]: creating {ENCODED =>
> 647314dfe2b7e604e08fd7fd3fec44fc, NAME => 'table1,...
> 2026-05-13 09:34:23,772 WARN org.apache.hadoop.hdfs.DataStreamer:
> [Thread-140]: DataStreamer Exception
> java.io.FileNotFoundException: File does not exist:
> /hbase/data/default/table1/647314dfe2b7e604e08fd7fd3fec44fc/.regioninfo
> (inode 16653) [Lease. Holder: DFSClient_NONMAPREDUCE_1353520776_1, pending
> creates: 3]
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3194)
> at
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:609)
> ...
> {noformat}
> The procedure never finishes and blocks any further attempt to create
> {{table1}}. The same problem should also be reproducible by splitting the
> table twice:
> {noformat}
> split 'table1', 'malicious-key1'
> split 'table1', 'malicious-key2'
> {noformat}
> Replacing MD5 with a different hash would be hard, but we should handle these
> collisions better: check whether any two regions have the same hash and fail
> immediately. Under normal circumstances, the chance of a collision with
> automatic splitting is very low.
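
The fail-fast check proposed in the description could look roughly like the sketch below. It is a standalone illustration, not HBase code: the class name, `encodedName`, and `firstDuplicate` are hypothetical, and the hash input ("tableName,startKey,regionId") is a simplified assumption about how the encoded name is built. The idea is only to compute every encoded name up front and reject the request as soon as two regions map to the same one, before any region directory is created.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration; not part of the HBase API.
final class EncodedNameCollisionCheck {

    // Simplified stand-in for the encoded region name: hex MD5 over
    // "tableName,startKey,regionId" (assumed layout, not taken from HBase).
    static String encodedName(String table, byte[] startKey, long regionId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update((table + ",").getBytes(StandardCharsets.UTF_8));
            md5.update(startKey);
            md5.update(("," + regionId).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Returns the first encoded name shared by two regions, if any.
    static Optional<String> firstDuplicate(String table, List<byte[]> startKeys, long regionId) {
        Set<String> seen = new HashSet<>();
        for (byte[] startKey : startKeys) {
            String encoded = encodedName(table, startKey, regionId);
            if (!seen.add(encoded)) {
                return Optional.of(encoded);
            }
        }
        return Optional.empty();
    }
}
```

A caller would run `firstDuplicate` over the requested split keys and abort with an error if it returns a value, so the procedure fails immediately instead of hanging.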