[ 
https://issues.apache.org/jira/browse/HBASE-29554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18021178#comment-18021178
 ] 

Emil Kleszcz edited comment on HBASE-29554 at 9/18/25 1:56 PM:
---------------------------------------------------------------

I have attached a patch to the compareRows method in MetaCellComparator that 
correctly parses rowkeys containing a single comma and compares them correctly, 
without changing the existing logic for rowkeys of the form _[table],[region 
start key],[region id]_ or _[table]_.
With this change, single-comma rowkeys are compared as plain row bytes, the 
same way as in normal tables. This makes it possible to delete or update rows 
that were wrongly inserted into meta, an insertion that HBase currently 
allows.
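
To make the idea concrete, here is a minimal, self-contained sketch of the comparison rule described above. It is not the attached patch and not the real MetaCellComparator code: the class name MetaRowCompareSketch and its helper methods are hypothetical, and it works on whole byte arrays rather than offsets and lengths. It only illustrates falling back to a plain unsigned byte comparison when either key has exactly one comma, while keeping a field-wise comparison for the other shapes. Whether mixing the two rules yields a consistent ordering across all keys is something the real patch has to ensure; this sketch does not attempt to prove it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: not HBase's MetaCellComparator and not the attached patch. */
final class MetaRowCompareSketch {
    private static final byte DELIMITER = (byte) ',';

    /** Counts the comma delimiters in a row key. */
    static int countDelimiters(byte[] row) {
        int count = 0;
        for (byte b : row) {
            if (b == DELIMITER) {
                count++;
            }
        }
        return count;
    }

    /** Splits on the first and last comma, so a start key may itself contain commas. */
    static byte[][] splitFields(byte[] row) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < row.length; i++) {
            if (row[i] == DELIMITER) {
                if (first < 0) {
                    first = i;
                }
                last = i;
            }
        }
        if (first < 0) {
            return new byte[][] { row }; // [table]
        }
        if (first == last) {
            // Single comma: not used by compareRows, handled only for robustness.
            return new byte[][] {
                Arrays.copyOfRange(row, 0, first),
                Arrays.copyOfRange(row, first + 1, row.length)
            };
        }
        return new byte[][] { // [table],[region start key],[region id]
            Arrays.copyOfRange(row, 0, first),
            Arrays.copyOfRange(row, first + 1, last),
            Arrays.copyOfRange(row, last + 1, row.length)
        };
    }

    /** Field-by-field comparison; a key with fewer fields sorts first on a tie. */
    static int compareStructured(byte[] left, byte[] right) {
        byte[][] l = splitFields(left);
        byte[][] r = splitFields(right);
        int shared = Math.min(l.length, r.length);
        for (int i = 0; i < shared; i++) {
            int c = Arrays.compareUnsigned(l[i], r[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(l.length, r.length);
    }

    /** Single-comma keys are compared as raw bytes; other keys field by field. */
    static int compareRows(byte[] left, byte[] right) {
        if (countDelimiters(left) == 1 || countDelimiters(right) == 1) {
            return Arrays.compareUnsigned(left, right);
        }
        return compareStructured(left, right);
    }

    public static void main(String[] args) {
        byte[] a = "rowkey,".getBytes(StandardCharsets.UTF_8);
        byte[] b = "rowkey,a".getBytes(StandardCharsets.UTF_8);
        System.out.println(compareRows(a, a) == 0);
        System.out.println(compareRows(a, b) < 0);
    }
}
```

The point of the fallback is that a lookup for the exact bytes of a single-comma key compares equal to the stored key, which is what get and delete need in order to find the row.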

Please review and let me know if this makes sense. If so I would be keen to 
provide this patch upstream.


> Corrupted hbase:meta rowkeys cannot be deleted
> ----------------------------------------------
>
>                 Key: HBASE-29554
>                 URL: https://issues.apache.org/jira/browse/HBASE-29554
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.5.10
>            Reporter: Emil Kleszcz
>            Priority: Major
>         Attachments: 
> 0001-HBASE-29554-Fix-MetaCellComparator-to-parse-properly.patch
>
>
> *Context:*
> It is possible to corrupt the {{hbase:meta}} table by inserting rowkeys 
> containing a single comma using commands like:
> {code:java}
> put 'hbase:meta', 'rowkey,', 'info:regioninfo', 'test'{code}
> Example of resulting rows in {{hbase:meta}}:
> {code:java}
> ROW                                                 COLUMN+CELL
>  ,                                                  column=info:regioninfo, timestamp=2025-08-26T21:49:33.427, value=
>  rowkey,                                            column=info:regioninfo, timestamp=2025-08-26T22:05:52.653, value=
>  rowkey,                                            column=info:regioninfo, timestamp=2025-08-26T21:36:56.560, value=
>  rowkey,a                                           column=info:regioninfo, timestamp=2025-08-26T22:29:06.576, value=
> {code}
> *Problem:*
>  * When a rowkey in {{hbase:meta}} contains a single comma, scanners 
> return the rowkeys correctly, but {{get}}, {{delete}}, and 
> {{deleteall}} operations *do not work on these rowkeys*.
>  * This behavior is specific to {{hbase:meta}} and does not occur in user 
> tables.
> *Attempted Workarounds:*
> Tried deleting the corrupted row using the Java HBase client:
> {code:java}
> import org.apache.hadoop.hbase.client.ConnectionFactory
> import org.apache.hadoop.hbase.client.Table
> import org.apache.hadoop.hbase.TableName
> import org.apache.hadoop.hbase.client.Delete
> import org.apache.hadoop.hbase.util.Bytes
> # Open a connection and get a handle on hbase:meta
> connection = ConnectionFactory.createConnection
> table = connection.getTable(TableName.valueOf("hbase:meta"))
> # Target the corrupted single-comma rowkey
> rowKey = Bytes.toBytes("rowkey,")
> delete = Delete.new(rowKey)
> delete.setTimestamp(Long::MAX_VALUE)
> delete.addColumn(Bytes.toBytes("info"), Bytes.toBytes("state"))
> table.delete(delete)
> table.close
> connection.close{code}
> *Result:* Unable to delete the corrupted row.
> It is possible to insert a properly serialized region info for such keys:
> {code:java}
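> import org.apache.hadoop.hbase.HConstants
> import org.apache.hadoop.hbase.client.Put
> import org.apache.hadoop.hbase.client.RegionInfo
> import org.apache.hadoop.hbase.client.RegionInfoBuilder
> rowKey = Bytes.toBytes("rowkey,")
> tableName = TableName.valueOf("TEST:TEST")
> # Build a fake but properly serialized RegionInfo for the value
> builder = RegionInfoBuilder.newBuilder(tableName)
> builder.setStartKey(Bytes.toBytes(""))
> builder.setEndKey(Bytes.toBytes("1"))
> builder.setRegionId(12345)
> fakeRegion = builder.build()
> serializedValue = RegionInfo.toByteArray(fakeRegion)
> put = Put.new(rowKey)
> put.addColumn(HConstants::CATALOG_FAMILY, 
> HConstants::REGIONINFO_QUALIFIER, serializedValue) {code}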
> Resulting row in {{hbase:meta}}:
> {code:java}
>  rowkey,                                            column=info:regioninfo, timestamp=2025-08-27T17:02:21.565, value={ENCODED => a6839d33e016dc75cfb9ac9c74a576c8, NAME => 'TEST:TEST,,1.a6839d33e016dc75cfb9ac9c74a576c8.', STARTKEY => '', ENDKEY => '1'}
> {code}
>  
> *Problems / Risks:*
>  # Corrupted rowkeys with commas in {{hbase:meta}} cannot be fixed or removed 
> on a running cluster.
>  # Allowing insertion of such rowkeys is risky. If insertion is allowed, 
> there must be a mechanism for administrators to clean them up.
>  # {{HBCK2}} does not resolve the issue (e.g., {{fixMeta}} fails).
> *Request / Suggested Action:*
>  * Investigate why rowkeys with a single comma break deletion in 
> {{hbase:meta}}.
>  * Consider adding safeguards to prevent insertion of rowkeys with wrong 
> format into {{hbase:meta}}.
>  * Provide an administrative method to safely remove or repair corrupted meta 
> rowkeys.
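
As an illustration of the safeguard suggested in the request above, the following sketch rejects the one malformed shape reported in this issue before a write is accepted. It is hypothetical: MetaRowKeyGuard and its methods do not exist in HBase, and the rule is an assumption based only on the two rowkey shapes named in this thread ([table] with no comma, and [table],[region start key],[region id] with at least two). A real check would likely also need to validate the region id and the encoded name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical guard, not an HBase API: flags rowkeys with exactly one comma. */
final class MetaRowKeyGuard {

    /** Returns true unless the row key contains exactly one comma. */
    static boolean hasExpectedShape(byte[] row) {
        int commas = 0;
        for (byte b : row) {
            if (b == ',') {
                commas++;
            }
        }
        return commas != 1;
    }

    /** Throws if the row key has the single-comma shape reported in this issue. */
    static void requireExpectedShape(byte[] row) {
        if (!hasExpectedShape(row)) {
            throw new IllegalArgumentException(
                "Unexpected hbase:meta rowkey shape: "
                    + new String(row, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedShape("TEST:TEST,,1".getBytes(StandardCharsets.UTF_8)));
        System.out.println(hasExpectedShape("rowkey,".getBytes(StandardCharsets.UTF_8)));
    }
}
```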



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
