[
https://issues.apache.org/jira/browse/HBASE-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862489#comment-17862489
]
Dieter De Paepe commented on HBASE-28696:
-----------------------------------------
See also: HBASE-28706 (different problem, but concerns this code as well)
> BackupSystemTable can create huge delete batches that should be partitioned
> instead
> -----------------------------------------------------------------------------------
>
> Key: HBASE-28696
> URL: https://issues.apache.org/jira/browse/HBASE-28696
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Ray Mattingly
> Assignee: Ray Mattingly
> Priority: Major
>
> When an incremental backup completes successfully, one of the final steps is
> to delete the bulk load metadata from the system table for every bulk load
> captured in that backup. In effect, this truncates the entire bulk loads
> system table in a single batch of deletes. The logic lives in
> {{BackupSystemTable#deleteBulkLoadedRows}}:
> {code:java}
> /**
>  * Removes rows recording bulk loaded hfiles from backup table
>  * @param rows the rows to be deleted
>  */
> public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
>   try (Table table = connection.getTable(bulkLoadTableName)) {
>     List<Delete> lstDels = new ArrayList<>();
>     for (byte[] row : rows) {
>       Delete del = new Delete(row);
>       lstDels.add(del);
>       LOG.debug("orig deleting the row: " + Bytes.toString(row));
>     }
>     table.delete(lstDels);
>     LOG.debug("deleted " + rows.size() + " original bulkload rows");
>   }
> }
> {code}
> Depending on usage, a large number of bulk loads may run between backups, so
> this design is needlessly fragile. We should partition these deletes into
> bounded batches so that an oversized batch can never erroneously fail a
> backup.
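> A minimal sketch of the partitioned approach, assuming a hypothetical
> DELETE_BATCH_SIZE constant; this illustrates the idea, not the committed fix:
> {code:java}
> // Hypothetical upper bound on deletes per batch; the real fix would
> // likely make this configurable.
> private static final int DELETE_BATCH_SIZE = 1000;
>
> public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
>   try (Table table = connection.getTable(bulkLoadTableName)) {
>     // Issue deletes in bounded chunks instead of one huge batch, so a
>     // single oversized multi-request cannot fail the whole backup.
>     for (int start = 0; start < rows.size(); start += DELETE_BATCH_SIZE) {
>       int end = Math.min(start + DELETE_BATCH_SIZE, rows.size());
>       List<Delete> batch = new ArrayList<>(end - start);
>       for (byte[] row : rows.subList(start, end)) {
>         batch.add(new Delete(row));
>       }
>       table.delete(batch);
>       LOG.debug("Deleted batch of {} bulk load rows", batch.size());
>     }
>   }
> }
> {code}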