Ray Mattingly created HBASE-28696:
-------------------------------------

             Summary: BackupSystemTable can create huge delete batches that 
should be partitioned instead
                 Key: HBASE-28696
                 URL: https://issues.apache.org/jira/browse/HBASE-28696
             Project: HBase
          Issue Type: Bug
            Reporter: Ray Mattingly


When an incremental backup completes successfully, one of the final steps is to 
delete the bulk load metadata from the system table for all bulk loads that were 
captured in that backup. In effect, we truncate the entire bulk loads system 
table in a single batch of deletes. This logic lives in 
{{BackupSystemTable#deleteBulkLoadedRows}}:
{code:java}
/*
 * Removes rows recording bulk loaded hfiles from backup table
 * @param rows the rows to be deleted
 */
public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
  try (Table table = connection.getTable(bulkLoadTableName)) {
    List<Delete> lstDels = new ArrayList<>();
    for (byte[] row : rows) {
      Delete del = new Delete(row);
      lstDels.add(del);
      LOG.debug("orig deleting the row: " + Bytes.toString(row));
    }
    table.delete(lstDels);
    LOG.debug("deleted " + rows.size() + " original bulkload rows");
  }
} {code}
Depending on usage, there may be a very large number of bulk loads between 
backups, so this design is needlessly fragile: a single oversized delete batch 
can fail and take the whole backup down with it. We should partition these 
deletes into bounded batches so that a backup never erroneously fails for this 
reason.
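A minimal sketch of the partitioned approach (the batch-size constant below is hypothetical and only illustrates the shape of the fix; the real limit could be hardcoded or made configurable):
{code:java}
// Sketch only: BULK_LOAD_DELETE_BATCH_SIZE is a hypothetical value, not an existing constant or config.
private static final int BULK_LOAD_DELETE_BATCH_SIZE = 1000;

public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
  try (Table table = connection.getTable(bulkLoadTableName)) {
    List<Delete> batch = new ArrayList<>();
    for (byte[] row : rows) {
      batch.add(new Delete(row));
      if (batch.size() >= BULK_LOAD_DELETE_BATCH_SIZE) {
        // Flush a bounded batch instead of accumulating one huge delete list
        table.delete(batch);
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      // Flush the remainder
      table.delete(batch);
    }
    LOG.debug("deleted {} original bulkload rows", rows.size());
  }
}
{code}
With this shape, a pathological number of bulk loads between backups only changes how many bounded batches we issue, not the size of any single one.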



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
