rmdmattingly commented on code in PR #6089:
URL: https://github.com/apache/hbase/pull/6089#discussion_r1684580792


##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java:
##########
@@ -103,13 +103,14 @@ protected static int getIndex(TableName tbl, List<TableName> sTableList) {
 
   /*
    * Reads bulk load records from backup table, iterates through the records and forms the paths for
-   * bulk loaded hfiles. Copies the bulk loaded hfiles to backup destination
+   * bulk loaded hfiles. Copies the bulk loaded hfiles to backup destination. This method does NOT
+   * clean up the entries in the bulk load system table. Those entries should not be cleaned until
+   * the backup is marked as complete.
    * @param sTableList list of tables to be backed up
-   * @return map of table to List of files
+   * @return the rowkeys of bulk loaded files
    */
   @SuppressWarnings("unchecked")
-  protected Map<byte[], List<Path>>[] handleBulkLoad(List<TableName> sTableList)
-    throws IOException {
+  protected List<byte[]> handleBulkLoad(List<TableName> sTableList) throws IOException {

Review Comment:
   We talked about this a bit offline. We can purge these rowkeys because they are only returned by `handleBulkLoad` if we have bulk loaded the keys in this backup.
   
   Right now an inopportune failure would result in us missing bulk load data on subsequent incremental backups, but with this change an inopportune failure would result in us backing up duplicative files, which should be just a little bit wasteful but otherwise innocuous.
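   To illustrate the deferred-cleanup flow this change moves toward, here is a minimal sketch. The class and method names below are hypothetical stand-ins for the real `IncrementalTableBackupClient` logic, not HBase's actual API: `handleBulkLoad` returns the rowkeys it copied without deleting them, and the system-table rows are only purged once the backup is marked complete, so a failure in between merely re-copies files on the next run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the real HBase implementation) of the
// deferred-cleanup pattern discussed in this review.
public class DeferredCleanupSketch {
  // Stands in for the bulk load system table's rows.
  private final List<byte[]> systemTableRows = new ArrayList<>();

  public DeferredCleanupSketch(List<byte[]> existingRows) {
    systemTableRows.addAll(existingRows);
  }

  // Copies the bulk loaded hfiles (elided here) and returns the rowkeys
  // it handled, WITHOUT removing them from the system table.
  public List<byte[]> handleBulkLoad() {
    return new ArrayList<>(systemTableRows);
  }

  // Called only after the backup is marked complete. A failure before
  // this point leaves the rows in place, so the next incremental backup
  // re-copies the same files (wasteful but innocuous) rather than
  // silently missing bulk load data.
  public void deleteBulkLoadRows(List<byte[]> rowKeys) {
    systemTableRows.removeAll(rowKeys);
  }

  public int remainingRows() {
    return systemTableRows.size();
  }
}
```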



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
