DieterDP-ng commented on code in PR #6088:
URL: https://github.com/apache/hbase/pull/6088#discussion_r1690024086
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java:
##########
@@ -293,7 +293,19 @@ public void updateBackupInfo(BackupInfo info) throws
IOException {
}
try (Table table = connection.getTable(tableName)) {
Put put = createPutForBackupInfo(info);
- table.put(put);
+ try {
+ table.put(put);
+ } catch (Exception e) {
+ // If the BackupInfo update can't be processed, then we should fall
back to
+ // the previous BackupInfo, but also update it to reflect the failure.
+ LOG.error("Failed to update BackupInfo for {}. Marking as failed",
info.getBackupId(), e);
+ BackupInfo legacyInfo = readBackupInfo(info.getBackupId());
+ if (legacyInfo != null) {
+ legacyInfo.setFailedMsg("Failed to update BackupInfo. Error: " +
e.getMessage());
+ table.put(createPutForBackupInfo(legacyInfo));
Review Comment:
I don't think adding this logic here will have any noticable effect. Since
an Exception is thrown in this method, calling code will most likely end up in
`TableBackupClient#failBackup`, which will overwrite the failure message (with
the same message: the one from the Exception).
So in the interest of keeping things simple, I'm pro reverting this specific
change. The limiting of the error message length should already ensure a proper
error gets stored.
I wouldn't oppose some kind of retry mechanism for failed PUTs, but I'd
expect that connection issues are automatically retried when executing PUTs,
and a content-based PUT failure will still fail when retried.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]