[jira] [Commented] (HBASE-28643) An unbounded backup failure message can cause an irrecoverable state for the given backup

Hudson (Jira) Tue, 03 Sep 2024 09:16:21 -0700


    [ https://issues.apache.org/jira/browse/HBASE-28643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878932#comment-17878932 ]


Hudson commented on HBASE-28643:
--------------------------------

Results for branch branch-3
        [build #279 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/279/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/279/General_20Nightly_20Build_20Report/]


(x) {color:red}-1 jdk17 hadoop3 checks{color}
-- For more information [see jdk17 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/279/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> An unbounded backup failure message can cause an irrecoverable state for the given backup
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-28643
>                 URL: https://issues.apache.org/jira/browse/HBASE-28643
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Ray Mattingly
>            Assignee: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> The BackupInfo class has a failedMsg field, which is a string of unbounded 
> length. When a DistCp job fails, its failure message contains all of its 
> source paths, and that message is propagated to the failedMsg field 
> on the given BackupInfo.
> If a DistCp job has enough source paths, the serialized BackupInfo becomes 
> too large to write, and backup status updates are rejected:
> {noformat}
> java.lang.IllegalArgumentException: KeyValue size too large
>         at org.apache.hadoop.hbase.client.ConnectionUtils.validatePut(ConnectionUtils.java:513)
>         at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1095)
>         at org.apache.hadoop.hbase.client.HTable.lambda$put$3(HTable.java:564)
>         at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:187)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:563)
>         at org.apache.hadoop.hbase.backup.impl.BackupSystemTable.updateBackupInfo(BackupSystemTable.java:292)
>         at org.apache.hadoop.hbase.backup.impl.BackupManager.updateBackupInfo(BackupManager.java:376)
>         at org.apache.hadoop.hbase.backup.impl.TableBackupClient.failBackup(TableBackupClient.java:243)
>         at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:317)
>         at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603)
>         at com.hubspot.hbase.recovery.core.backup.BackupManager.lambda$runBackups$2(BackupManager.java:145)
> {noformat}
> Because the backup's state cannot be updated, the client never returns it 
> as a failed backup. As a result, any mechanism designed to repair or clean 
> up failed backups will not work properly.
> I think that a simple fix here would be fine: we should truncate the 
> failedMsg field to a reasonable maximum size.
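
The truncation proposed above could look roughly like the following. This is only a minimal sketch, not the patch committed for HBASE-28643: the class name FailedMsgTruncator, the truncate helper, and the 1000-character cap are all hypothetical and chosen for illustration.

```java
// Hypothetical helper illustrating the proposed fix; not part of HBase.
final class FailedMsgTruncator {
    // Assumed cap, picked for illustration only. Any value comfortably below
    // the client's KeyValue size limit would serve the same purpose.
    static final int MAX_FAILED_MSG_LENGTH = 1000;

    private FailedMsgTruncator() {
    }

    // Returns failedMsg unchanged if it is null or already short enough,
    // otherwise returns its first maxLength characters.
    static String truncate(String failedMsg, int maxLength) {
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must be >= 0");
        }
        if (failedMsg == null || failedMsg.length() <= maxLength) {
            return failedMsg;
        }
        int end = maxLength;
        // Do not cut a surrogate pair in half; drop the dangling high surrogate.
        if (end > 0 && Character.isHighSurrogate(failedMsg.charAt(end - 1))) {
            end--;
        }
        return failedMsg.substring(0, end);
    }

    public static void main(String[] args) {
        // Simulate a DistCp failure message that lists many source paths.
        StringBuilder msg = new StringBuilder("DistCp failed for sources: ");
        for (int i = 0; i < 100_000; i++) {
            msg.append("/example/path/file-").append(i).append(',');
        }
        String bounded = truncate(msg.toString(), MAX_FAILED_MSG_LENGTH);
        System.out.println("original length: " + msg.length());
        System.out.println("bounded length:  " + bounded.length());
    }
}
```

Applying such a helper wherever the failure message is stored on the BackupInfo would keep the status update small enough to be written, at the cost of losing the tail of very long messages.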



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
