Dieter De Paepe created HBASE-28103:
---------------------------------------

             Summary: HBase backup repair stuck after failed delete due to missing S3 credentials
                 Key: HBASE-28103
                 URL: https://issues.apache.org/jira/browse/HBASE-28103
             Project: HBase
          Issue Type: Bug
            Reporter: Dieter De Paepe


I was experimenting with what happens if a user executes `hbase backup delete` without providing S3 credentials.

I started with a backup present in an S3 bucket.

{noformat}
hbase backup history

{ID=backup_1695226626227,Type=FULL,Tables={foo:bar},State=COMPLETE,Start time=Wed Sep 20 16:17:09 UTC 2023,End time=Wed Sep 20 16:17:42 UTC 2023,Progress=100%}
{noformat}
I tried to delete this backup without providing S3 credentials; it failed (as expected).

{noformat}
hbase backup delete -l backup_1695226626227


23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.impl.BackupAdminImpl: 
Delete operation failed, please run backup repair utility to restore backup 
system integrity
java.nio.file.AccessDeniedException: 
s3a://backuprestore-experiments/hbase/backup_1695226626227: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
com.amazonaws.SdkClientException: Unable to load AWS credentials from 
environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY 
(or AWS_SECRET_ACCESS_KEY))
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
    at org.apache.hadoop.fs.s3a.Invoker.onceInTheFuture(Invoker.java:190)
    at 
org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:651)
    at 
org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:430)
    at 
org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:372)
    at 
org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:143)
    at 
org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:264)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3369)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$22(S3AFileSystem.java:3346)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$23(S3AFileSystem.java:3345)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2480)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2499)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3344)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.listStatus(BackupUtils.java:522)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupHLogDir(BackupUtils.java:430)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupBackupData(BackupUtils.java:411)
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackup(BackupAdminImpl.java:229)
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackups(BackupAdminImpl.java:142)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.executeDeleteListOfBackups(BackupCommands.java:627)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.execute(BackupCommands.java:578)
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
    at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS 
Credentials provided by TemporaryAWSCredentialsProvider 
SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to 
load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or 
AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
    at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:216)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1269)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:845)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:794)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
    at 
com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6432)
    at 
com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6404)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5441)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5397)
    at 
com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$12(S3AFileSystem.java:2715)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2706)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2342)
    at 
org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
    at 
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.amazonaws.SdkClientException: Unable to load AWS credentials 
from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and 
AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
    at 
com.amazonaws.auth.EnvironmentVariableCredentialsProvider.getCredentials(EnvironmentVariableCredentialsProvider.java:49)
    at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
    ... 28 more
Delete command FAILED. Please run backup repair tool to restore backup system 
integrity
23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error 
running command-line tool
java.nio.file.AccessDeniedException: 
s3a://backuprestore-experiments/hbase/backup_1695226626227: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
com.amazonaws.SdkClientException: Unable to load AWS credentials from 
environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY 
(or AWS_SECRET_ACCESS_KEY))
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
    at org.apache.hadoop.fs.s3a.Invoker.onceInTheFuture(Invoker.java:190)
    at 
org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:651)
    at 
org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:430)
    at 
org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:372)
    at 
org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:143)
    at 
org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:264)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3369)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$22(S3AFileSystem.java:3346)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$23(S3AFileSystem.java:3345)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2480)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2499)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3344)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.listStatus(BackupUtils.java:522)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupHLogDir(BackupUtils.java:430)
    at 
org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupBackupData(BackupUtils.java:411)
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackup(BackupAdminImpl.java:229)
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackups(BackupAdminImpl.java:142)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.executeDeleteListOfBackups(BackupCommands.java:627)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.execute(BackupCommands.java:578)
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
    at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS 
Credentials provided by TemporaryAWSCredentialsProvider 
SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to 
load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or 
AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
    at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:216)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1269)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:845)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:794)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
    at 
com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6432)
    at 
com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6404)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5441)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5397)
    at 
com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$12(S3AFileSystem.java:2715)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2706)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2342)
    at 
org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
    at 
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.amazonaws.SdkClientException: Unable to load AWS credentials 
from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and 
AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
    at 
com.amazonaws.auth.EnvironmentVariableCredentialsProvider.getCredentials(EnvironmentVariableCredentialsProvider.java:49)
    at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
    ... 28 more
{noformat}
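
For reference, the delete presumably goes through when the S3A jars and credentials are supplied explicitly, using the same invocation pattern as the create command shown further below. This is only a sketch; the jar paths and credential values are placeholders copied from that command:

{noformat}
# Same pattern as the create command below, applied to the delete (values are placeholders):
hbase backup \
  -libjars /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-3.3.6-1-lily.jar,/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-bundle-1.12.367.jar \
  -Dfs.s3a.access.key=... \
  -Dfs.s3a.secret.key=... \
  -Dfs.s3a.session.token=... \
  delete -l backup_1695226626227
{noformat}
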
At this point, I cannot start a new backup because a failed delete command is present:

{noformat}
hbase backup \
  -libjars /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-3.3.6-1-lily.jar,/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-bundle-1.12.367.jar \
  -Dfs.s3a.access.key=... \
  -Dfs.s3a.secret.key=... \
  -Dfs.s3a.session.token=... \
   create incremental s3a://backuprestore-experiments/hbase -t foo:bar 

Found failed backup DELETE coommand. 
Backup system recovery is required.
23/09/20 16:31:16 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error 
running command-line tool
java.io.IOException: Failed backup DELETE found, aborted command execution
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$Command.execute(BackupCommands.java:167)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:309)
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
    at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
{noformat}
However, the backup repair is also unable to complete:

{noformat}
hbase backup repair

REPAIR status: no failed sessions found. Checking failed delete backup 
operation ...
Found failed DELETE operation for: backup_1695226626227
Running DELETE again ...
23/09/20 16:34:13 WARN org.apache.hadoop.hbase.backup.impl.BackupSystemTable: 
Could not restore backup system table. Snapshot snapshot_backup_system does not 
exists.
23/09/20 16:34:13 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error 
running command-line tool
java.io.IOException: There is no active backup exclusive operation
    at 
org.apache.hadoop.hbase.backup.impl.BackupSystemTable.finishBackupExclusiveOperation(BackupSystemTable.java:645)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.repairFailedBackupDeletionIfAny(BackupCommands.java:721)
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.execute(BackupCommands.java:681)
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
    at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
{noformat}
The core issue seems to be the assumption that a "backup exclusive operation" is still active for every failed delete command; when it is not, as in this case, the repair itself fails.

It would also be a good feature to let the repair command drop the pending delete, though I guess that in some cases this may not result in a reliable state if data was already partially deleted.

The workaround in this case would be to delete the failed delete marker from the backup system table, I guess?
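
I have not tried this, but assuming the default backup system table name "backup:system", something along these lines might work from the hbase shell (the row key of the failed-delete marker is a placeholder and would have to be identified from the scan output):

{noformat}
hbase shell

# Look for the row(s) that record the failed DELETE operation
scan 'backup:system'

# Remove the marker row found above (row key is a placeholder)
deleteall 'backup:system', '<row key of the failed delete marker>'
{noformat}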

 


