[ 
https://issues.apache.org/jira/browse/HBASE-27825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell updated HBASE-27825:
----------------------------------------
    Description: 
GCRegionProcedure fails cleaning up old parents after splits. We time out 
“renaming” files into the archive. On S3, a rename operation is a whole-file 
copy operation, so it can take a long time to “rename” a large HFile.
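On a store without a native rename, “rename” degenerates into copy-then-delete, which is the shape of what S3A does (as a server-side object COPY). A minimal local sketch of that pattern, using java.nio.file as a stand-in for the S3A FileSystem:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CopyRename {
    // Emulates a "rename" as copy-then-delete, the shape of the operation
    // S3A performs (as a server-side object COPY). Cost is O(file length),
    // and a failure mid-copy can leave an incomplete destination behind.
    static void renameViaCopy(Path src, Path dst) throws IOException {
        Files.copy(src, dst);   // O(n) in file size; may fail partway through
        Files.delete(src);      // only reached if the copy succeeded
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-demo");
        Path src = dir.resolve("src.hfile");
        Path dst = dir.resolve("dst.hfile");
        Files.writeString(src, "hfile bytes");
        renameViaCopy(src, dst);
        boolean renamed = !Files.exists(src) && Files.exists(dst);
        System.out.println(renamed);  // true
    }
}
```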

{noformat}
[PEWorker-21] backup.HFileArchiver: Failed to archive FileablePath, s3a://[...]
java.net.SocketTimeoutException: copyFile(data/default/cluster_test/[...], 
archive/data/default/cluster_test/[...]) on data/default/cluster_test/[...]: 
com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
{noformat}

Once we fail to “rename” a file into the archive, subsequent attempts also 
fail, because renames on S3 are not atomic: the rename is an object copy 
operation which is neither atomic nor automatically rolled back on failure. 
The incomplete destination object remains present, so the GCRegionProcedure 
can never complete successfully.

{noformat}
org.apache.hadoop.fs.FileAlreadyExistsException: Failed to rename s3a://[...] 
to s3a://[...]; destination file exists
{noformat}

{noformat}
org.apache.hadoop.hbase.backup.FailedArchiveException: Failed to archive/delete 
all the files for region:ddaf1fb41197254483dcfd1d63e869d0 into s3a://[...]. 
Something is probably awry on the filesystem.
{noformat}

Short term mitigations:

In HFileArchiver#resolveAndArchiveFile, if moveAndClose of the current file 
fails, attempt to delete the incomplete archive-side file.
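A hedged sketch of that cleanup, using java.nio.file in place of the Hadoop FileSystem API (moveAndClose and the surrounding HFileArchiver plumbing are elided; archiveWithCleanup is a hypothetical helper, not the actual method):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ArchiveCleanup {
    // Sketch of the short-term mitigation: if the copy-based "rename" into
    // the archive fails, best-effort delete the incomplete archive-side file
    // so a later retry does not trip over FileAlreadyExistsException.
    static boolean archiveWithCleanup(Path current, Path archived) {
        try {
            Files.copy(current, archived);  // stand-in for moveAndClose
            Files.delete(current);
            return true;
        } catch (IOException e) {
            try {
                // Without this, the incomplete destination object persists
                // and every subsequent GCRegionProcedure attempt fails.
                Files.deleteIfExists(archived);
            } catch (IOException suppressed) {
                e.addSuppressed(suppressed);
            }
            return false;
        }
    }
}
```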

Also set the recommended default read timeout for S3A to a larger value.
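For example, in core-site.xml (property name per the Hadoop S3A documentation; the value shown is illustrative, not a recommendation from this issue):

```xml
<property>
  <!-- S3A socket read timeout, in milliseconds. Raising it gives large
       server-side object copies more time to complete before timing out.
       The value here is illustrative only. -->
  <name>fs.s3a.connection.timeout</name>
  <value>200000</value>
</property>
```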

Long term:

When the file based store file tracker is enabled, the archived files for a 
store should no longer be moved to a separate path from the live files in the 
store. Instead, whether a file is archived should be a status bit maintained 
in the tracker manifest.
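A minimal sketch of that idea, with hypothetical names (StoreFileEntry, the archived flag, and the manifest shape are assumptions for illustration, not the actual store file tracker format):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TrackerManifestSketch {
    // Hypothetical manifest entry: archiving flips a status bit instead of
    // copying the file to a separate archive path.
    record StoreFileEntry(String name, boolean archived) {}

    private final Map<String, StoreFileEntry> entries = new LinkedHashMap<>();

    void track(String name) {
        entries.put(name, new StoreFileEntry(name, false));
    }

    // "Archiving" becomes a metadata update: no object copy, so it is cheap
    // and effectively atomic within a single manifest commit.
    void archive(String name) {
        entries.computeIfPresent(name, (n, e) -> new StoreFileEntry(n, true));
    }

    List<String> liveFiles() {
        return entries.values().stream()
            .filter(e -> !e.archived())
            .map(StoreFileEntry::name)
            .toList();
    }
}
```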

  was:
GCRegionProcedure fails cleaning up old parents after splits. We time out 
“renaming” files into the archive. On S3, a rename operation is a whole file 
copy operation. The time required for the S3 operation to complete is O(n) with 
respect to the length of the file.

{noformat}
[PEWorker-21] backup.HFileArchiver: Failed to archive FileablePath, s3a://[...]
java.net.SocketTimeoutException: copyFile(data/default/cluster_test/[...], 
archive/data/default/cluster_test/[...]) on data/default/cluster_test/[...]: 
com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
{noformat}

Once we fail to “rename” the files into the archive we continue to fail because 
renames on S3 are not atomic. They are an object copy operation which is 
neither atomic nor automatically rolled back. The incomplete object remains 
present. The GCRegionProcedure can never complete successfully.

{noformat}
org.apache.hadoop.fs.FileAlreadyExistsException: Failed to rename s3a://[...] 
to s3a://[...]; destination file exists
{noformat}

{noformat}
org.apache.hadoop.hbase.backup.FailedArchiveException: Failed to archive/delete 
all the files for region:ddaf1fb41197254483dcfd1d63e869d0 into s3a://[...]. 
Something is probably awry on the filesystem.
{noformat}

Short term mitigations:

In HFileArchiver#resolveAndArchiveFile, if moveAndClose of the current file 
fails, attempt to delete the incomplete archive side file.

Also set the recommended default read timeout for S3A to a larger value.

Long term:

When the file based store file tracker is enabled, the archived files for a 
store should no longer be moved to a separate path from the live files in the 
store. Instead, whether a file is archived should be a status bit maintained 
in the tracker manifest.


> Store file archiving is not sufficiently indirected through the store file 
> tracker
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-27825
>                 URL: https://issues.apache.org/jira/browse/HBASE-27825
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.5.4
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
