Shawn created HDDS-11784:
----------------------------
Summary: parent directory not found when abort multi-part upload
Key: HDDS-11784
URL: https://issues.apache.org/jira/browse/HDDS-11784
Project: Apache Ozone
Issue Type: Improvement
Components: S3
Affects Versions: 1.4.0
Reporter: Shawn
We observed many open keys (files) in our FSO-enabled Ozone cluster, all of
them incomplete MPU keys.
When I tried to abort one of these MPUs with the S3 CLI as shown below, the
request failed, and the log shows that the parent directory was not found.
```
aws s3api abort-multipart-upload --endpoint 'xxxx' \
  --bucket '2e76bd0f-9682-42c6-a5ce-3e32c5aa37b2' \
  --key 'CACHE.06e656c0-6622-48bb-89c2-39470764b1d0/ALULA_DICKINSON_ORIGINAL_101_EPISODE_DVSAUDIO_EN8CH_DOWNLOAD_FINAL_VDKSN0560101.mov' \
  --upload-id '4103c881-24fa-4992-b7b2-5474f8a7fbaf-113204926929050074'
An error occurred (NoSuchUpload) when calling the AbortMultipartUpload
operation: The specified multipart upload does not exist. The upload ID might
be invalid, or the multipart upload might have been aborted or completed.
```
Exception in the log:
```
NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Abort Multipart Upload Failed: volume: s3v, bucket: 2e76bd0f-9682-42c6-a5ce-3e32c5aa37b2, key: CACHE.06e656c0-6622-48bb-89c2-39470764b1d0/ALULA_DICKINSON_ORIGINAL_101_EPISODE_DVSAUDIO_EN8CH_DOWNLOAD_FINAL_VDKSN0560101.mov
    at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadAbortRequest.validateAndUpdateCache(S3MultipartUploadAbortRequest.java:148)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:402)
    at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:39)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:398)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:587)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:375)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: DIRECTORY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to find parent directory of CACHE.06e656c0-6622-48bb-89c2-39470764b1d0/ALULA_DICKINSON_ORIGINAL_101_EPISODE_DVSAUDIO_EN8CH_DOWNLOAD_FINAL_VDKSN0560101.mov
    at org.apache.hadoop.ozone.om.request.file.OMFileRequest.getParentID(OMFileRequest.java:1038)
    at org.apache.hadoop.ozone.om.request.file.OMFileRequest.getParentID(OMFileRequest.java:988)
    at org.apache.hadoop.ozone.om.request.util.OMMultipartUploadUtils.getMultipartOpenKeyFSO(OMMultipartUploadUtils.java:122)
    at org.apache.hadoop.ozone.om.request.util.OMMultipartUploadUtils.getMultipartOpenKey(OMMultipartUploadUtils.java:99)
    at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadAbortRequest.getMultipartOpenKey(S3MultipartUploadAbortRequest.java:256)
    at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadAbortRequest.validateAndUpdateCache(S3MultipartUploadAbortRequest.java:145)
    ... 9 more
```
This issue is similar to
[HDDS-10630](https://issues.apache.org/jira/browse/HDDS-10630). We should apply
a similar fix here. Without it, these dangling MPUs cannot be cleaned up,
either manually or by the background cleanup service.
We also do not yet know the root cause of the missing parent directories;
this needs further investigation.
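
To illustrate the failure mode in the stack trace, here is a minimal sketch of a toy FSO-style namespace. It is not Ozone code: the class `MpuAbortSketch`, its tables, and all method names are hypothetical, and the fallback shown (recording the parent ID when the upload is initiated) is only one possible direction, not necessarily what HDDS-10630 changed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; every name here is hypothetical and unrelated to the real OM classes. */
public class MpuAbortSketch {
    static final long BUCKET_ID = 1L;

    // Directory table: "parentId/name" -> directory object ID.
    private final Map<String, Long> dirTable = new HashMap<>();
    // Open MPU table: "parentId/fileName/uploadId" -> full key name.
    private final Map<String, String> openMpuTable = new HashMap<>();
    // Assumed extra metadata: upload ID -> parent ID recorded at initiate time.
    private final Map<String, Long> parentIdByUpload = new HashMap<>();
    private long nextId = 100L;

    static String fileName(String key) {
        return key.substring(key.lastIndexOf('/') + 1);
    }

    long mkdir(long parentId, String name) {
        return dirTable.computeIfAbsent(parentId + "/" + name, k -> nextId++);
    }

    void removeDir(long parentId, String name) {
        dirTable.remove(parentId + "/" + name);
    }

    /** Walks the directory components of the key; throws if any one is missing. */
    long resolveParentId(String key) {
        String[] parts = key.split("/");
        long parent = BUCKET_ID;
        for (int i = 0; i < parts.length - 1; i++) {
            Long id = dirTable.get(parent + "/" + parts[i]);
            if (id == null) {
                throw new IllegalStateException("Failed to find parent directory of " + key);
            }
            parent = id;
        }
        return parent;
    }

    /** Creates missing parent directories and registers the open upload. */
    void initiate(String key, String uploadId) {
        String[] parts = key.split("/");
        long parent = BUCKET_ID;
        for (int i = 0; i < parts.length - 1; i++) {
            parent = mkdir(parent, parts[i]);
        }
        openMpuTable.put(parent + "/" + fileName(key) + "/" + uploadId, key);
        parentIdByUpload.put(uploadId, parent);
    }

    /** Abort that re-resolves the parent by path; fails once the directory entry is gone. */
    boolean abortByPath(String key, String uploadId) {
        long parent = resolveParentId(key);
        return openMpuTable.remove(parent + "/" + fileName(key) + "/" + uploadId) != null;
    }

    /** Abort that uses the parent ID recorded at initiate time instead of walking the path. */
    boolean abortByRecordedParent(String key, String uploadId) {
        Long parent = parentIdByUpload.get(uploadId);
        if (parent == null) {
            return false;
        }
        boolean removed =
            openMpuTable.remove(parent + "/" + fileName(key) + "/" + uploadId) != null;
        if (removed) {
            parentIdByUpload.remove(uploadId);
        }
        return removed;
    }

    int openUploads() {
        return openMpuTable.size();
    }

    public static void main(String[] args) {
        MpuAbortSketch om = new MpuAbortSketch();
        om.initiate("CACHE.example/file.mov", "upload-1");
        om.removeDir(BUCKET_ID, "CACHE.example"); // simulate the missing parent directory
        try {
            om.abortByPath("CACHE.example/file.mov", "upload-1");
        } catch (IllegalStateException e) {
            System.out.println("path-based abort failed: " + e.getMessage());
        }
        System.out.println("fallback abort succeeded: "
            + om.abortByRecordedParent("CACHE.example/file.mov", "upload-1"));
    }
}
```

In this model the path-based abort can never reach the open-upload entry once the directory entry is gone, which matches the symptom above: the upload still exists, but the abort reports that it does not.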