[
https://issues.apache.org/jira/browse/HDDS-10395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated HDDS-10395:
-------------------------------
Description:
Found an eTag incompatibility in S3 listParts while testing multipart
uploads with an old S3G and new OMs.
The first issue is in KeyManagerImpl#listParts:
{code:java}
OmPartInfo omPartInfo = new OmPartInfo(partKeyInfo.getPartNumber(),
partName,
partKeyInfo.getPartKeyInfo().getModificationTime(),
partKeyInfo.getPartKeyInfo().getDataSize(),
partKeyInfo.getPartKeyInfo().getMetadataList().stream()
.filter(keyValue -> keyValue.getKey().equals(ETAG))
.findFirst().get().getValue()); {code}
This throws "java.util.NoSuchElementException: No value present" when
the MPU part does not contain an eTag field (parts created before HDDS-9680).
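A minimal sketch of one way to avoid the exception, assuming the part metadata can be treated as a list of key/value pairs: map the Optional instead of calling get(), so that a missing eTag yields null. The class ETagLookup, the method findETag, and the "ETag" key string are hypothetical stand-ins, not the actual Ozone types or necessarily the fix that was committed.

```java
import java.util.List;
import java.util.Map;

class ETagLookup {
  // Assumed metadata key, for illustration only.
  static final String ETAG = "ETag";

  /** Returns the eTag value, or null when the part carries no eTag entry. */
  static String findETag(List<Map.Entry<String, String>> metadata) {
    return metadata.stream()
        .filter(kv -> ETAG.equals(kv.getKey()))
        .findFirst()
        .map(Map.Entry::getValue)
        .orElse(null); // no exception for parts without an eTag
  }

  public static void main(String[] args) {
    // Part that carries an eTag entry.
    System.out.println(findETag(List.of(Map.entry("ETag", "abc")))); // prints "abc"
    // Part without any eTag entry.
    System.out.println(findETag(List.of())); // prints "null"
  }
}
```

Returning null here only moves the problem downstream, so any caller still has to handle the missing value.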
The second issue is that ObjectEndpoint#listParts currently returns only the
MPU part eTag, which might not exist for old MPU parts. This can be resolved by
falling back to partName as the eTag when no eTag is set.
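The fallback could look roughly like the following sketch. PartETagFallback and effectiveETag are hypothetical names, and treating an empty string the same as a missing eTag is an assumption made here, not something stated in this issue.

```java
class PartETagFallback {
  /**
   * Picks the value to report as the part's eTag.
   * Assumption: an empty eTag is treated like a missing one.
   */
  static String effectiveETag(String eTag, String partName) {
    return (eTag == null || eTag.isEmpty()) ? partName : eTag;
  }

  public static void main(String[] args) {
    // New-style part: the stored eTag is used.
    System.out.println(effectiveETag("abc", "part-1")); // prints "abc"
    // Old-style part: no eTag stored, so partName is used instead.
    System.out.println(effectiveETag(null, "part-1")); // prints "part-1"
  }
}
```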
The third issue is an NPE when setETag is called with null (in OmPartInfo#getProto).
This can be resolved with a simple null check.
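As an illustration of the null check, the sketch below uses a hand-written stand-in for a protobuf-generated builder, since generated Java setters throw NullPointerException when given null. PartInfoProtoSketch, its Builder, and toProto are hypothetical and do not mirror the real OmPartInfo code.

```java
import java.util.Objects;

class PartInfoProtoSketch {
  /** Stand-in for a generated builder: the setter rejects null. */
  static final class Builder {
    private String eTag;

    Builder setETag(String value) {
      this.eTag = Objects.requireNonNull(value);
      return this;
    }

    boolean hasETag() {
      return eTag != null;
    }
  }

  /** Sets the eTag only when one is present, leaving the field unset otherwise. */
  static Builder toProto(String eTag) {
    Builder builder = new Builder();
    if (eTag != null) {
      builder.setETag(eTag);
    }
    return builder;
  }

  public static void main(String[] args) {
    System.out.println(toProto("abc").hasETag()); // prints "true"
    System.out.println(toProto(null).hasETag());  // prints "false"
  }
}
```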
For reference, the issue was reproduced with the following MPU script, which
calls the S3 API for MPU manually. It should also be reproducible with
"aws s3 cp" on large files, since that command uses multipart uploads and most
probably calls the listParts API.
{code:bash}
#!/bin/bash
BUCKET_NAME="etag-test-bucket"
KEY_NAME="mpu-key"
ENDPOINT_URL="S3_ENDPOINT"
AWS_ACCESS_KEY_ID="ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY="SECRET_ACCESS_KEY"

# Create three files
dd if=/dev/urandom of=/tmp/part1 bs=1M count=10
dd if=/dev/urandom of=/tmp/part2 bs=1M count=10
dd if=/dev/urandom of=/tmp/part3 bs=1M count=10

# Define a function to conditionally add the endpoint url
function aws_cmd {
  if [[ -z "$ENDPOINT_URL" ]]; then
    AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
      aws s3api "$@"
  else
    AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
      aws s3api --endpoint-url "$ENDPOINT_URL" "$@"
  fi
}

# Start the multipart upload and get the upload ID
UPLOAD_ID=$(aws_cmd create-multipart-upload --bucket $BUCKET_NAME --key $KEY_NAME \
  --query 'UploadId' --output text)
echo "upload id: $UPLOAD_ID"

# Upload the parts
ETAG1=$(aws_cmd upload-part --bucket $BUCKET_NAME --key $KEY_NAME --part-number 1 \
  --body /tmp/part1 --upload-id $UPLOAD_ID --query 'ETag' --output text)
echo "ETAG1: $ETAG1"
ETAG2=$(aws_cmd upload-part --bucket $BUCKET_NAME --key $KEY_NAME --part-number 2 \
  --body /tmp/part2 --upload-id $UPLOAD_ID --query 'ETag' --output text)
echo "ETAG2: $ETAG2"
ETAG3=$(aws_cmd upload-part --bucket $BUCKET_NAME --key $KEY_NAME --part-number 3 \
  --body /tmp/part3 --upload-id $UPLOAD_ID --query 'ETag' --output text)
echo "ETAG3: $ETAG3"

# List the MPU parts (this is where the issue was detected)
aws_cmd list-parts --bucket $BUCKET_NAME --key $KEY_NAME --upload-id $UPLOAD_ID

# aws_cmd abort-multipart-upload --bucket $BUCKET_NAME --key $KEY_NAME --upload-id $UPLOAD_ID
aws_cmd complete-multipart-upload \
  --multipart-upload "Parts=[{ETag=$ETAG1,PartNumber=1},{ETag=$ETAG2,PartNumber=2},{ETag=$ETAG3,PartNumber=3}]" \
  --bucket $BUCKET_NAME --key $KEY_NAME --upload-id $UPLOAD_ID
{code}
was:
Found an eTag incompatibility in S3 listParts while testing multipart
uploads with an old S3G and new OMs.
The first issue is in KeyManagerImpl#listParts:
{code:java}
OmPartInfo omPartInfo = new OmPartInfo(partKeyInfo.getPartNumber(),
partName,
partKeyInfo.getPartKeyInfo().getModificationTime(),
partKeyInfo.getPartKeyInfo().getDataSize(),
partKeyInfo.getPartKeyInfo().getMetadataList().stream()
.filter(keyValue -> keyValue.getKey().equals(ETAG))
.findFirst().get().getValue()); {code}
This throws "java.util.NoSuchElementException: No value present" when
the MPU part does not contain an eTag field (parts created before HDDS-9680).
The second issue is that ObjectEndpoint#listParts currently returns only the
MPU part eTag, which might not exist for old MPU parts. This can be resolved by
falling back to partName as the eTag when no eTag is set.
The third issue is an NPE when setETag is called with null (in OmPartInfo#getProto).
This can be resolved with a simple null check.
> Fix compatibility issue with eTag during MPU listParts
> ------------------------------------------------------
>
> Key: HDDS-10395
> URL: https://issues.apache.org/jira/browse/HDDS-10395
> Project: Apache Ozone
> Issue Type: Improvement
> Components: OM, Ozone Manager, S3
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>