[ 
https://issues.apache.org/jira/browse/HDDS-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-10521:
-------------------------------
    Description: 
A user encountered this error when trying to download a key using the AWS S3 
SDK (version 1.11.415).
{code:java}
ERROR FileOperationServiceImpl - s3Client.getObject invoke error, 
objectKey:<redacted>. (FileOperationServiceImpl.java:438)
java.lang.ArrayIndexOutOfBoundsException: 110
    at com.amazonaws.util.Base16Codec.pos(Base16Codec.java:96)
    at com.amazonaws.util.Base16Codec.decode(Base16Codec.java:87)
    at com.amazonaws.util.Base16Lower.decode(Base16Lower.java:53)
    at com.amazonaws.util.BinaryUtils.fromHex(BinaryUtils.java:48)
    at 
com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1456)
    ...{code}
However, the key could be downloaded using the AWS CLI (s3api):
{code:java}
aws s3api --endpoint <redacted> get-object --bucket <redacted> --key <redacted> 
download.txt
{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 28 Feb 2024 11:06:30 GMT",
    "ContentLength": 1117055,
    "ETag": "\"null\"",
    "CacheControl": "no-cache",
    "ContentType": "application/octet-stream",
    "Expires": "Wed, 13 Mar 2024 08:10:05 GMT",
    "Metadata": {}
} {code}
The problem can be replicated even with the current version by uploading a file 
with ofs and downloading it using the AWS S3 SDK.

It seems that an object without an ETag field can be downloaded using the AWS 
CLI, but not the AWS SDK.

After looking at the AWS SDK code, it seems that the AWS SDK performs a 
post-processing step that validates the ETag field of the downloaded object 
against the object's content. If the ETag field is null, the post-processing 
step skips the validation.
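
The validation described above can be sketched roughly as follows. This is a 
minimal illustration, not the AWS SDK's code: the class EtagCheckSketch and 
its methods fromHex and contentMatchesEtag are hypothetical names, and the 
sketch assumes the ETag is a hex-encoded MD5 digest of the content.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class EtagCheckSketch {

    // Hypothetical helper: decodes a hex string and rejects non-hex input.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Returns true when the content matches the ETag or when the check is
    // skipped because no ETag was returned.
    static boolean contentMatchesEtag(byte[] content, String etag) {
        if (etag == null) {
            return true; // no ETag: nothing to validate against
        }
        byte[] expected = fromHex(etag); // fails for a literal "null" string
        try {
            byte[] actual = MessageDigest.getInstance("MD5").digest(content);
            return Arrays.equals(expected, actual);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        // A missing ETag skips the check.
        System.out.println(contentMatchesEtag(empty, null));
        // The literal string "null" is not valid hex, so decoding fails.
        try {
            contentMatchesEtag(empty, "null");
        } catch (IllegalArgumentException e) {
            System.out.println("decode failed: " + e.getMessage());
        }
    }
}
```

In this sketch, a missing ETag and an ETag of "null" take different paths: 
the first skips the check, the second fails while decoding.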

Currently, S3G returns the string "null" for the ETag field if the ETag field 
does not exist. The AWS SDK presumably cannot parse this string, since an MD5 
hex string is longer than the string "null". This is most probably why there 
is an ArrayIndexOutOfBoundsException.
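
One further observation on the trace: 110 is the character code of 'n', the 
first character of "null". A hex decoder that indexes a lookup table by 
character code would fail in exactly this way. The sketch below illustrates 
that failure mode under that assumption; HexLookupSketch and its table size 
are hypothetical and have not been checked against the SDK source.

```java
import java.util.Arrays;

final class HexLookupSketch {

    // Hypothetical lookup table that only covers character codes up to 'f' (102).
    private static final byte[] DECODED = new byte['f' + 1];

    static {
        Arrays.fill(DECODED, (byte) -1);
        for (int i = 0; i <= 9; i++) {
            DECODED['0' + i] = (byte) i;
        }
        for (int i = 0; i < 6; i++) {
            DECODED['a' + i] = (byte) (10 + i);
            DECODED['A' + i] = (byte) (10 + i);
        }
    }

    // Maps one hex character to its value by direct table lookup.
    static int pos(char c) {
        int p = DECODED[c]; // out of bounds for any character code above 'f'
        if (p < 0) {
            throw new IllegalArgumentException("invalid hex character: " + c);
        }
        return p;
    }

    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) ((pos(hex.charAt(2 * i)) << 4) | pos(hex.charAt(2 * i + 1)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println((int) 'n'); // character code of 'n'
        try {
            fromHex("null");
        } catch (ArrayIndexOutOfBoundsException e) {
            // The message format depends on the Java version.
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```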

One possible solution is to not return the ETag field at all if the key does 
not have an ETag to begin with. This way, the post-processing step in the AWS 
SDK will not validate the MD5 hash.
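
A minimal sketch of the proposed behavior, using a plain map in place of the 
real HTTP response. GetObjectHeadersSketch, buildHeaders, and the "ETag" 
metadata key are hypothetical names chosen for illustration; the actual S3 
Gateway code is structured differently.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class GetObjectHeadersSketch {

    static final String ETAG = "ETag";

    // Builds the response headers for a GetObject call. The ETag header is
    // added only when the key's metadata actually carries an ETag, so no
    // literal "null" value is ever sent to the client.
    static Map<String, String> buildHeaders(Map<String, String> keyMetadata,
                                            long contentLength) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Length", Long.toString(contentLength));
        String etag = keyMetadata.get(ETAG);
        if (etag != null) {
            headers.put(ETAG, "\"" + etag + "\""); // ETag values are quoted
        }
        return headers;
    }

    public static void main(String[] args) {
        // Key written without an ETag (for example, uploaded through ofs).
        System.out.println(buildHeaders(Collections.emptyMap(), 1117055L));
        // Key that has an ETag in its metadata.
        System.out.println(buildHeaders(
            Collections.singletonMap(ETAG, "d41d8cd98f00b204e9800998ecf8427e"), 0L));
    }
}
```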

  was:
A user encountered this error when trying to download a key using the AWS S3 
SDK.
{code:java}
ERROR FileOperationServiceImpl - s3Client.getObject invoke error, 
objectKey:<redacted>. (FileOperationServiceImpl.java:438)
java.lang.ArrayIndexOutOfBoundsException: 110
    at com.amazonaws.util.Base16Codec.pos(Base16Codec.java:96)
    at com.amazonaws.util.Base16Codec.decode(Base16Codec.java:87)
    at com.amazonaws.util.Base16Lower.decode(Base16Lower.java:53)
    at com.amazonaws.util.BinaryUtils.fromHex(BinaryUtils.java:48)
    at 
com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1456)
    ...{code}
However, the key could be downloaded using the AWS CLI (s3api):
{code:java}
aws s3api --endpoint <redacted> get-object --bucket <redacted> --key <redacted> 
download.txt
{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 28 Feb 2024 11:06:30 GMT",
    "ContentLength": 1117055,
    "ETag": "\"null\"",
    "CacheControl": "no-cache",
    "ContentType": "application/octet-stream",
    "Expires": "Wed, 13 Mar 2024 08:10:05 GMT",
    "Metadata": {}
} {code}
The problem can be replicated even with the current version by uploading a file 
with ofs and downloading it using the AWS S3 SDK.

It seems that an object without an ETag field can be downloaded using the AWS 
CLI, but not the AWS SDK.

After looking at the AWS SDK code, it seems that the AWS SDK performs a 
post-processing step that validates the ETag field of the downloaded object 
against the object's content. If the ETag field is null, the post-processing 
step skips the validation.

Currently, S3G returns the string "null" for the ETag field if the ETag field 
does not exist. The AWS SDK presumably cannot parse this string, since an MD5 
hex string is longer than the string "null". This is most probably why there 
is an ArrayIndexOutOfBoundsException.

One possible solution is to not return the ETag field at all if the key does 
not have an ETag to begin with. This way, the post-processing step in the AWS 
SDK will not validate the MD5 hash.


> ETag field should not be returned during GetObject if the key does not 
> contain ETag field
> -----------------------------------------------------------------------------------------
>
>                 Key: HDDS-10521
>                 URL: https://issues.apache.org/jira/browse/HDDS-10521
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: s3gateway
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> A user encountered this error when trying to download a key using the AWS 
> S3 SDK (version 1.11.415).
> {code:java}
> ERROR FileOperationServiceImpl - s3Client.getObject invoke error, 
> objectKey:<redacted>. (FileOperationServiceImpl.java:438)
> java.lang.ArrayIndexOutOfBoundsException: 110
>     at com.amazonaws.util.Base16Codec.pos(Base16Codec.java:96)
>     at com.amazonaws.util.Base16Codec.decode(Base16Codec.java:87)
>     at com.amazonaws.util.Base16Lower.decode(Base16Lower.java:53)
>     at com.amazonaws.util.BinaryUtils.fromHex(BinaryUtils.java:48)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1456)
>     ...{code}
> However, the key could be downloaded using the AWS CLI (s3api):
> {code:java}
> aws s3api --endpoint <redacted> get-object --bucket <redacted> --key 
> <redacted> download.txt
> {
>     "AcceptRanges": "bytes",
>     "LastModified": "Wed, 28 Feb 2024 11:06:30 GMT",
>     "ContentLength": 1117055,
>     "ETag": "\"null\"",
>     "CacheControl": "no-cache",
>     "ContentType": "application/octet-stream",
>     "Expires": "Wed, 13 Mar 2024 08:10:05 GMT",
>     "Metadata": {}
> } {code}
> The problem can be replicated even with the current version by uploading a 
> file with ofs and downloading it using the AWS S3 SDK.
> It seems that an object without an ETag field can be downloaded using the 
> AWS CLI, but not the AWS SDK.
> After looking at the AWS SDK code, it seems that the AWS SDK performs a 
> post-processing step that validates the ETag field of the downloaded 
> object against the object's content. If the ETag field is null, the 
> post-processing step skips the validation.
> Currently, S3G returns the string "null" for the ETag field if the ETag 
> field does not exist. The AWS SDK presumably cannot parse this string, 
> since an MD5 hex string is longer than the string "null". This is most 
> probably why there is an ArrayIndexOutOfBoundsException.
> One possible solution is to not return the ETag field at all if the key does 
> not have an ETag to begin with. This way, the post-processing step in the 
> AWS SDK will not validate the MD5 hash.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
