[ 
https://issues.apache.org/jira/browse/HDDS-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-11532:
-------------------------------
    Description: 
[https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.html]

!image-2024-10-05-15-26-03-339.png|width=492,height=69!

The ListMultipartUploads response is sorted by key and then by initiation time 
(general purpose buckets only).

Currently, Ozone effectively "sorts" the result by upload ID, since the DB key 
for multipartInfoTable is 
"/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/\{OBJECT_ID}". Because all multipart 
uploads for the same key share the DB key prefix 
"/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/", the key-based sorting should 
already be handled (assuming DB keys only contain ASCII characters; see the 
note at the bottom). Therefore, to enforce the time-based sorting, we can sort 
the multipart uploads that have the same key by their initiation time.
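
As a rough illustration of the proposed ordering, the sketch below sorts a list 
of uploads by key name and then by initiation time. The Upload class and the 
sortForListing method are hypothetical names made up for this example and are 
not Ozone classes. Note that String.compareTo compares UTF-16 code units, which 
is only assumed to be adequate here for ASCII key names.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MultipartUploadSortSketch {

  // Hypothetical stand-in for one entry of a ListMultipartUploads response.
  public static final class Upload {
    private final String keyName;
    private final String uploadId;
    private final Instant initiated;

    public Upload(String keyName, String uploadId, Instant initiated) {
      this.keyName = keyName;
      this.uploadId = uploadId;
      this.initiated = initiated;
    }

    public String getKeyName() { return keyName; }
    public String getUploadId() { return uploadId; }
    public Instant getInitiated() { return initiated; }
  }

  // Returns a new list ordered by key name, then by initiation time (ascending).
  public static List<Upload> sortForListing(List<Upload> uploads) {
    List<Upload> sorted = new ArrayList<>(uploads);
    sorted.sort(Comparator.comparing(Upload::getKeyName)
        .thenComparing(Upload::getInitiated));
    return sorted;
  }

  public static void main(String[] args) {
    List<Upload> uploads = Arrays.asList(
        new Upload("key-b", "id-1", Instant.ofEpochSecond(30)),
        new Upload("key-a", "id-2", Instant.ofEpochSecond(20)),
        new Upload("key-a", "id-3", Instant.ofEpochSecond(10)));
    // Uploads of "key-a" come first, the earlier-initiated one ahead of the later one.
    for (Upload u : sortForListing(uploads)) {
      System.out.println(u.getKeyName() + " " + u.getUploadId());
    }
  }
}
```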

Note: Currently the StringCodec uses UTF-8, which encodes characters to a 
different number of bytes depending on the character. Since RocksDB only 
compares bytes, this might cause an unexpected sorting order. However, AFAIK 
UTF-8 always encodes ASCII characters as 1 byte, so if the DB keys are always 
ASCII, the key name sorting might hold. 
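
To make the byte-ordering concern concrete, here is a small sketch that 
compares strings the way an unsigned bytewise comparator would see their UTF-8 
encoding, next to Java's String.compareTo. The class and method names are made 
up for the example, and it assumes the table is ordered by RocksDB's default 
bytewise comparator. For ASCII-only strings the two orders agree; they can 
disagree when a supplementary character (a surrogate pair in Java) is compared 
with a character at or above U+E000.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8OrderSketch {

  // Compares two strings by their UTF-8 bytes, unsigned and lexicographic.
  public static int compareAsUtf8Bytes(String a, String b) {
    return Arrays.compareUnsigned(
        a.getBytes(StandardCharsets.UTF_8),
        b.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    // ASCII-only keys: the byte order and String.compareTo have the same sign.
    String a = "/vol/bucket/key-a/";
    String b = "/vol/bucket/key-b/";
    System.out.println("ASCII orders agree: "
        + (Integer.signum(compareAsUtf8Bytes(a, b))
            == Integer.signum(a.compareTo(b))));

    // U+FF5E (3 UTF-8 bytes) versus U+1F600 (4 UTF-8 bytes, surrogate pair in Java).
    String bmp = "\uFF5E";
    String supplementary = "\uD83D\uDE00";
    System.out.println("Non-ASCII orders agree: "
        + (Integer.signum(compareAsUtf8Bytes(bmp, supplementary))
            == Integer.signum(bmp.compareTo(supplementary))));
  }
}
```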

  was:
[https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.html]

!image-2024-10-05-15-26-03-339.png|width=492,height=69!

The ListMultipartUploads response is sorted by key (for all bucket types) and 
by initiation time (general purpose buckets only).

Currently, Ozone effectively "sorts" the result by upload ID, since the DB key 
for multipartInfoTable is 
"/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/\{OBJECT_ID}". Because all multipart 
uploads for the same key share the DB key prefix 
"/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/", the key-based sorting should 
already be handled (assuming DB keys only contain ASCII characters; see the 
note at the bottom). Therefore, to enforce the time-based sorting, we can sort 
the multipart uploads that have the same key by their initiation time.

Note: Currently the StringCodec uses UTF-8, which encodes characters to a 
different number of bytes depending on the character. Since RocksDB only 
compares bytes, this might cause an unexpected sorting order. However, AFAIK 
UTF-8 always encodes ASCII characters as 1 byte, so if the DB keys are always 
ASCII, the key name sorting might hold. 


> Sort multipart uploads on ListMultipartUploads response
> -------------------------------------------------------
>
>                 Key: HDDS-11532
>                 URL: https://issues.apache.org/jira/browse/HDDS-11532
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: s3gateway
>            Reporter: Ivan Andika
>            Priority: Minor
>         Attachments: image-2024-10-05-15-26-03-339.png
>
>
> [https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.html]
> !image-2024-10-05-15-26-03-339.png|width=492,height=69!
> The ListMultipartUploads response is sorted by key and then by initiation 
> time (general purpose buckets only).
> Currently, Ozone effectively "sorts" the result by upload ID, since the DB 
> key for multipartInfoTable is 
> "/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/\{OBJECT_ID}". Because all 
> multipart uploads for the same key share the DB key prefix 
> "/\{VOLUME_NAME}/\{BUCKET_NAME}/\{KEY_NAME}/", the key-based sorting should 
> already be handled (assuming DB keys only contain ASCII characters; see the 
> note at the bottom). Therefore, to enforce the time-based sorting, we can 
> sort the multipart uploads that have the same key by their initiation time.
> Note: Currently the StringCodec uses UTF-8, which encodes characters to a 
> different number of bytes depending on the character. Since RocksDB only 
> compares bytes, this might cause an unexpected sorting order. However, AFAIK 
> UTF-8 always encodes ASCII characters as 1 byte, so if the DB keys are 
> always ASCII, the key name sorting might hold. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

