[
https://issues.apache.org/jira/browse/HDDS-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze updated HDDS-8238:
-----------------------------
Description:
When using multipart upload to upload a 1000-part key, OM RocksDB needs
733.15 MB of disk space. For a 10,000-part key, it can use 71.5 GB. The disk
space usage is O(n^2) for an n-part key.
MultipartKeyInfo has a list of PartKeyInfo (partKeyInfoList) while each
PartKeyInfo has a KeyInfo and the size of each KeyInfo is ~1.5 KB.
{code:java}
message MultipartKeyInfo {
  required string uploadID = 1;
  ...
  repeated PartKeyInfo partKeyInfoList = 5;
  ...
}
message PartKeyInfo {
  required string partName = 1;
  required uint32 partNumber = 2;
  required KeyInfo partKeyInfo = 3; // the size of a KeyInfo is ~1.5 KB
}
{code}
Multipart upload repeatedly puts the same key, with a MultipartKeyInfo object,
once for each part.
# When uploading the first part, partKeyInfoList has 1 element.
# When uploading the second part, partKeyInfoList has 2 elements.
# When uploading the third part, partKeyInfoList has 3 elements.
...
1000. When uploading the 1000th part, partKeyInfoList has 1000 elements.
Then, the total number of PartKeyInfo objects written across all 1000 puts is
- (1 + 1000)*1000/2 = 500,500 and
- the total size is 1.5 KB * 500,500 = 750,750 KB, i.e. about 733.15 MB.
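The arithmetic above can be checked with a small sketch. It is an illustration only, not Ozone code: the function names are hypothetical, the ~1.5 KB KeyInfo size is the estimate from this description, and it assumes 1 MB = 1024 KB and that each put rewrites the whole partKeyInfoList.

```python
KEY_INFO_KB = 1.5  # approximate KeyInfo size, taken from the description


def entries_by_simulation(n_parts: int) -> int:
    """Count PartKeyInfo entries written if the i-th put rewrites a list of i entries."""
    total = 0
    list_len = 0
    for _ in range(n_parts):
        list_len += 1      # the new part is appended to partKeyInfoList
        total += list_len  # the whole list is written again with this put
    return total


def entries_closed_form(n_parts: int) -> int:
    """Same count via the arithmetic series (1 + n) * n / 2."""
    return (1 + n_parts) * n_parts // 2


def total_size_mb(n_parts: int, key_info_kb: float = KEY_INFO_KB) -> float:
    """Estimated total bytes written, in MB (assuming 1 MB = 1024 KB)."""
    return entries_closed_form(n_parts) * key_info_kb / 1024


if __name__ == "__main__":
    for n in (1000, 10_000):
        mb = total_size_mb(n)
        print(f"{n} parts: {entries_closed_form(n):,} entries, "
              f"{mb:,.2f} MB ({mb / 1024:.2f} GB)")
```

Under these assumptions, a 10x increase in part count gives roughly a 100x increase in estimated size, which matches the O(n^2) growth described above.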
> Multipart upload generates a huge OM rocksdb log
> ------------------------------------------------
>
> Key: HDDS-8238
> URL: https://issues.apache.org/jira/browse/HDDS-8238
> Project: Apache Ozone
> Issue Type: Improvement
> Components: OM
> Reporter: Tsz-wo Sze
> Priority: Major