Vyacheslav Tutrinov created HDDS-11636:
------------------------------------------
Summary: [S3G] Cache Ozone Key Details
Key: HDDS-11636
URL: https://issues.apache.org/jira/browse/HDDS-11636
Project: Apache Ozone
Issue Type: Improvement
Components: s3gateway
Affects Versions: 2.0.0
Reporter: Vyacheslav Tutrinov
Assignee: Vyacheslav Tutrinov
## Preamble
Two experiments were run:
* a 1G key was uploaded to the bucket as a multipart upload (with the default
client settings of {{aws s3}}) and then read back
* the same was done for a 100G file
In the first experiment, I counted how many times the OM.getKeyInfo method was
executed while reading the key; in the second, I measured the time taken to
download the key.
## Problem
The AWS S3 client starts a download with a HEAD request and then sends multiple
GET requests, each with a byte range in the request header, to download the
full body of the file.
For each of these GET requests the same OM.getKeyInfo request is sent to OM, so
the same request body and the same response body are serialized/deserialized N
times:
* 129 times in the first test case
* in the second test case the count wasn't measured, but the download time
dropped roughly 3.2 times (from 11min20sec to 3min30sec) when I cached the
first result of OM.getKeyInfo and reused it instead of requesting the same
information from the OM again
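One plausible accounting for the 129 lookups, as a rough sketch rather than a measured breakdown: assuming the {{aws s3}} client downloads in 8 MiB ranged GETs (its documented default {{multipart_chunksize}}), that "1G" means 1 GiB, and that the initial HEAD also triggers one lookup, the arithmetic gives 128 + 1 = 129. The class and method names below are hypothetical, and the 100G figure is only an estimate under the same assumptions, not a measurement.

```java
public class RangedGetCount {

  /** Number of ranged GETs needed to cover an object (ceiling division). */
  static long rangedGets(long objectBytes, long chunkBytes) {
    return (objectBytes + chunkBytes - 1) / chunkBytes;
  }

  /** Assumption: one lookup for the HEAD plus one per ranged GET. */
  static long keyInfoLookups(long objectBytes, long chunkBytes) {
    return 1 + rangedGets(objectBytes, chunkBytes);
  }

  public static void main(String[] args) {
    long mib = 1024L * 1024L;
    long gib = 1024L * mib;
    long chunk = 8 * mib; // assumed client default chunk size

    System.out.println(keyInfoLookups(1 * gib, chunk));   // prints 129
    System.out.println(keyInfoLookups(100 * gib, chunk)); // prints 12801
  }
}
```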
## Enhancement proposal
It seems to make sense to cache the OM key info when reading MPU keys, with an
idle timeout so that the cache entry is cleaned up automatically once the MPU
key download has finished.
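To illustrate the idea, here is a minimal sketch of an idle-expiring cache in plain Java. It is not the actual S3 Gateway or OM client code: {{IdleExpiringCache}}, the string stand-in for key info, and the injected clock are all hypothetical names chosen for the example, and a real change would plug in the real key-details type and lookup call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class KeyInfoCacheSketch {

  /** Hypothetical cache whose entries expire after a period without reads. */
  static final class IdleExpiringCache<K, V> {
    private static final class Entry<V> {
      final V value;
      volatile long lastAccess;
      Entry(V value, long now) { this.value = value; this.lastAccess = now; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long idleTimeout;
    private final LongSupplier clock; // injected so expiry can be tested

    IdleExpiringCache(long idleTimeout, LongSupplier clock) {
      this.idleTimeout = idleTimeout;
      this.clock = clock;
    }

    /** Returns the cached value, calling the loader only on a miss or after expiry. */
    V get(K key, Function<K, V> loader) {
      long now = clock.getAsLong();
      return entries.compute(key, (k, old) -> {
        if (old != null && now - old.lastAccess < idleTimeout) {
          old.lastAccess = now; // each read extends the entry's lifetime
          return old;
        }
        return new Entry<>(loader.apply(k), now);
      }).value;
    }

    /** Drops entries that have been idle too long; meant to run periodically. */
    void evictIdle() {
      long now = clock.getAsLong();
      entries.values().removeIf(e -> now - e.lastAccess >= idleTimeout);
    }

    int size() { return entries.size(); }
  }

  public static void main(String[] args) {
    AtomicInteger omCalls = new AtomicInteger(); // counts simulated OM lookups
    long[] now = {0L};                           // fake clock, in arbitrary ticks
    IdleExpiringCache<String, String> cache =
        new IdleExpiringCache<>(10, () -> now[0]);
    Function<String, String> lookup = k -> {
      omCalls.incrementAndGet();
      return "keyInfo(" + k + ")";
    };

    // One HEAD plus 128 ranged GETs for the same key, one tick apart.
    for (int i = 0; i < 129; i++) {
      now[0]++;
      cache.get("vol/bucket/key", lookup);
    }
    System.out.println("OM lookups: " + omCalls.get()); // prints "OM lookups: 1"
  }
}
```

One thing this sketch does not address is staleness: if the key is overwritten or deleted while an entry is still cached, readers would see old key info until the entry expires, so the idle timeout should probably be short and configurable.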