a2l007 opened a new pull request #11899:
URL: https://github.com/apache/druid/pull/11899


   When fetching a segment from S3, Druid currently creates an object summary 
(a LIST operation) before proceeding to GET the object, so the number of LIST 
ops is proportional to the number of segments. Since LIST ops are more 
expensive than GETs, it is desirable to reduce the number of LIST ops, 
especially if the LIST limit is much smaller than the GET limit.
   
   This PR creates the object summary lazily. The summary isn't really 
required for pulling segments: the bucket and prefix can be retrieved from the 
URI, and the check that the object is present in the bucket is already done 
before attempting to pull the segment. This reduces the LIST operations to 
zero while pulling segments.
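
The idea can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `LazySummarySketch`, `Location`, `fromUri`, and `memoize` are hypothetical names, and the "summary" is a plain string standing in for the result of a LIST call. The sketch only shows that the bucket and object path can be read from an `s3://` URI without touching S3, and that a memoized supplier defers the expensive lookup until something actually asks for it.

```java
import java.net.URI;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazySummarySketch {
    /** Hypothetical stand-in for an S3 object location; not a Druid or AWS SDK type. */
    static final class Location {
        final String bucket;
        final String key;

        Location(String bucket, String key) {
            this.bucket = bucket;
            this.key = key;
        }
    }

    /** Derives bucket and key from an s3://bucket/key URI without calling S3. */
    static Location fromUri(URI uri) {
        String path = uri.getPath();
        // Drop the leading slash so the key matches the object name in the bucket.
        String key = path.startsWith("/") ? path.substring(1) : path;
        return new Location(uri.getHost(), key);
    }

    /** Wraps an expensive lookup so it runs at most once, and only on first use. */
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean loaded;

            @Override
            public synchronized T get() {
                if (!loaded) {
                    value = delegate.get();
                    loaded = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger listCalls = new AtomicInteger();
        URI uri = URI.create("s3://example-bucket/druid/segments/index.zip");

        // Assumption: the lambda stands in for the LIST request that builds the summary.
        Supplier<String> summary = memoize(() -> {
            listCalls.incrementAndGet();
            return "summary for " + uri;
        });

        // Pulling the segment only needs the location, so the summary is never built.
        Location loc = fromUri(uri);
        System.out.println(loc.bucket + " " + loc.key + " listCalls=" + listCalls.get());
    }
}
```

In this sketch the counter stays at zero as long as nothing calls `summary.get()`, which mirrors the segment-pull path described above; a caller that does need the summary pays for the lookup once.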
   
   <hr>
   
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for 
[code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) 
is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


