TSFenwick commented on PR #14776:
URL: https://github.com/apache/druid/pull/14776#issuecomment-1671923077

   > I'm not sure there is any value in filtering the list to delete; if a file has already been deleted and a delete is again issued, S3 will just ignore the request. ie: there's no harm in asking for a file to be deleted twice. Unless we are sure that all errors are not retryable, it should suffice to send a retry.
   
   @jasonk000 The value is small; I was over-engineering it. My thinking was that if you are deleting 100,000 segments, there is no point retrying every batch delete when only one file in each batch fails for a retryable reason; you would get a more performant call by retrying only the files you know failed for a retryable reason. The odds of this happening are low, though, and the code complexity might not be worth it.
   
   I think your solution is simple and efficient.
   
   > My thought is to just retry up to 3 times regardless of failure. What do you think?
   
   I think this is a bad idea. We shouldn't retry in all cases, since we can (and should) easily know what's retryable and what's not. Unnecessary network calls should be avoided when avoiding them is easy, and here it only requires iterating over a list of at most 1000 deleteErrors.
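   A minimal sketch of that filtering step, assuming a stand-in `DeleteError` type (the real one would be the SDK's `MultiObjectDeleteException.DeleteError`) and an illustrative, non-exhaustive set of retryable error codes:

   ```java
   import java.util.List;
   import java.util.Set;
   import java.util.stream.Collectors;

   public class RetryableDeleteFilter
   {
     // Hypothetical stand-in for the SDK's per-key delete error: an S3 key plus an error code.
     public static class DeleteError
     {
       final String key;
       final String code;

       public DeleteError(String key, String code)
       {
         this.key = key;
         this.code = code;
       }
     }

     // Codes treated as transient for this sketch; anything else (e.g. AccessDenied) is not retried.
     private static final Set<String> RETRYABLE_CODES = Set.of("InternalError", "SlowDown", "RequestTimeout");

     // Keep only the keys whose failures look transient, so the retried batch
     // contains just the files that could plausibly succeed on a second attempt.
     public static List<String> keysToRetry(List<DeleteError> errors)
     {
       return errors.stream()
                    .filter(e -> RETRYABLE_CODES.contains(e.code))
                    .map(e -> e.key)
                    .collect(Collectors.toList());
     }
   }
   ```

   Since a single DeleteObjects call reports at most 1000 errors, this scan is cheap compared with re-sending a whole batch over the network.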


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
