TSFenwick commented on PR #14776: URL: https://github.com/apache/druid/pull/14776#issuecomment-1671923077
> I'm not sure there is any value in filtering the list to delete; if a file has already been deleted and a delete is again issued, S3 will just ignore the request. i.e., there's no harm in asking for a file to be deleted twice. Unless we are sure that all errors are not retryable, it should suffice to send a retry.

@jasonk000 The value is small; I was over-engineering it. My thinking was that if you are deleting 100,000 segments, there is no point in retrying a whole batch delete when only one file in each batch fails for a retryable reason; you would get a more performant call by issuing the delete only for the files you know failed for a retryable reason. The odds of this happening are low, and the code complexity might not be worth it. I think your solution is a simple and efficient one.

> My thought is to just retry up to 3 times regardless of failure. What do you think?

I think this is a bad idea. We shouldn't retry in all cases, since we can and should easily know what is retryable and what is not. Unnecessary network calls should be avoided when avoiding them is easy, and here it would just mean iterating over a list of at most 1000 DeleteErrors.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
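The selective retry discussed in the comment can be sketched as a filter over the errors returned by a batch delete. This is a minimal, hypothetical illustration rather than Druid's actual code: the `DeleteError` record here stands in for the AWS SDK's `MultiObjectDeleteException.DeleteError` (which exposes the failed key and an S3 error code), and the set of retryable codes is an assumption; a real implementation would defer to the SDK's own retry classification.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RetryableDeleteFilter
{
  // Hypothetical stand-in for the AWS SDK's MultiObjectDeleteException.DeleteError,
  // which carries the key that failed to delete and the S3 error code.
  public record DeleteError(String key, String code) {}

  // Assumed set of transient S3 error codes; a real implementation would consult
  // the SDK's retry policy rather than hard-coding these.
  private static final Set<String> RETRYABLE_CODES =
      Set.of("InternalError", "SlowDown", "RequestTimeout");

  // Keep only the keys whose delete failed for a retryable reason, so the
  // follow-up batch delete targets just those keys instead of the whole batch.
  public static List<String> keysToRetry(List<DeleteError> errors)
  {
    return errors.stream()
                 .filter(e -> RETRYABLE_CODES.contains(e.code()))
                 .map(DeleteError::key)
                 .collect(Collectors.toList());
  }
}
```

The filter is at most one pass over the (up to 1000) errors in a batch response, which is the cheap local work the comment weighs against extra network round-trips.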
