sririshindra commented on PR #7127: URL: https://github.com/apache/iceberg/pull/7127#issuecomment-1492784798
> > One of the common causes of delete failure in the public cloud is hitting the API quotas and unnecessary re-runs of the delete action
>
> Not really. What page size are you using when issuing delete requests to S3? Each object, even in a bulk delete, is one write IOP; if you send a full 1000-entry list, then under certain conditions the sleep-and-retry of that request is its own thundering herd. Since [HADOOP-16823](https://issues.apache.org/jira/browse/HADOOP-16823) we've had a default page size of 200 and nobody complains about 503-triggered deletion failures, even when partition rebalancing operations massively reduce IO capacity. If you are seeing problems in the S3A connector, then complain. If you are seeing it in your own code, why not fix it there rather than expose the failure to apps?

Hi Steve, I am partially summarizing our offline conversation here. The statement that "causes of delete failure in the public cloud is hitting the API quotas" may be wrong, but I think this PR still adds value. When the remove orphan files procedure is called, it currently displays the list of files that were supposed to be deleted, but it gives no information about whether those files were actually deleted. This PR simply displays the status of each delete alongside the file itself; if a delete fails for some reason, the exception message is displayed as well.
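To make that concrete, here is a minimal sketch of the kind of per-file bookkeeping described above, assuming Iceberg's `FileIO#deleteFile`. The `OrphanFileResult` shape and the status strings are hypothetical illustrations, not the PR's actual output columns:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.iceberg.io.FileIO;

// Hypothetical result row: file location, delete status, and any error message.
record OrphanFileResult(String location, String status, String error) {}

class OrphanFileDeleter {
  // Attempt to delete each candidate orphan file and record the outcome,
  // rather than only listing the candidates that were supposed to be deleted.
  static List<OrphanFileResult> deleteAll(FileIO io, List<String> candidates) {
    List<OrphanFileResult> results = new ArrayList<>();
    for (String location : candidates) {
      try {
        io.deleteFile(location);
        results.add(new OrphanFileResult(location, "DELETED", null));
      } catch (RuntimeException e) {
        // Surface the failure and its exception message instead of dropping it.
        results.add(new OrphanFileResult(location, "FAILED", e.getMessage()));
      }
    }
    return results;
  }
}
```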
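For reference on the page-size point in the quote above: the S3A bulk delete page size is controlled by the `fs.s3a.bulk.delete.page.size` property. A minimal sketch of tuning it through Hadoop's `Configuration` API, with the value mirroring the default cited in the quote:

```java
import org.apache.hadoop.conf.Configuration;

public class S3ADeletePageSize {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Caps how many keys go into a single S3 DeleteObjects request;
    // smaller pages mean a throttled request retries less work at once.
    conf.setInt("fs.s3a.bulk.delete.page.size", 200);
  }
}
```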
