steveloughran commented on PR #6596: URL: https://github.com/apache/hadoop/pull/6596#issuecomment-1980987839
When `mapreduce.manifest.committer.validate.output` is set, I am going to add a pre-rename validation phase where we scan the task attempt directory tree to make sure everything is there, with the expected etag. This is simply a repetition of the tree walk.

I'm going to add rate limiting for all IO within the committer. Currently there's a rate limiter for the committer rename (in the FS instance, so shared across all threads) but nothing else.

My plan is to add a new interface `RateLimiterSource` in hadoop-common, where you can ask for a limiter through `getRateLimiter(String path, String operation)`, with well-known operation names. ABFS will provide this API and just return the existing rename limiter everywhere. In S3A I'll add separate limiters for read and write operations, which can be configured for the 5000 and 3500 limits under a prefix.

The WiP bulk delete API #6494 will use this for its rate limiting.
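As a rough illustration of the `RateLimiterSource` idea in the comment above, here is a minimal Java sketch. Only the interface name and the `getRateLimiter(String, String)` shape come from the comment. The `CountingLimiter` type, the operation-name constants, and the two implementing classes are hypothetical and are not Hadoop APIs. The limiter only counts requests and never blocks, so the sketch stays deterministic.

```java
/** Hypothetical limiter: records requested capacity; a real one would block. */
final class CountingLimiter {
  private final int permitsPerSecond;
  private long acquired;

  CountingLimiter(int permitsPerSecond) {
    this.permitsPerSecond = permitsPerSecond;
  }

  /** A real limiter would wait here until the capacity is available. */
  synchronized void acquire(int capacity) {
    acquired += capacity;
  }

  int permitsPerSecond() {
    return permitsPerSecond;
  }

  synchronized long acquired() {
    return acquired;
  }
}

/** Interface shape from the comment; the return type is an assumption. */
interface RateLimiterSource {
  // Assumed operation names; the comment only says "well-known".
  String OP_READ = "read";
  String OP_WRITE = "write";
  String OP_RENAME = "rename";

  CountingLimiter getRateLimiter(String path, String operation);
}

/** ABFS-style source: the same limiter for every path and operation. */
final class SingleLimiterSource implements RateLimiterSource {
  private final CountingLimiter shared;

  SingleLimiterSource(CountingLimiter shared) {
    this.shared = shared;
  }

  @Override
  public CountingLimiter getRateLimiter(String path, String operation) {
    return shared;
  }
}

/** S3A-style source: one limiter for reads, one for everything else. */
final class ReadWriteLimiterSource implements RateLimiterSource {
  private final CountingLimiter read;
  private final CountingLimiter write;

  ReadWriteLimiterSource(int readsPerSecond, int writesPerSecond) {
    this.read = new CountingLimiter(readsPerSecond);
    this.write = new CountingLimiter(writesPerSecond);
  }

  @Override
  public CountingLimiter getRateLimiter(String path, String operation) {
    // The path is ignored in this sketch; a per-prefix design would key on it.
    return OP_READ.equals(operation) ? read : write;
  }
}
```

A caller such as the committer would ask the source for a limiter by operation name and call `acquire` before each IO call, so the store-specific policy stays behind the interface.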
