steveloughran commented on PR #2584: URL: https://github.com/apache/hadoop/pull/2584#issuecomment-1090308409
I really need reviews of this: @mukund-thakur @mehakmeet @bibinchundatt @dannycjones @surendralilhore

This patch needs to go in before any other input stream optimisations so that

1. we can cut the HEAD request overhead on small files
2. distcp and fsshell can tell the streams that they are reading the whole file, so the streams should do big reads and expect no backwards seeks
3. parquet and orc libs can switch to this to get

Although #2975 sets it up, this PR doesn't include abfs support for the file length option as an alternative to the file status. I've looked at it, but I need a plan for etag tracking: we will have to replicate the part of the s3a code where the first GET's etag is picked up and used from then on. That is a future piece of work. This PR does contain the tests that are needed there, though.
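
To make the first two points concrete, here is a toy model of the idea. It is not Hadoop code: `FakeStore`, `StreamPlan` and `open()` are hypothetical names invented for this sketch, and the two option key strings are what I understand the PR to define, so treat the exact spelling as an assumption. The sketch only shows that a caller who declares the file length lets the store skip the HEAD probe, and that a caller who declares a whole-file read policy lets the stream plan for large sequential reads.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types exist in Hadoop. */
public class OpenFileSketch {

    // Assumed option keys; the exact strings are an assumption, not confirmed here.
    static final String OPT_LENGTH = "fs.option.openfile.length";
    static final String OPT_READ_POLICY = "fs.option.openfile.read.policy";

    /** Hypothetical stand-in for an object store client that counts HEAD calls. */
    static final class FakeStore {
        private final Map<String, byte[]> objects = new HashMap<>();
        int headRequests = 0;

        void put(String key, byte[] data) {
            objects.put(key, data);
        }

        /** Simulates a remote HEAD request returning the object length. */
        long head(String key) {
            headRequests++;
            return objects.get(key).length;
        }
    }

    /** What open() decided about the stream it would create. */
    static final class StreamPlan {
        final long length;
        final boolean sequentialBigReads;

        StreamPlan(long length, boolean sequentialBigReads) {
            this.length = length;
            this.sequentialBigReads = sequentialBigReads;
        }
    }

    /** Uses the declared length if present; otherwise falls back to a HEAD probe. */
    static StreamPlan open(FakeStore store, String key, Map<String, String> options) {
        String declared = options.get(OPT_LENGTH);
        long length = declared != null ? Long.parseLong(declared) : store.head(key);
        boolean wholeFile = "whole-file".equals(options.get(OPT_READ_POLICY));
        return new StreamPlan(length, wholeFile);
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.put("data/small.txt", "hello world".getBytes());

        // No options: the store has to probe for the length.
        open(store, "data/small.txt", new HashMap<>());
        System.out.println("HEAD requests after plain open: " + store.headRequests);

        // Length and read policy declared up front: no extra probe is needed.
        Map<String, String> opts = new HashMap<>();
        opts.put(OPT_LENGTH, "11");
        opts.put(OPT_READ_POLICY, "whole-file");
        StreamPlan plan = open(store, "data/small.txt", opts);
        System.out.println("HEAD requests after declared open: " + store.headRequests);
        System.out.println("sequential big reads: " + plan.sequentialBigReads);
    }
}
```

In this model the HEAD counter stays at 1 after the second open, which is the saving on small files described in point 1. A real stream would also have to cope with a declared length that turns out to be wrong, which the sketch ignores.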
