steveloughran commented on PR #2584:
URL: https://github.com/apache/hadoop/pull/2584#issuecomment-1090308409

   I really need reviews of this: @mukund-thakur @mehakmeet @bibinchundatt 
@dannycjones @surendralilhore
   
   This patch needs to go in before any other input stream optimisations so 
that 
   1. we can cut that HEAD request overhead on small files
   2. distcp and fsshell can tell the streams that they are reading the whole 
file, so the streams should do big reads and expect no backwards seeks.
   3. parquet and orc libs can switch to this to get 
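
   A minimal sketch of the idea behind points 1 and 2, not code from this PR: 
if the caller declares the file length when opening, the stream can skip the 
probe (the HEAD request) it would otherwise need, and a declared read policy 
tells it whether to expect a whole-file read. The class and method names are 
hypothetical, and the option keys are assumed to match the ones this PR 
defines.

```java
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper; nothing here is taken from the Hadoop codebase. */
final class OpenFileSketch {
    // Assumed option keys, used only for illustration.
    static final String OPT_LENGTH = "fs.option.openfile.length";
    static final String OPT_READ_POLICY = "fs.option.openfile.read.policy";

    /**
     * Use the caller-declared length when it is a valid non-negative number;
     * otherwise fall back to the probe, which stands in for a HEAD request.
     */
    static long resolveLength(Map<String, String> options, LongSupplier headProbe) {
        String declared = options.get(OPT_LENGTH);
        if (declared != null) {
            try {
                long len = Long.parseLong(declared.trim());
                if (len >= 0) {
                    return len; // no probe issued
                }
            } catch (NumberFormatException ignored) {
                // Unparseable value: fall through to the probe.
            }
        }
        return headProbe.getAsLong();
    }

    /** True when the caller said it will read the whole file sequentially. */
    static boolean isWholeFileRead(Map<String, String> options) {
        return "whole-file".equals(options.get(OPT_READ_POLICY));
    }
}
```

   A caller such as a copy tool would pass both options; a stream that sees 
`isWholeFileRead` return true could then choose large reads and not plan for 
backwards seeks.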
   
   Although #2975 sets it up, this PR doesn't include abfs support for the file 
length option as an alternative to the file status.
   
   I've looked at it but need a plan for etag tracking. We will have to 
replicate the bit in the s3a code where the first GET's etag is picked up and 
used from then on. That is a future piece of work. This PR does contain the 
tests that are needed there, though...
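
   To make the etag idea concrete, here is a rough sketch of "pick up the 
first GET's etag and use it from then on". It is an illustration only, not the 
s3a code and not a proposed abfs design; `EtagTracker` and its methods are 
hypothetical names.

```java
import java.io.IOException;
import java.util.Objects;

/** Hypothetical tracker: remembers the first etag seen and checks later ones. */
final class EtagTracker {
    private String expected; // null until the first response arrives

    /**
     * Call with the etag of every GET response for the same object.
     * The first call records the etag; later calls fail on a mismatch,
     * which would mean the object changed while the stream was open.
     */
    synchronized void onResponse(String etag) throws IOException {
        Objects.requireNonNull(etag, "etag");
        if (expected == null) {
            expected = etag;
            return;
        }
        if (!expected.equals(etag)) {
            throw new IOException(
                "etag changed during read: expected " + expected + " but got " + etag);
        }
    }

    /** The etag recorded from the first response, or null if none yet. */
    synchronized String expectedEtag() {
        return expected;
    }
}
```

   A real implementation could also send the recorded etag as a precondition 
on later requests so the store rejects the read itself; this sketch only 
checks after the fact.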
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


