Re: [PR] HADOOP-19199 [hadoop]

via GitHub Tue, 10 Dec 2024 06:38:33 -0800


steveloughran commented on PR #6877:
URL: https://github.com/apache/hadoop/pull/6877#issuecomment-2531814999


   FYI: parquet trunk now calls openFile() with a file status and the declared 
read policy "parquet, vector, random", so every hadoop release >= 3.3.0 will at 
least use random S3 IO; 3.4.0/3.4.1 use vector IO, and 3.4.2 may use 
parquet-specific code paths.
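   
   A minimal sketch of why one declared policy list degrades gracefully across 
releases, assuming each client picks the first entry in the list that it 
recognizes. The class name, the `choose` helper and the per-release sets of 
recognized policies are hypothetical and only mirror the behaviour described 
above; this is not the S3A implementation.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.Set;

public class ReadPolicyFallback {

    /**
     * Return the first policy in the comma-separated list that the
     * client recognizes, or empty if it recognizes none of them.
     */
    static Optional<String> choose(String declared, Set<String> recognized) {
        return Arrays.stream(declared.split(","))
                .map(String::trim)
                .filter(recognized::contains)
                .findFirst();
    }

    public static void main(String[] args) {
        // The policy list quoted in the comment above.
        String declared = "parquet, vector, random";

        // Hypothetical sets of policies each client understands; they are
        // assumptions chosen to match the releases named in the comment.
        Set<String> olderClient = Set.of("random", "sequential");
        Set<String> vectorClient = Set.of("vector", "random", "sequential");
        Set<String> parquetClient = Set.of("parquet", "vector", "random", "sequential");

        System.out.println(choose(declared, olderClient).orElse("default"));
        System.out.println(choose(declared, vectorClient).orElse("default"));
        System.out.println(choose(declared, parquetClient).orElse("default"));
    }
}
```

   Under these assumptions the same list resolves to random, vector and 
parquet respectively, so the caller does not need to know which release it is 
running against.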
   
   This will come in parquet 15.1, leaving Avro and ORC as the next targets.
   
   Please grab and test that parquet beta release to make sure it does what you 
expect; with both S3 and Azure it should save a HEAD request per file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


