parthchandra commented on PR #1139: URL: https://github.com/apache/parquet-mr/pull/1139#issuecomment-1766778525
@ahmarsuhail No, these numbers are not from Iceberg with S3FileIO. I used a stripped-down version of ParquetFileReader (lots of stuff removed) and a custom benchmark program that reads all the row groups in parallel and records the time spent in each read from S3. The modified ParquetFileReader can switch between the various methods of reading from S3. The `AWS SDK V2` entry is a near copy of the Iceberg S3FileIO code, though. I saw issues with the CRT client that caused JVM crashes when running at scale. And the V2 transfer manager did not do range reads properly. Do share your experience.
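
For illustration only, a benchmark of the kind described above (read every row group in parallel, record the time spent in each read) could be sketched roughly as follows. This is not the actual benchmark program or the modified ParquetFileReader. `ParallelReadTimer`, `RangeReader`, and `Range` are hypothetical names; `RangeReader` stands in for whichever S3 read method is being measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReadTimer {

    /** Hypothetical abstraction over one way of reading a byte range from S3. */
    public interface RangeReader {
        byte[] read(long offset, int length) throws Exception;
    }

    /** Hypothetical holder for the byte range of one row group. */
    public static final class Range {
        final long offset;
        final int length;

        public Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Submits one read per range to a fixed-size thread pool and returns the
     * elapsed time of each read in nanoseconds, in the same order as the ranges.
     */
    public static List<Long> timeReads(RangeReader reader, List<Range> ranges, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Range r : ranges) {
                futures.add(pool.submit(() -> {
                    long start = System.nanoTime();
                    reader.read(r.offset, r.length); // the read being measured
                    return System.nanoTime() - start;
                }));
            }
            List<Long> nanos = new ArrayList<>();
            for (Future<Long> f : futures) {
                nanos.add(f.get()); // rethrows any read failure as ExecutionException
            }
            return nanos;
        } finally {
            pool.shutdown();
        }
    }
}
```

Swapping the `RangeReader` implementation is what would let the same harness compare different clients, assuming each one can be wrapped as an offset-and-length read.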