lidavidm commented on pull request #7098: URL: https://github.com/apache/arrow/pull/7098#issuecomment-623645720
Ok, I ran the benchmarks against S3 several times, but performance is wildly inconsistent.

Before:

```
----------------------------------------------------------------------------------
Benchmark                                       Time             CPU    Iterations
----------------------------------------------------------------------------------
MinioFixture/ReadAll500Mib/real_time         5850095808 ns   1881569619 ns   7   85.4687MB/s   0.170937 items/s
MinioFixture/ReadChunked500Mib/real_time     7583846744 ns   1568950938 ns   6   65.9296MB/s   0.131859 items/s
MinioFixture/ReadCoalesced500Mib/real_time   5935405783 ns       592848 ns   7   84.2402MB/s    0.16848 items/s
```

After:

```
----------------------------------------------------------------------------------
Benchmark                                       Time             CPU    Iterations
----------------------------------------------------------------------------------
MinioFixture/ReadAll500Mib/real_time        10612223830 ns   2214309641 ns   6   47.1155MB/s   0.094231 items/s
MinioFixture/ReadChunked500Mib/real_time    17048801064 ns   3879733068 ns   2   29.3276MB/s  0.0586552 items/s
MinioFixture/ReadCoalesced500Mib/real_time  17039251080 ns       655276 ns   2    29.344MB/s   0.058688 items/s

----------------------------------------------------------------------------------
Benchmark                                       Time             CPU    Iterations
----------------------------------------------------------------------------------
MinioFixture/ReadAll500Mib/real_time         5867569374 ns   1152395630 ns   4   85.2142MB/s   0.170428 items/s
MinioFixture/ReadChunked500Mib/real_time     6496429473 ns   1172657713 ns   3   76.9654MB/s   0.153931 items/s
MinioFixture/ReadCoalesced500Mib/real_time   4892376030 ns       575236 ns   4     102.2MB/s     0.2044 items/s
```

I think S3 performance is too variable/not high enough for this optimization to be noticeable, at least in this context :/
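To put a rough number on "wildly inconsistent", here is a minimal sketch that compares the run-to-run spread of the two "After" runs with the before/after change. The throughput figures are copied from the benchmark output in this comment; the helper names (`spread`, `summarize`) and the choice of `(max - min) / min` as a spread measure are illustrative assumptions, not part of the PR or of the benchmark tooling.

```python
# Throughput in MB/s, copied from the benchmark output in this comment.
before = {"ReadAll": 85.4687, "ReadChunked": 65.9296, "ReadCoalesced": 84.2402}
after_runs = [
    {"ReadAll": 47.1155, "ReadChunked": 29.3276, "ReadCoalesced": 29.344},
    {"ReadAll": 85.2142, "ReadChunked": 76.9654, "ReadCoalesced": 102.2},
]


def spread(values):
    """Hypothetical helper: (max - min) / min, a crude run-to-run variation measure."""
    return (max(values) - min(values)) / min(values)


def summarize(before, after_runs):
    """Hypothetical helper: per-benchmark spread of the after runs and change vs. before."""
    rows = {}
    for name, base in before.items():
        samples = [run[name] for run in after_runs]
        rows[name] = {
            "after_spread": spread(samples),          # variation between after runs
            "best_change": max(samples) / base - 1.0,  # best after run relative to before
            "worst_change": min(samples) / base - 1.0, # worst after run relative to before
        }
    return rows


summary = summarize(before, after_runs)
for name, row in summary.items():
    print(
        f"{name}: after-run spread {row['after_spread']:+.0%}, "
        f"change vs before between {row['worst_change']:+.0%} "
        f"and {row['best_change']:+.0%}"
    )
```

With only two "After" samples this is not a statistical test. It only illustrates that, for every benchmark here, the difference between the two "After" runs is larger than the best-case improvement over "Before", so the effect of the optimization cannot be separated from the noise.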