westonpace opened a new pull request #12150:
URL: https://github.com/apache/arrow/pull/12150


   
    * The benchmark named ReadFile is misleading since it is actually reading 
from an in-memory buffer and no OS "read" call is ever issued.
    * Renamed ReadTempFile to ReadCachedFile and added a second case, 
ReadUncachedFile. The former reads a file that is in the OS page cache; the 
latter forces the read to actually hit the disk.
    * The TempFile benchmarks were not actually writing the correct amount of 
data and were reporting unrealistically high rates as a result.
    * Added a "partial read" parameter which, when true, reads only 1/8 of the 
columns in the file so we can see the impact of projection pushdown.
    * Slightly reduced the range of parameters to keep the benchmark time 
reasonable (8k columns wasn't telling us anything more than 4k columns).
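
For readers unfamiliar with the cached/uncached distinction, here is a minimal sketch of one way to compare the two cases. It is not the code in this PR, which is a C++ benchmark and does not say here how the uncached read is forced. The sketch assumes Linux and uses `posix_fadvise` with `POSIX_FADV_DONTNEED`, which is only advisory. The helper names `evict_from_page_cache`, `read_all` and `demo` are hypothetical.

```python
import os
import tempfile


def evict_from_page_cache(path):
    """Advise the kernel to drop cached pages for `path`.

    Assumption: Linux-style posix_fadvise. Returns False where the call is
    unavailable or fails. The advice is a hint; pages may stay cached.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    try:
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)  # dirty pages cannot be dropped, so flush them first
            # offset 0, length 0 means "from the start to the end of the file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        finally:
            os.close(fd)
    except OSError:
        return False
    return True


def read_all(path, chunk_size=1 << 20):
    """Read the whole file through unbuffered OS reads; return the byte count."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def demo(size=2_000_000):
    """Write a temp file, read it while (likely) cached, evict, read again."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        cached_bytes = read_all(path)      # freshly written, likely cached
        evicted = evict_from_page_cache(path)
        uncached_bytes = read_all(path)    # should go to disk if evicted
    return cached_bytes, evicted, uncached_bytes
```

Timing the two `read_all` calls separately would show the difference between the cases. Both reads return the same byte count; only the source of the data (page cache or disk) differs.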
   
   NOTE: This PR will invalidate some previous results from 
arrow-ipc-read-write-benchmark, disrupting conbench & other monitoring efforts. 
 This is because those previous results were wrong.
   
   It also likely invalidates even more arrow-ipc-read-write-benchmark results 
because we added a new parameter and renamed some of the benchmarks.
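
To illustrate what the new "partial read" parameter changes, here is a small sketch of the column selection. Which columns the benchmark actually picks is not stated here, so taking the first 1/8 is an assumption, and `partial_read_columns` and `bytes_requested` are hypothetical names.

```python
def partial_read_columns(num_columns, denominator=8):
    """Indices a partial read would request.

    Assumption: the first num_columns / denominator columns, at least one.
    """
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    return list(range(max(1, num_columns // denominator)))


def bytes_requested(column_sizes, selected):
    """Total bytes of the selected columns, given per-column sizes in bytes."""
    return sum(column_sizes[i] for i in selected)
```

With 4096 equally sized columns this selects 512 of them, so a reader that pushes the projection down needs to fetch roughly 1/8 of the column data.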



