westonpace opened a new pull request #10485:
URL: https://github.com/apache/arrow/pull/10485
Here's a first stab at something. Current benchmark numbers:
Cold I/O
```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark  Time  CPU  Iterations  UserCounters...
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ColdReadFromInputStreamViaIterator/nbytes:67108864/nread:16/iterations:3/real_time  1624697050 ns  1592154889 ns  3  bytes_per_second=39.392M/s
ColdReadFromInputStreamViaIterator/nbytes:67108864/nread:1024/iterations:3/real_time  479957725 ns  187154245 ns  3  bytes_per_second=133.345M/s
ColdReadFromInputStreamViaIterator/nbytes:67108864/nread:16384/iterations:3/real_time  446677330 ns  146834789 ns  3  bytes_per_second=143.28M/s
ColdReadFromInputStreamViaIterator/nbytes:67108864/nread:1048576/iterations:3/real_time  433891566 ns  87376892 ns  3  bytes_per_second=147.502M/s
ColdReadFromReadableFileViaIterator/nbytes:67108864/nread:16/iterations:3/real_time  1608131394 ns  1577053815 ns  3  bytes_per_second=39.7977M/s
ColdReadFromReadableFileViaIterator/nbytes:67108864/nread:1024/iterations:3/real_time  468960464 ns  225496108 ns  3  bytes_per_second=136.472M/s
ColdReadFromReadableFileViaIterator/nbytes:67108864/nread:16384/iterations:3/real_time  481367465 ns  146945451 ns  3  bytes_per_second=132.955M/s
ColdReadFromReadableFileViaIterator/nbytes:67108864/nread:1048576/iterations:3/real_time  452987869 ns  101980029 ns  3  bytes_per_second=141.284M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16/blksz:16384/iterations:3/real_time  810107744 ns  715216286 ns  3  bytes_per_second=79.0018M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1024/blksz:16384/iterations:3/real_time  464428224 ns  184081279 ns  3  bytes_per_second=137.804M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16384/blksz:16384/iterations:3/real_time  476477704 ns  130752605 ns  3  bytes_per_second=134.319M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1048576/blksz:16384/iterations:3/real_time  492571110 ns  105700699 ns  3  bytes_per_second=129.93M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16/blksz:1048576/iterations:3/real_time  838470375 ns  704115642 ns  3  bytes_per_second=76.3295M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1024/blksz:1048576/iterations:3/real_time  456894401 ns  146226591 ns  3  bytes_per_second=140.076M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16384/blksz:1048576/iterations:3/real_time  428668434 ns  120917238 ns  3  bytes_per_second=149.3M/s
ColdReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1048576/blksz:1048576/iterations:3/real_time  434027576 ns  101352042 ns  3  bytes_per_second=147.456M/s
```
Hot I/O
```
HotReadFromInputStreamViaIterator/nbytes:67108864/nread:16/real_time  1552992560 ns  1552833834 ns  1  bytes_per_second=41.2108M/s
HotReadFromInputStreamViaIterator/nbytes:67108864/nread:1024/real_time  38005116 ns  38004764 ns  18  bytes_per_second=1.64452G/s
HotReadFromInputStreamViaIterator/nbytes:67108864/nread:16384/real_time  18285984 ns  18285286 ns  37  bytes_per_second=3.41792G/s
HotReadFromInputStreamViaIterator/nbytes:67108864/nread:1048576/real_time  15394076 ns  15394409 ns  44  bytes_per_second=4.06G/s
HotReadFromInputStreamViaIterator/nbytes:67108864/nread:4194304/real_time  16003816 ns  16002276 ns  44  bytes_per_second=3.90532G/s
HotReadFromReadableFileViaIterator/nbytes:67108864/nread:16/real_time  1527480000 ns  1527427622 ns  1  bytes_per_second=41.8991M/s
HotReadFromReadableFileViaIterator/nbytes:67108864/nread:1024/real_time  38129805 ns  38129378 ns  18  bytes_per_second=1.63914G/s
HotReadFromReadableFileViaIterator/nbytes:67108864/nread:16384/real_time  18382230 ns  18381108 ns  38  bytes_per_second=3.40002G/s
HotReadFromReadableFileViaIterator/nbytes:67108864/nread:1048576/real_time  15640630 ns  15640457 ns  43  bytes_per_second=3.996G/s
HotReadFromReadableFileViaIterator/nbytes:67108864/nread:4194304/real_time  15744478 ns  15744727 ns  44  bytes_per_second=3.96965G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16/blksz:16384/real_time  661272360 ns  661239849 ns  1  bytes_per_second=96.7831M/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1024/blksz:16384/real_time  27069592 ns  27068959 ns  26  bytes_per_second=2.30886G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16384/blksz:16384/real_time  18922845 ns  18921447 ns  37  bytes_per_second=3.30289G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1048576/blksz:16384/real_time  15559741 ns  15559852 ns  44  bytes_per_second=4.01678G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:4194304/blksz:16384/real_time  15863462 ns  15862226 ns  44  bytes_per_second=3.93987G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16/blksz:1048576/real_time  659122700 ns  659069179 ns  1  bytes_per_second=97.0988M/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1024/blksz:1048576/real_time  26847778 ns  26847537 ns  26  bytes_per_second=2.32794G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:16384/blksz:1048576/real_time  20293484 ns  20293652 ns  34  bytes_per_second=3.07981G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:1048576/blksz:1048576/real_time  15491881 ns  15491085 ns  44  bytes_per_second=4.03437G/s
HotReadFromBufferedInputStreamViaIterator/nbytes:67108864/nread:4194304/blksz:1048576/real_time  15856990 ns  15857431 ns  44  bytes_per_second=3.94148G/s
```
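For anyone cross-checking the `bytes_per_second` column: the reported figures are consistent with dividing `nbytes` by the real time and treating "M" as 2^20 bytes (and "G" as 2^30). A minimal sketch of that arithmetic follows; the helper name `MiBPerSecond` is hypothetical and not part of this PR.

```cpp
#include <cassert>
#include <cstdint>

// Throughput in MiB/s for `nbytes` transferred in `real_time_ns` nanoseconds.
// Hypothetical helper for checking the benchmark output by hand; it assumes
// "M" in the output means 2^20 bytes, which matches the numbers above.
double MiBPerSecond(std::int64_t nbytes, double real_time_ns) {
  assert(real_time_ns > 0.0);
  const double seconds = real_time_ns / 1e9;
  return static_cast<double>(nbytes) / seconds / (1024.0 * 1024.0);
}
```

For example, `MiBPerSecond(67108864, 1624697050.0)` comes out at roughly 39.39, in line with the first cold row, and the hot `nread:1048576` row works out to roughly 4.06 when divided by a further 1024.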
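It seems the buffered input stream does improve some things: at nread:16 it
roughly doubles throughput (about 39 to 79 M/s cold, 41 to 97 M/s hot), and at
nread:1024 hot it goes from about 1.6 to 2.3 G/s. For larger reads it is on par
with the unbuffered readers, so it may have room for improvement. I don't see
any speedup at all from calling fadvise.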
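For context on the fadvise remark, here is a minimal sketch of a chunked sequential read with an access-pattern hint. It is not the code from this PR: the function name `ReadAllSequential`, the choice of `POSIX_FADV_SEQUENTIAL`, and applying the hint to the whole file are all assumptions made for illustration, and it needs a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <vector>

// Hypothetical helper: reads `path` to the end in `chunk_size`-byte reads
// after hinting sequential access. Returns bytes read, or -1 on error.
std::int64_t ReadAllSequential(const std::string& path, std::size_t chunk_size) {
  assert(chunk_size > 0);
  const int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0) return -1;

  // Advisory only: offset 0 with len 0 covers the whole file. The return
  // value is ignored because the hint does not affect correctness.
  (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

  std::vector<char> buffer(chunk_size);
  std::int64_t total = 0;
  for (;;) {
    const ssize_t n = read(fd, buffer.data(), buffer.size());
    if (n < 0) {  // read error (EINTR is not retried in this sketch)
      close(fd);
      return -1;
    }
    if (n == 0) break;  // end of file
    total += n;
  }
  close(fd);
  return total;
}
```

Here `chunk_size` plays the role of `nread` in the benchmark names. The hint is only advice to the kernel, so whether it changes anything depends on the OS and filesystem.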