Pádraig Brady wrote:
...
> Wow, that's interesting. My results are with 400 MHz DDR2.
> If I do a simpler test excluding the file system and page cache,
> to show just the syscall overhead, I can also see the doubling
> of throughput when going from 4 KiB to 32 KiB buffers:
>
> for i in $(seq 0 10); do
> bs=$((1024*2**$i))
> printf "%7s=" $bs
> dd bs=$bs if=/dev/zero of=/dev/null count=$(((2*1024**3)/$bs)) 2>&1 |
> sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
> done
> 1024=484 MB/s
> 2048=857 MB/s
> 4096=1.6 GB/s
> 8192=2.4 GB/s
> 16384=3.1 GB/s
> 32768=3.6 GB/s
> 65536=3.6 GB/s
> 131072=3.8 GB/s
> 262144=3.9 GB/s
> 524288=3.9 GB/s
> 1048576=3.9 GB/s
>
> Why I see only a small increase between 4 KiB and 32 KiB buffers when
> going through the file system and page cache on my kernel must be due
> to inefficiencies that have subsequently been addressed?
Interesting test.
On the 2-core AMD system (1 MB cache per core):
$ for i in $(seq 0 10); do
bs=$((1024*2**$i))
printf "%7s=" $bs
dd bs=$bs if=/dev/zero of=/dev/null count=$(((2*1024**3)/$bs)) 2>&1 |
sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
done
1024=578 MB/s
2048=1.1 GB/s
4096=1.8 GB/s
8192=2.6 GB/s
16384=3.2 GB/s
32768=4.1 GB/s
65536=4.8 GB/s
131072=5.2 GB/s
262144=5.7 GB/s
524288=5.9 GB/s
1048576=3.4 GB/s
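Note the drop at the 1 MiB buffer: presumably the buffer stops fitting
in the 1 MB per-core cache, so the copy runs at RAM rather than cache
speed. A quick way to check the cache sizes on a given box (Linux/glibc;
a sketch, the sysfs path in the comment may vary by kernel):

for c in LEVEL1_DCACHE_SIZE LEVEL2_CACHE_SIZE LEVEL3_CACHE_SIZE; do
# getconf reports the size in bytes (empty or 0 if the level is absent)
printf '%s=%s\n' $c "$(getconf $c)"
done
# alternatively: grep . /sys/devices/system/cpu/cpu0/cache/index*/size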
On the 4-core Intel with 6 MB cache per core and faster RAM:
1024=1.5 GB/s
2048=2.8 GB/s
4096=5.0 GB/s
8192=7.7 GB/s
16384=10.4 GB/s
32768=9.6 GB/s
65536=9.9 GB/s
131072=10.6 GB/s
262144=10.7 GB/s
524288=10.6 GB/s
1048576=11.2 GB/s
2097152=10.6 GB/s
4194304=9.8 GB/s
8388608=2.6 GB/s
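The same cliff appears here once the buffer exceeds the cache, at 8 MiB.

As for the quoted question about the file-system path: the same sweep
can be pointed at a real file so writes go through the file system and
page cache. A minimal sketch, assuming a scratch file (testfile is a
placeholder) on the file system under test; the total is kept at
256 MiB per run so the data stays in the page cache rather than
measuring the disk:

for i in $(seq 0 10); do
bs=$((1024*2**$i))
printf "%7s=" $bs
# no fsync, so this measures the page-cache path, not the disk
dd bs=$bs if=/dev/zero of=testfile count=$(((256*1024**2)/$bs)) 2>&1 |
sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
done
rm -f testfile

Comparing these figures with the /dev/zero-to-/dev/null run shows how
much of the flattening is file-system overhead rather than syscall
overhead.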