Hi,

On 2025-10-07 16:23:36 +0200, Tomas Vondra wrote:
> On 10/7/25 14:08, Tomas Vondra wrote:
> > ...
> >>>>>> I think doing this kind of measurement via normal SQL query processing 
> >>>>>> is
> >>>>>> almost always going to have too much other influences. I'd measure 
> >>>>>> using fio
> >>>>>> or such instead.  It'd be interesting to see fio numbers for your 
> >>>>>> disks...
> >>>>>>
> >>>>>> fio --directory /srv/fio --size=8GiB --name test --invalidate=0 
> >>>>>> --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=5 
> >>>>>> --ioengine pvsync  --iodepth 1
> >>>>>> vs --rw randread
> >>>>>>
> >>>>>> gives me 51k/11k for sequential/rand on one SSD and 92k/8.7k for 
> >>>>>> another.
> >>>>>>
> >>>>>
> >>>>> I can give it a try. But do we really want to strip "our" overhead with
> >>>>> reading data?
> >
> > I got this on the two RAID devices (NVMe and SATA):
> >
> > NVMe: 83.5k / 15.8k
> > SATA: 28.6k /  8.5k
> >
> > So the same ballpark / ratio as your test. Not surprising, really.
> >
>
> FWIW I do see about this number in iostat. There's a 500M test running
> right now, and iostat reports this:
>
>   Device      r/s     rkB/s  ...  rareq-sz  ...  %util
>   md1    15273.10 143512.80  ...      9.40  ...  93.64
>
> So it's not like we're issuing far fewer I/Os than the SSD can handle.

Not really related to this thread:

IME iostat's utilization is pretty much useless for anything other than "is
something happening at all", and even that is not reliable. I don't know the
full reason for it, but I long ago learned to just discount it.
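
For reference, %util is essentially derived from a single counter: field 13 of
the device's line in /proc/diskstats ("time spent doing I/Os", in ms), with its
delta divided by the sampling interval. A quick sketch, using md1 from the
iostat output above as the device:

awk '$3 == "md1" {print $13}' /proc/diskstats   # ms spent doing I/O since boot
sleep 1
awk '$3 == "md1" {print $13}' /proc/diskstats   # (delta / 1000ms) ~= utilization over that second

Which is also why a single slow outstanding request can make the device look
"100% busy" even when it could service many more requests in parallel.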

I ran
fio --directory /srv/fio --size=8GiB --name test --invalidate=0 \
    --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=100 \
    --ioengine pvsync --iodepth 1 --rate_iops=40000

a few times in a row, while watching iostat. Sometimes utilization is 100%,
sometimes it's 0.2%.  Whereas if I run without rate limiting, utilization
never goes above 71%, despite doing more iops.
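
If anyone wants to watch that live, something like this in another terminal is
enough (md1 being the device from the iostat output above):

iostat -xy 1 md1    # one extended report per second; %util is the last column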


And then it gets completely useless if you use a deeper iodepth, because there's
just no good way to compute something like a utilization number once
you take parallel IO processing into account.

fio --directory /srv/fio --size=8GiB --name test --invalidate=0 \
    --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=100 \
    --ioengine io_uring --iodepth 1 --rw randread
iodepth         util    iops
1               94%     9.3k
2               99.6%   18.4k
4               100%    35.9k
8               100%    68.0k
16              100%    123k
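
(That's just the invocation above run once per iodepth; a rough sketch of the
sweep, if anyone wants to reproduce it, with directory and device name as
placeholders:)

for d in 1 2 4 8 16; do
    iostat -xy 30 1 md1 > iostat.$d.txt &   # sample %util over the first 30s of the run
    fio --directory /srv/fio --size=8GiB --name test --invalidate=0 \
        --bs=$((8*1024)) --buffered 0 --time_based=1 --runtime=100 \
        --ioengine io_uring --rw randread --iodepth $d > fio.$d.txt
    wait
done

The iops column then just comes from fio's normal per-job output.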

Greetings,

Andres Freund

