Hi,

On 2026-02-17 22:36:53 +0100, Tomas Vondra wrote:
> On 2/17/26 21:16, Peter Geoghegan wrote:
> > On Tue, Feb 17, 2026 at 2:27 PM Andres Freund <[email protected]> wrote:
> >> On 2026-02-17 12:16:23 -0500, Peter Geoghegan wrote:
> >>> On Mon, Feb 16, 2026 at 11:48 AM Andres Freund <[email protected]> wrote:
> >>> I agree that the current heuristics (which were invented recently) are
> >>> too conservative. I overfit the heuristics to my current set of
> >>> adversarial queries, as a stopgap measure.
> >>
> >> Are you doing any testing on higher latency storage? I found it to be
> >> quite valuable to use dm_delay to have a disk with reproducible (i.e. not
> >> cloud) higher latency (i.e. not just a local SSD).
> >
> > I sometimes use dm_delay (with the minimum 1ms delay) when testing,
> > but don't do so regularly. Just because it's inconvenient to do so
> > (perhaps not a great reason).
> >
> >> Low latency NVMe can reduce the
> >> penalty of not enough readahead so much that it's hard to spot problems...
> >
> > I'll keep that in mind.
> >
>
> So, what counts as "higher latency" in this context? What delays should
> we consider practical/relevant for testing?

0.5-4ms is the range I've seen in various clouds across their reasonable
storage products (i.e. not spinning disks or other very bulk-oriented things).

Unfortunately dm_delay doesn't support < 1ms delays, but it's still much
better than nothing.

I've been wondering about teaching AIO to delay IOs (by adding a sleep to
workers and linking an IORING_OP_TIMEOUT submission with the actually intended
IO) to allow testing smaller delays.


> > That would make sense. You can already tell when that's happened by
> > comparing the details shown by EXPLAIN ANALYZE against the same query
> > execution on master, but that approach is inconvenient. Automating my
> > microbenchmarks has proven to be important with this project. There's
> > quite a few competing considerations, and it's too easy to improve one
> > query at the cost of regressing another.
> >
>
> What counts as "unconsumed IO"? The IOs the stream already started, but
> then did not consume? That shouldn't be hard, I think.

Yes, the number of IOs that were started but not consumed. Or, even better,
the number of IOs that completed but were not consumed - but that'd be harder
to get right now.

I agree that started-but-not-consumed should be pretty easy.

Greetings,

Andres Freund
