On Sun, Feb 22, 2026 at 11:23 AM Alexandre Felipe
<[email protected]> wrote:
> DISTANCE CONTROL
>
> I tested different strategies to increase the distance: 2*d, 2*d+1, d+2, d+4, and
> so on. Intuitively, d + io_combine_limit is what would make sense, but in the
> end 2*d gives the best results across different patterns, e.g.
> (h{200}m{200}), which seems to be a more reasonable pattern, as previous scans
> would have loaded in blocks. But these are fundamentally the same: as I
> posted earlier with a Markov model, the limit will be something like
> max_distance * sigmoid(h * (p - p0)); what changes is the transient when we
> go in and out of a cached region.
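The doubling strategy with a sigmoid cap described above can be sketched as follows. This is an illustration only, not the patch's actual code: the parameter names h, p0, and max_distance come from the formula above, while the concrete values and the function shape are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def next_distance(d, p, max_distance=64, h=10.0, p0=0.5):
    """Double the prefetch distance, capped by a sigmoid of the observed
    hit probability p. Near p0 the cap changes steeply, so the strategies
    (2*d, d+2, ...) differ mainly in the transient when entering or
    leaving a cached region, not in the steady-state limit."""
    limit = max_distance * sigmoid(h * (p - p0))
    return min(2 * d, max(1, int(limit)))
```

For example, with a high hit probability the cap is loose and the distance simply doubles (`next_distance(4, 0.9)` gives 8), while a low hit probability collapses the distance to the floor (`next_distance(32, 0.1)` gives 1).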

I don't understand. Why, in general, would a Markov model be useful
for determining prefetch distance?

> LIMITING PREFETCH
>
> To avoid prefetch waste with a Limit node, wouldn't it make sense to send, from
> the executor, an estimate of how many rows will be required?

There's a patch that does that. Have you looked at the patch series at all?

> I/O REORDERING
>
> I did an experiment reordering the heap accesses, following a zig-zag pattern
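One plausible reading of "zig-zag" (an assumption here; the experiment's exact pattern is not shown) is sorting each batch of heap block numbers in alternating directions, so successive batches sweep the file forward then backward instead of seeking randomly:

```python
def zigzag_order(blocks, batch_size=4):
    """Reorder heap block numbers batch by batch, alternating sort
    direction per batch. Illustrative sketch only; a real implementation
    would also have to remember the original order to return tuples in
    the order the index scan demands."""
    out = []
    for i, start in enumerate(range(0, len(blocks), batch_size)):
        batch = sorted(blocks[start:start + batch_size],
                       reverse=(i % 2 == 1))
        out.extend(batch)
    return out
```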

There's no question that reordering heap accesses is an interesting
direction to eventually take this infrastructure in. I've experimented
with that myself. But this is the worst possible time to be increasing
the scope of the patch for an uncertain benefit.

We're in crunch mode right now, ahead of feature freeze, which is less
than 6 weeks away. Tomas has been working on this project for about 3
years, and I've been working on it for about 1. Long digressions about
the asymptotic complexity of priority queues add less than zero value.

-- 
Peter Geoghegan

