Hi,

On 2025-10-08 13:23:33 -0400, Robert Haas wrote:
> On Wed, Oct 8, 2025 at 12:24 PM Tomas Vondra <[email protected]> wrote:
> > Isn't this somewhat what effective_cache_size was meant to do? That
> > obviously does not know about what fraction of individual tables is
> > cached, but it does impose size limit.
>
> Not really, because effective_cache_size only models the fact that
> when you iterate the same index scan within the execution of a single
> query, it will probably hit some pages more than once.
That's indeed today's use, but I wonder whether we ought to expand that.

One of the annoying things about *_page_cost effectively needing to be set
"too low" to handle caching effects is that this completely breaks down for
larger relations. Which has unwelcome effects like making a larger-than-memory
sequential scan seem like a reasonable plan.

It's a generally reasonable assumption that a scan processing a smaller amount
of data than effective_cache_size is more likely to be cached than a scan
processing much more data than effective_cache_size. In the latter case,
assuming an accurate effective_cache_size, we *know* that a good portion of
the data cannot be cached.

Which leads me to wonder whether we ought to interpolate between a "cheaper"
access cost for data << effective_cache_size and the "more real" access costs
the closer the amount of data gets to effective_cache_size.

Greetings,

Andres Freund
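PS: To make the interpolation idea a bit more concrete, here is a minimal,
untested sketch of the kind of thing I mean. The function and parameter names
(interpolate_page_cost, cached_page_cost, uncached_page_cost) are made up for
illustration; they don't correspond to any existing planner code or GUCs, and
the linear ramp is just one possible shape:

```c
#include <stdio.h>

/*
 * Sketch: interpolate between a "cheap" page cost for scans touching much
 * less data than effective_cache_size and the "real" page cost as the
 * amount of scanned data approaches (or exceeds) effective_cache_size.
 */
static double
interpolate_page_cost(double scan_pages,			/* pages the scan will touch */
					  double effective_cache_pages,	/* effective_cache_size in pages */
					  double cached_page_cost,		/* cost assumed when cached */
					  double uncached_page_cost)	/* cost assumed when not cached */
{
	double		frac;

	/* without a usable effective_cache_size, be pessimistic */
	if (effective_cache_pages <= 0)
		return uncached_page_cost;

	/* 0 when scan_pages << effective_cache_pages, 1 once it reaches it */
	frac = scan_pages / effective_cache_pages;
	if (frac < 0.0)
		frac = 0.0;
	if (frac > 1.0)
		frac = 1.0;

	/* linear interpolation between the two extremes */
	return cached_page_cost + frac * (uncached_page_cost - cached_page_cost);
}

int
main(void)
{
	double		ecs_pages = 524288;		/* e.g. 4GB of 8kB pages */

	/* small scan: cost stays close to the "cached" cost */
	printf("%.3f\n", interpolate_page_cost(1000, ecs_pages, 0.1, 4.0));
	/* scan comparable to effective_cache_size: close to the "real" cost */
	printf("%.3f\n", interpolate_page_cost(500000, ecs_pages, 0.1, 4.0));
	return 0;
}
```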
