On 2018-06-19 12:08:27 +0300, Konstantin Knizhnik wrote:
> I do not think that prefetching in shared buffers requires much more efforts
> and make patch more invasive...
> It even somehow simplify it, because there is no need to maintain own cache of
> prefetched pages...
> But it will definitely have much more impact on Postgres performance:
> contention for buffer locks, throwing away pages accessed by read-only
> queries,...

These arguments seem bogus to me. Otherwise the startup process is going
to do that work.

> Also there are two points which makes prefetching into shared buffers more
> complex:
> 1. Need to spawn multiple workers to make prefetch in parallel and somehow
> distribute work between them.

I'm not even convinced that's true. It doesn't seem insane to have a
queue of, say, 128 requests that are done with posix_fadvise WILLNEED,
where the oldest request is read into shared buffers by the
prefetcher. And then discarded from the page cache with DONTNEED. I
think we're going to want a queue that's sorted in the prefetch process
anyway, because there's a high likelihood that we'll otherwise issue
prefetch requests for the same pages over and over again.

That gets rid of most of the disadvantages: We have backpressure
(because the read into shared buffers will block if not yet ready),
we'll prevent double buffering, we'll prevent the startup process from
doing the victim buffer search.

> Concerning WAL prefetch I still have a serious doubt if it is needed at all:
> if checkpoint interval is less than size of free memory at the system, then
> redo process should not read much.

I'm confused. Didn't you propose this? FWIW, there's a significant
number of installations where people have observed this problem in
practice.

> And if checkpoint interval is much larger than OS cache (are there cases
> when it is really needed?)

Yes, there are. Percentage of FPWs can cause serious problems, as do
repeated writeouts by the checkpointer.

> then quite small patch (as it seems to me now) forcing full page write
> when distance between page LSN and current WAL insertion point exceeds
> some threshold should eliminate random reads also in this case.

I'm pretty sure that that'll hurt a significant number of installations
that set the timeout high, just so they can avoid FPWs.

Greetings,

Andres Freund
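
For illustration only, the bounded prefetch queue described in the reply (hint with posix_fadvise WILLNEED, read the oldest request into a buffer, then drop it from the page cache with DONTNEED) might be sketched as below. This is a toy model under stated assumptions, not PostgreSQL code: the `Prefetcher` class, its method names, and the dict standing in for shared buffers are all hypothetical, and a set is used for duplicate detection in place of the sorted queue the mail suggests. It assumes a platform where Python's `os.posix_fadvise` is available (for example Linux).

```python
import os
from collections import deque

BLOCK_SIZE = 8192   # PostgreSQL's default page size
QUEUE_DEPTH = 128   # the "say, 128 requests" from the mail


class Prefetcher:
    """Toy model of the queue; a dict stands in for shared buffers."""

    def __init__(self, fd, depth=QUEUE_DEPTH):
        self.fd = fd
        self.depth = depth
        self.queue = deque()   # outstanding block numbers, oldest first
        self.pending = set()   # same blocks, for duplicate detection
        self.buffers = {}      # hypothetical stand-in for shared buffers

    def request(self, blkno):
        """Queue a block for prefetch; return False for a duplicate."""
        if blkno in self.pending or blkno in self.buffers:
            return False
        # Backpressure: a full queue forces the oldest read to finish first.
        if len(self.queue) >= self.depth:
            self.complete_oldest()
        # Hint the kernel to start reading the block into the page cache.
        os.posix_fadvise(self.fd, blkno * BLOCK_SIZE, BLOCK_SIZE,
                         os.POSIX_FADV_WILLNEED)
        self.queue.append(blkno)
        self.pending.add(blkno)
        return True

    def complete_oldest(self):
        """Read the oldest queued block, then drop it from the page cache."""
        blkno = self.queue.popleft()
        self.pending.discard(blkno)
        offset = blkno * BLOCK_SIZE
        # This read blocks until the data is available.
        self.buffers[blkno] = os.pread(self.fd, BLOCK_SIZE, offset)
        # Avoid double buffering: the OS copy is no longer needed.
        os.posix_fadvise(self.fd, offset, BLOCK_SIZE,
                         os.POSIX_FADV_DONTNEED)
        return blkno

    def drain(self):
        """Complete every outstanding request."""
        while self.queue:
            self.complete_oldest()
```

In this sketch the prefetcher, not the consumer of the pages, performs the blocking read, which mirrors the point that the startup process would be spared that work; how victim buffers are chosen is deliberately left out.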