On Thu, Jul 5, 2018 at 4:39 PM, Andres Freund <and...@anarazel.de> wrote:
> This is formulated *WAY* too positive. It'll have dramatic *NEGATIVE*
> performance impact on non-COW filesystems, and very likely even negative
> impacts in a number of COWed scenarios (when there's enough memory to
> keep all WAL files in memory).
>
> I still think that fixing this another way would be preferable. This'll
> be too much of a magic knob that depends on the fs, hardware and
> workload.

I tend to agree with you, but unless we have a pretty good idea what that other way would be, I think we should probably accept the patch.

Could we somehow make this self-tuning? On any given filesystem/hardware/workload, either creating a new 16MB file is faster, or recycling an old file is faster. If the old file is still cached, recycling it figures to win on almost any hardware. If not, it seems like something of a toss-up. I suppose we could try to keep a running average of how long it is taking us to recycle WAL files and how long it is taking us to create new ones; if we do each one of those things at least sometimes, then we'll eventually get an idea of which one is quicker. But it's not clear to me that such data would be very reliable unless we made sure to try both things fairly regularly under circumstances where we could have chosen to do the other one.

I think part of the problem here is that whether a WAL segment is likely to be cached depends on a host of factors which we don't track very carefully, like whether it's been streamed or decoded recently. If we knew that a particular WAL segment hadn't been accessed for any purpose in 10+ minutes, it would probably be fairly safe to guess that it's no longer in cache; if we knew that it had been accessed <15 seconds ago, it is probably still in cache. But we have no idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
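
To make the running-average idea above a bit more concrete, here is a rough sketch in Python rather than anything resembling actual PostgreSQL code. Everything in it is an assumption made for illustration: the class name `SegmentStrategyTuner`, the use of an exponential moving average, and the `alpha` and `explore_rate` parameters. The occasional forced exploration is one possible way to "do each one of those things at least sometimes"; it does nothing about the harder problem that the timings depend on whether the old segment happens to be cached.

```python
import random


class SegmentStrategyTuner:
    """Hypothetical helper: keeps a running average of how long it takes to
    recycle an old WAL segment versus create a new one, and prefers whichever
    has been cheaper recently."""

    def __init__(self, alpha=0.2, explore_rate=0.1, rng=None):
        self.alpha = alpha                # weight given to the newest sample
        self.explore_rate = explore_rate  # fraction of picks made at random
        self.rng = rng or random.Random()
        self.avg = {"recycle": None, "create": None}

    def record(self, strategy, seconds):
        """Fold one observed duration into that strategy's moving average."""
        prev = self.avg[strategy]
        if prev is None:
            self.avg[strategy] = seconds
        else:
            self.avg[strategy] = self.alpha * seconds + (1 - self.alpha) * prev

    def choose(self):
        """Return "recycle" or "create" for the next segment."""
        # Anything never measured gets tried first.
        for strategy, value in self.avg.items():
            if value is None:
                return strategy
        # Occasionally try either option so both averages stay current.
        if self.rng.random() < self.explore_rate:
            return self.rng.choice(["recycle", "create"])
        # Otherwise pick the strategy with the lower average duration.
        return min(self.avg, key=self.avg.get)
```

A caller would time each recycle or create operation, pass the result to `record()`, and ask `choose()` before handling the next segment. Whether the samples gathered this way would be reliable enough is exactly the open question raised above.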