On Mon, Aug 28, 2023 at 4:30 PM Melanie Plageman <melanieplage...@gmail.com> wrote:
> By low-velocity, do you mean lower overall TPS? In that case, wouldn't you be
> less likely to run into xid wraparound and thus need less aggressive
> opportunistic freezing?
Yes. But it also means that we've got slack capacity to do extra work now without really hurting anything. If we can leverage that capacity to reduce the pain of future bulk operations, that seems good. When resources are tight, doing speculative work immediately becomes less appealing. It's pretty hard to take such things into account, though. I was just mentioning it.

> So, this is where the caveat about absolute number of page freezes
> matters. In algorithm A, master only did 57 page freezes (spread across
> the various pgbench tables). At the end of the run, 2 pages were still
> frozen.

I'm increasingly feeling that it's hard to make sense of the ratio. Maybe show the number of pages frozen, the number that are frozen at the end, and the number of pages in the database at the end of the run as three separate values.

> (1) seems bad to me because it doesn't consider whether or not freezing
> will be useful -- only if it will be cheap. It froze very little of the
> cold data in a workload where a small percentage of it was being
> modified (especially workloads A, C, H). And it froze a lot of data in
> workloads where it was being uniformly modified (workload B).

Sure, but if the cost of being wrong is low, you can be wrong a lot and still be pretty happy. It's unclear to me to what extent we should gamble on making only inexpensive mistakes and to what extent we should gamble on making only infrequent mistakes, but they're both valid strategies.

> I suggested (4) and (5) because I think the "older than 33%" threshold
> is better than the "older than 10%" threshold. I chose both because I am
> still unclear on our values. Are we willing to freeze more aggressively
> at the expense of emitting more FPIs? As long as it doesn't affect
> throughput? For pretty much all of these workloads, the algorithms which
> froze based on page modification recency OR FPI required emitted many
> more FPIs than those which froze based only on page modification
> recency.
Let's assume for a moment that the rate at which the insert LSN is advancing is roughly constant over time, so that it serves as a good proxy for wall-clock time. Consider four tables A, B, C, and D that are, respectively, vacuumed once per minute, once per hour, once per day, and once per week. With a 33% threshold, pages in table A will be frozen if they haven't been modified in 20 seconds, pages in table B will be frozen if they haven't been modified in 20 minutes, pages in table C will be frozen if they haven't been modified in 8 hours, and pages in table D will be frozen if they haven't been modified in 2 days, 8 hours. My intuition is that this feels awfully aggressive for A and awfully passive for D.

To expand on that: apparently D doesn't actually get much write activity, else there would be more vacuuming happening. So it's very likely that pages in table D are going to get checkpointed and evicted before they get modified again. Freezing them therefore seems like a good bet: it's a lot cheaper to freeze those pages when they're already in shared_buffers and dirty than it is if we have to read and write them specifically for freezing.

It's less obvious that what we're doing in table A is wrong, and it could be exactly right, but workloads where a row is modified, a human thinks about something (e.g. whether to complete the proposed purchase), and then the same row is modified again are not uncommon, and human thinking times can easily exceed 20 seconds. On the other hand, workloads where a row is modified, a computer thinks about something, and then the row is modified again are also quite common, and computer thinking times can easily be less than 20 seconds. It feels like a toss-up whether we get it right. For this kind of table, I suspect we'd be happier freezing pages that are about to be evicted or about to be written as part of a checkpoint rather than freezing pages opportunistically in vacuum. Maybe that's something we need to think harder about.
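To make the arithmetic above concrete, here is a small illustrative sketch (plain Python, not PostgreSQL code; the function name, the table labels, and the constant-insert-LSN-rate assumption are mine) of how the "older than 33%" rule maps vacuum frequency to a freeze cutoff:

```python
# Illustrative only: under the assumption that the insert LSN advances
# at a constant rate, LSN distance is a proxy for wall-clock time, so
# the "older than 33%" rule freezes a page whose last modification is
# more than a third of a vacuum interval in the past.

def freeze_cutoff_seconds(vacuum_interval_seconds, threshold=1 / 3):
    """Age beyond which a page qualifies for opportunistic freezing."""
    return vacuum_interval_seconds * threshold

# The four hypothetical tables from the discussion above.
intervals = {
    "A (vacuumed per minute)": 60,
    "B (vacuumed per hour)": 3_600,
    "C (vacuumed per day)": 86_400,
    "D (vacuumed per week)": 604_800,
}

for table, secs in intervals.items():
    print(f"table {table}: freeze pages idle > {freeze_cutoff_seconds(secs):.0f} s")
# A: 20 s, B: 1200 s (20 min), C: 28800 s (8 h), D: 201600 s (2 d 8 h)
```

The spread between A's 20-second cutoff and D's 2-day cutoff is what makes a single percentage threshold feel aggressive at one end and passive at the other.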
If we froze dirty pages that wouldn't need a new FPI just before evicting them, and just before they would be written out for a checkpoint, under what circumstances would we still want vacuum to opportunistically freeze? I think the answer might be "only when it's very cheap." If it's very cheap to freeze now, it's appealing to gamble on doing it before a checkpoint comes along and makes the same operation require an extra FPI. But if it isn't too cheap, then why not just wait and see what happens? If the buffer gets modified again before it gets written out, then freezing immediately is a waste and freeze-on-evict is better. If it doesn't, freezing immediately and freeze-on-evict are the same price.

I'm hand-waving a bit here because maybe freeze-on-evict is subject to some conditions and freezing immediately is more unconditional, or something like that. But I think the basic point is valid: sometimes it might be better to defer the decision to the last point in time at which we can reasonably make it. By focusing on what vacuum (or even HOT pruning) does, we're pulling that decision forward in time, which is good if it makes freezing cheaper by piggybacking on a previous FPI, but not good if it means that we're guessing whether the page will be accessed again soon when we could just as well wait and see whether that happens or not.

-- 
Robert Haas
EDB: http://www.enterprisedb.com