> I have a very serious concern about the current patch set, as someone who has
> faced transaction id wraparound in the past.
>
> I can start by saying I think it would be helpful (if the other issues are
> approached reasonably) to have 64-bit xids, but there is an important piece
> of context in preventing xid wraparounds that seems missing from this patch
> unless I missed something.
>
> XID wraparound is a symptom, not an underlying problem. It usually occurs
> when autovacuum or other vacuum strategies have unexpected stalls and
> therefore fail to work as expected. Shifting to 64-bit XIDs dramatically
> changes the sorts of problems that these stalls are likely to pose to
> operational teams: you can find you are running out of storage rather
> than facing an imminent database shutdown. Worse, this patch delays the
> problem until some (possibly far later!) time, when vacuum will take far
> longer to finish, and options for resolving the problem are diminished. As a
> result I am concerned that merely changing xids from 32-bit to 64-bit will
> lead to a smaller number of far more serious outages.
>
> What would make a big difference from my perspective would be to combine this
> with an inverse system for warning that there is a problem, allowing the
> administrator to throw warnings about xids since last vacuum, with a
> configurable threshold. We could have this at two billion by default as that
> would pose operational warnings not much later than we have now.
>
> Otherwise I can imagine cases where instead of 30 hours to vacuum a table, it
> takes 300 hours on a database that is short on space. And I would not want
> to be facing such a situation.
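To make the quoted proposal concrete, here is a minimal sketch of a threshold check on "xids since last vacuum". It is not part of the patch: the function name, the input dictionary, and the sample figures are hypothetical, and only the two-billion default comes from the message above. In current PostgreSQL a per-table age can be read with `age(relfrozenxid)` from `pg_class`; how that would be exposed with 64-bit XIDs is left open here.

```python
from typing import Dict, List

# Default taken from the quoted suggestion ("two billion by default").
DEFAULT_XID_AGE_WARNING = 2_000_000_000


def tables_over_threshold(xid_ages: Dict[str, int],
                          threshold: int = DEFAULT_XID_AGE_WARNING) -> List[str]:
    """Return the tables whose XID age meets or exceeds the threshold, oldest first.

    xid_ages maps a table name to the number of XIDs consumed since that
    table was last vacuumed (hypothetical input, gathered elsewhere).
    """
    over = [(age, name) for name, age in xid_ages.items() if age >= threshold]
    over.sort(reverse=True)  # largest age first
    return [name for _, name in over]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {
        "orders": 2_500_000_000,
        "users": 150_000_000,
        "events": 3_100_000_000,
    }
    for table in tables_over_threshold(sample):
        print(f"WARNING: {table} has {sample[table]} xids since last vacuum")
```

A check of this kind only reports the stall; it does not change how vacuum behaves, which matches the "warning, with a configurable threshold" wording of the proposal.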
Hi, Chris!

I had a similar stance when I started working on this patch. Of course, it seemed horrible just to postpone the consequences of inadequate monitoring, overly long-running transactions that prevent aggressive autovacuum, etc. So I can understand your point.

With time I've come to a somewhat different view of this feature, i.e.:

1. It's important to set up monitoring, the cut-off of long transactions, etc. correctly anyway. It's not the responsibility of the vacuum that runs before wraparound to report inadequate monitoring etc. Furthermore, in real life this comes too late: if the protection against 32-bit wraparound kicks in, it causes a lot of downtime at an unexpected moment. (A rough analogy is a machine running at 120 mph that turns every control off and applies full brakes just because the cooling liquid is low. Of course there might have been a warning previously, but anyway.)

2. The checks and handlers for an event that is never expected in the cluster's lifetime (~200 years at a constant rate of 1e6 TPS) can just be dropped. Of course we still need to do automatic routine maintenance like cutting SLRU buffers (but at a much bigger interval if, e.g., we have plenty of disk space). But I concluded that we need not care what will happen to the cluster after > 200 years (it will be migrated many times before that, for many reasons not related to Postgres, even for the most conservative owners). So the radical proposal is to drop 64-bit wraparound handling altogether. The more moderate one is simply not to worry much that after 200 years we would have more hassle than we would next month if we haven't set everything up correctly. Next month's pain will be more significant, even if it teaches the DBA something.

Big thanks for your view on the general implementation of this feature, anyway.

Kind regards,
Pavel Borisov
Supabase