On Wed, May 13, 2026 at 10:28 PM Alvaro Herrera <[email protected]> wrote:
>
> Hello Amit,
>
> On 2026-May-13, Amit Kapila wrote:
>
> > So now the question is where do we go from here. I am not confident
> > that the current code to achieve db-specific snapshots in logical
> > decoding is the best possible solution both because of the drawbacks
> > (like we won't be able to enable this on standby) and inefficiencies
> > pointed out by me in this and previous emails in this work.
>
> This is a fair question.  I don't think we have time to go much further
> on this aspect before beta 1, so we either accept this patch, fix the
> inefficiencies you pointed out and keep db-specific snapshots,
>
I don't think it would be easy to address these inefficiencies before
beta 1. The root of those inefficiencies is that the patch reuses the
cluster-wide running_xacts WAL infrastructure to log db-specific
running transactions, and then tries to feed that into the existing
snapbuild machinery to reach a consistent state.

As another example of this mismatch that occurred to me today: in
SnapBuildCommitTxn, we track the committed_xip array for all
cluster-wide XIDs, even when using a db-specific snapshot. A
db-specific snapshot shouldn't need to care about XIDs from other
databases. We only account for this in one part of the system, where
we process the running_xacts record. I admit that I don't know at this
stage exactly what we should do about it, but all such things deserve
discussion and careful thought.

The broader issue is that the entire logical decoding mechanism is
designed to process cluster-wide transactions. This patch tries to
bypass that foundational assumption, but only during the initial
snapshot construction, while processing the running_xacts record.

To be clear, I am not against the idea of db-specific snapshots to
enable concurrent repacks. My concern is simply the time required to
get the architecture right. In its current state, we need more time to
carefully consider how this db-specific concept interacts with the
rest of the logical decoding machinery, which is built for
cluster-wide records.

> or we
> revert db-specific snapshots and go back to the standard snapshot-taking
> technique for REPACK in 19 and see what we can improve for 20.
>
> Now, the worst consequence of reverting db-specific snapshots is that
> you will only be able to run REPACK in a single database at a time
> (because any subsequent REPACK will have to wait until the first one
> finishes before being able to get its snapshot).  In most normal cases
> this is probably not a big deal.
> But if you have a multitenant system,
> and you want your users to be able to run REPACK on their tables, you
> may be a bit screwed.  So I hesitate to just go and revert it without
> offering those people any alternative.
>

I understand your point, but I feel we can extend the current feature
in future versions to address such cases (allow REPACK CONCURRENTLY on
tables in multiple databases simultaneously). For now, those users may
need to rely on REPACK without the CONCURRENTLY option if they want to
run it in multiple databases at the same time.

> (It's also possible that being unable to run more than one REPACK at a
> time is not so big a deal.  After all, it's supposed to be an infrequent
> operation.  And users probably don't or shouldn't have multi-terabyte
> tables in multitenant databases anyway.)
>
> I'm not sure I understand the point of the standby.  I mean, you can't
> run REPACK on the standby anyway, so I don't see this as a very
> problematic restriction.  Do you have other reasons for wanting a
> db-specific snapshot in a standby?
>

We are exposing need_shared_catalogs as a generic plugin option,
defined as: 'it can be set to false if one is certain the plugin
functions do not access shared system catalogs.' This implies it can
be used for purposes other than REPACK. For example, one can imagine a
single-database audit plugin that only cares about data modifications
within a specific database. By setting need_shared_catalogs = false on
a standby, it could reach a CONSISTENT state much faster, perfectly
serving its needs.

While such a plugin might not exist right now, my broader point is
this: when we expose a generic facility, it can and will be used in
ways beyond our initial core use cases. We should try to ensure the
design doesn't permanently preclude such extensions. With the current
design choice, we are painting ourselves into a corner where this
feature cannot easily be extended to standbys, even in the future.

--
With Regards,
Amit Kapila.
