On Thu, Jan 23, 2025 at 3:24 AM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote: > > On Thu, Jan 23, 2025 at 6:16 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Fri, Jan 10, 2025 at 12:33 AM Masahiko Sawada <sawada.m...@gmail.com> > > wrote: > > > > > > On Thu, Jan 9, 2025 at 3:29 AM Ashutosh Bapat > > > <ashutosh.bapat....@gmail.com> wrote: > > > > > > > > On Tue, Dec 31, 2024 at 10:15 AM Masahiko Sawada > > > > <sawada.m...@gmail.com> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > Logical decoding (and logical replication) are available only when > > > > > wal_level = logical. As the documentation says[1], Using the 'logical' > > > > > level increases the WAL volume which could negatively affect the > > > > > performance. For that reason, users might want to start with using > > > > > 'replica', but when they want to use logical decoding they need a > > > > > server restart to increase wal_level to 'logical'. My goal is to allow > > > > > users who are using 'replica' level to use logical decoding without a > > > > > server restart. There are other GUC parameters related to logical > > > > > decoding and logical replication such as max_wal_senders, > > > > > max_logical_replication_workers, and max_replication_slots, but even > > > > > if users set these parameters >0, there would not be a noticeable > > > > > performance impact. And their default values are already >0. So I'd > > > > > like to focus on making only the wal_level dynamic GUC parameter. > > > > > There are several earlier discussions[2][3] but no one has submitted > > > > > patches unless I'm missing something. > > > > > > > > > > The first idea I came up with is to make the wal_level a PGC_SIGHUP > > > > > parameter. However, it affects not only setting 'replica' to 'logical' > > > > > but also setting 'minimal' to 'replica' or higher. I'm not sure the > > > > > latter case is common and it might require a checkpoint. I don't want > > > > > to make the patch complex for uncommon cases. > > > > > > > > > > The second idea is to somehow allow both WAL-logging logical info and > > > > > logical decoding even when wal_level is 'replica'. I've attached a PoC > > > > > patch for that. The patch introduces new SQL functions such as > > > > > pg_activate_logical_decoding() and pg_deactivate_logical_decoding(). > > > > > These functions are available only when wal_level is 'repilca'(or > > > > > higher). In pg_activate_logical_decoding(), we set the status of > > > > > logical decoding stored on the shared memory from 'disabled' to > > > > > 'xlog-logical-info', allowing all processes to write logical > > > > > information to WAL records for logical decoding. But the logical > > > > > decoding is still not allowed. Once we confirm all in-progress > > > > > transactions completed, we switch the status to > > > > > 'logical-decoding-ready', meaning that users can create logical > > > > > replication slots and use logical decoding. > > > > > > > > > > Overall, with the patch, there are two ways to enable logical > > > > > decoding: setting wal_level to 'logical' and calling > > > > > pg_activate_logical_decoding() when wal_level is 'replica'. I left the > > > > > 'logical' level for backward compatibility and for users who want to > > > > > enable the logical decoding without calling that SQL function. If we > > > > > can automatically enable the logical decoding when creating the first > > > > > logical replication slot, probably we no longer need the 'logical' > > > > > level. There is room to discuss the user interface. Feedback is very > > > > > welcome. > > > > > > > > > > > > > If a server is running at minimal wal_level and they want to enable > > > > logical replication, they would still need a server restart. That > > > > would be rare but not completely absent. > > > > > > Currently we don't allow the server to start with the 'minimal' level > > > and max_wal_senders > 0. Even if we support changing 'minimal' to > > > 'logical' without a server restart, we still need a server restart to > > > increase max_wal_senders for users who want to use logical > > > replication. Or we need to eliminate this restriction too. I guess it > > > would be too complex for such uncommon use cases. > > > > > > > > > > > Our documentation says "wal_level determines how much information is > > > > written to the WAL.". Users would may not expect that the WAL amount > > > > changes while wal_level = replica depending upon whether logical > > > > decoding is possible. It may be possible to set the expectations right > > > > by changing the documentation. It's not in the patch, so I am not sure > > > > whether this is considered. > > > > > > We should mention that in the doc. The WAL amount changes depending on > > > not only wal_level but also other parameters such as wal_log_hints and > > > full_page_writes. > > > > > > > Cloud providers do not like multiple ways of changing configuration > > > > esp. when they can not control it. See [1]. Changing wal_level through > > > > a SQL function may fit the same category. > > > > > > Thank you for pointing it out. This would support the idea of > > > automatically enabling logical decoding. > > > > > > > I agree that it would be a lot of work to make all combinations of > > > > wal_level changes work, but changing wal_level through SIGHUP looks > > > > like a cleaner solution. Is there way that we make the GUC SIGHUP but > > > > disallow certain combinations of old and new values? > > > > > > While I agree that it's cleaner I think there is no way today. I think > > > we need to invent something for that. > > > > I would like to summarize the proposed approaches thus far: > > > > Regarding the user interface, there are three approaches: > > > > 1. Implementing SQL function controls (e.g., > > pg_activate_logical_decoding() and pg_deactivate_logical_decoding()). > > This would enable users to activate logical decoding even with > > wal_level=replica by calling the SQL function. While cloud providers > > seem not like having multiple configuration methods, this could > > potentially be managed through appropriate EXECUTE privileges. Another > > drawback is the user confusion when 'SHOW wal_level' displays > > 'replica' despite processes writing WAL records with logical > > information. This might be dealt with by implementing a show_hook > > function for wal_level. > > > > 2. Implementing automatic logical decoding activation. This would > > trigger upon creation of the first logical slot and deactivate upon > > removal of the final slot. This approach shares the user confusion > > concern of the first proposal. Moreover, it presents a significant > > limitation: users would be unable to utilize logical decoding on > > standby servers without maintaining at least one logical slot on the > > primary -- a substantial disadvantage. > > > > 3. Converting wal_level to a SIGHUP parameter, thereby supporting all > > possible wal_level transition combinations. While this represents the > > most elegant solution among the proposals, it necessitates additional > > development effort for less common scenarios, such as transitioning > > between 'minimal' and 'replica' levels. Such transitions require > > specific handling -- for instance, changing between 'minimal' and > > 'replica' requires a checkpoint, while decreasing from 'replica' to > > 'minimal' necessitates terminating certain processes like WAL senders > > and archiver. > > > > We also had discussion (and I did some research) on the implementation > > of increasing/decreasing wal_level online. The basic idea is that we > > first enable logical information WAL-logging to all processes while > > maintaining the logical decoding in an inactive state. Once we can > > guarantee that all processes are writing WAL records with logical > > information, we enable the logical decoding. This guarantee can be > > achieved by waiting for all concurrent transactions to finish, which > > could make us wait for a long time if a transaction is long-running. > > Another way is to send a global barrier signal and wait for all > > processes to start writing WAL records with logical information. We > > have a good facility for that: EmitProcSignalBarrier() and > > WaitForProcSignalBarrier(). That way, we don't need to wait for > > transaction finishes. > > > > Based on the discussion so far, the idea 3 appears most promising. I > > welcome any additional suggestions or preferences. > > I think this is the cleanest solution but harder to implement. > Performing any heavy lifting like waiting for other transactions to > finish or a barrier inside pg_reload_conf() increases the chances of > delaying conf reload. Further, if there are errors, it might cause > some configurations to be not loaded. So the actual processing needs > to happen after the configurations have been loaded.
Here is an idea: we can use a background worker who does everything for changing wal_level online. I thought we could delegate the work for changing wal_level to the checkpointer process, but given that the work involves creating a checkpoint, invalidating logical slots, and terminating (physical) walsenders etc I think it would be better to have a dedicated worker for that. I've attached a PoC patch for discussion. It's still dirty, uncommented, and untested enough, but I hope it will help move this project forward. In the patch, when the checkpointer realizes that wal_level has been changed, it launches a background worker, wal-level control worker. The worker changes wal_level online using global barriers etc. and terminates functionality such as WAL archiving, replication and logical decoding if necessary. While it might be too much to use a background worker just for changing wal_level online, I think it's technically okay. Users would need to have one open background worker slot, or we can implement it using an auxiliary process. Such a worker process might be able to be used for other parameters too in the future. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
v1-0001-PoC-Convert-wal_level-a-PGC_SIGHUP-parameter.patch
Description: Binary data