On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand
> <bertranddrouvot...@gmail.com> wrote:
> >
> 3. As mentioned in the initial email, I think it would be better to
> replace LIST_SLOTS command with a SELECT query.
>

I had a look at this thread. I am interested in working on this and can spend some time addressing the comments given here. I tried replacing the LIST_SLOTS command with a SELECT query. Attached are the rebased patch and a PoC patch for the LIST_SLOTS removal.

For the LIST_SLOTS command removal, below are the points where more analysis is needed.

1) I could not use libpqwalreceiver's exposed functions walrcv_exec/libpqrcv_exec in LogicalRepLauncher to run the SELECT query instead of the LIST_SLOTS command. This is because libpqrcv_exec() needs a database connection, and since LogicalRepLauncher does not have one (MyDatabaseId is not set), the API gives an error. To make it work for the time being, I used 'libpqrcv_PQexec', which does not depend on a database connection. But since it is not exposed "yet" to other layers, I temporarily added the new code to libpqwalreceiver.c itself. In fact, I reused the existing function wrapper libpqrcv_list_slots and changed it to get the info using a SELECT query rather than LIST_SLOTS.

2) When using the connect API walrcv_connect/libpqrcv_connect(), we need to tell it whether the connection is for logical or physical replication. In the existing patch, where we were using the LIST_SLOTS command, this connection is made with logical=false. But now that we need to run a SELECT query to get the same info, a connection with logical=false gives an error on the primary while executing the SELECT query: "ERROR: cannot execute SQL commands in WAL sender for physical replication". So in ApplyLauncherStartSlotSync(), I have changed the connect API to use logical=true for the time being. I noticed that the existing patch used logical=false in ApplyLauncherStartSlotSync() but logical=true in synchronize_slots(), possibly for the same reason: a logical=false connection would not allow synchronize_slots() to run a SELECT query on the primary, while it worked for ApplyLauncherStartSlotSync() because that was running the LIST_SLOTS command instead of a SELECT query.

I am exploring these points further to figure out the correct way to deal with them. Meanwhile, I am posting this WIP patch for early feedback. I will try to address the other comments as well in the next versions.

thanks
Shveta
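For readers following point 2, here is a minimal sketch of the logical vs. physical distinction. It assumes that the logical flag ends up as the libpq "replication" connection keyword, with "database" for a logical (database-connected) walsender and "true" for a physical one. The helper name build_walrcv_conninfo and the column list in SLOT_QUERY are hypothetical and chosen for illustration only; they are not taken from the attached patches.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative stand-in for LIST_SLOTS. The column list is an assumption;
 * the actual PoC patch may select different columns.
 */
static const char *const SLOT_QUERY =
    "SELECT slot_name, plugin, database, confirmed_flush_lsn"
    " FROM pg_catalog.pg_replication_slots"
    " WHERE slot_type = 'logical'";

/*
 * Hypothetical helper: append the replication keyword to a base conninfo.
 * logical=true  -> "replication=database" (walsender can run SQL)
 * logical=false -> "replication=true"     (physical; SQL is rejected)
 * Returns the string length written, or -1 if buf is too small.
 */
static int
build_walrcv_conninfo(char *buf, size_t buflen, const char *base, bool logical)
{
    int n = snprintf(buf, buflen, "%s replication=%s", base,
                     logical ? "database" : "true");

    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```

With logical=true the resulting connection string can be used to send SLOT_QUERY as an ordinary query; with logical=false the primary would be expected to reject it with the "cannot execute SQL commands in WAL sender for physical replication" error quoted above.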
v1-0001-Remove-list_slots-command.patch
Description: Binary data
v7-0001-Synchronize-logical-replication-slots-from-primar.patch
Description: Binary data