On Thu, Jul 28, 2022 at 9:32 PM Melih Mutlu <m.melihmu...@gmail.com> wrote:
>>
>> Why after step 4, do you need to drop the replication slot? Won't just
>> clearing the required info from the catalog be sufficient?
>
>
> The replication slots that we read from the catalog will not be used for
> anything else after we're done with syncing the table which the rep slot
> belongs to.
> It's removed from the catalog when the sync is completed and it basically
> becomes a slot that is not linked to any table or worker. That's why I think
> it should be dropped rather than left behind.
>
> Note that if a worker dies and its replication slot continues to exist, that
> slot will only be used to complete the sync process of the one table that the
> dead worker was syncing but couldn't finish.
> When that particular table is synced and becomes ready, the replication slot
> has no use anymore.
>

Why can't it be used to sync the other tables if any?

>>
>> Hmm, I think even if there is an iota of a chance which I think is
>> there, we can't use worker_pid. Assume, that if the same worker_pid is
>> assigned to another worker once the worker using it got an error out,
>> the new worker will fail as soon as it will try to create a
>> replication slot.
>
>
> Right. If something like that happens, worker will fail without doing
> anything. Then a new one will be launched and that one will continue to do
> the work.
> The worst case might be having conflicting pid over and over again while also
> having replication slots whose name includes one of those pids still exist.
> It seems unlikely but possible, yes.
>
>>
>> I feel it would be better or maybe we need to think of some other
>> identifier but one thing we need to think about before using a 64-bit
>> unique identifier here is how will we retrieve its last used value
>> after restart of server. We may need to store it in a persistent way
>> somewhere.
>
>
> We might consider storing this info in a catalog again. Since this last used
> value will be different for each subscription, pg_subscription can be a good
> place to keep that.
>

This sounds reasonable. Let's do this unless we get some better idea.

>>
>> The problems will be similar to the slot name. The origin is used to
>> track the progress of replication, so, if we use the wrong origin name
>> after the restart, it can send the wrong start_streaming position to
>> the publisher.
>
>
> I understand. But origin naming logic is still the same. Its format is like
> pg_<subid>_<relid> .
> I did not need to change this since it seems to me origins should belong to
> only one table. The patch does not reuse origins.
> So I don't think this change introduces an issue with origin. What do you
> think?
>

There is no such restriction that origins should belong to only one table.
What makes you think like that?

-- 
With Regards,
Amit Kapila.
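
To illustrate the identifier idea discussed above, here is a rough Python sketch (not patch code) of generating slot names from a per-subscription counter that is stored persistently, rather than from worker_pid. Everything named here is an assumption made for illustration: the SubscriptionRow class stands in for a pg_subscription row, the last_slot_id field is a hypothetical column, and the pg_<subid>_sync_<id> name format is invented for the example.

```python
from dataclasses import dataclass

# PostgreSQL's default NAMEDATALEN; names can use at most NAMEDATALEN - 1 bytes.
NAMEDATALEN = 64
MAX_UINT64 = 2**64 - 1


@dataclass
class SubscriptionRow:
    """Hypothetical stand-in for one pg_subscription row."""
    subid: int
    last_slot_id: int = 0  # hypothetical column: last identifier handed out


def next_sync_slot_name(sub: SubscriptionRow) -> str:
    """Advance the stored counter and build a slot name from the new value.

    Because the counter lives in the (simulated) catalog row, the value
    survives a restart, so a name is never handed out twice for the same
    subscription. A pid, by contrast, can be reassigned by the OS.
    """
    if sub.last_slot_id >= MAX_UINT64:
        raise OverflowError("64-bit slot identifier exhausted")
    sub.last_slot_id += 1
    # Assumed name format, chosen only for this example.
    name = f"pg_{sub.subid}_sync_{sub.last_slot_id}"
    if len(name) >= NAMEDATALEN:
        raise ValueError("slot name too long")
    return name


if __name__ == "__main__":
    sub = SubscriptionRow(subid=16394)
    first = next_sync_slot_name(sub)
    second = next_sync_slot_name(sub)

    # Simulate a server restart: only the catalog contents survive.
    reloaded = SubscriptionRow(subid=sub.subid, last_slot_id=sub.last_slot_id)
    third = next_sync_slot_name(reloaded)

    print(first, second, third)
```

In a real implementation the counter update would have to be written to the catalog in the same transaction that hands out the name; the sketch leaves that out.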