Dear Tomas, Euler,

Thanks for starting to read the thread! Since I'm not the original author, I will reply only partially.

> I decided to take a quick look on this patch today, to see how it works
> and do some simple tests. I've only started to get familiar with it, so
> I have only some comments / questions regarding usage, not on the code.
> It's quite possible I didn't understand some finer points, or maybe it
> was already discussed earlier in this very long thread, so please feel
> free to push back or point me to the past discussion.
>
> Also, some of this is rather opinionated, but considering I didn't see
> this patch before, my opinions may easily be wrong ...

I found your comments quite valuable.

> 1) SGML docs
>
> It seems the SGML docs are more about explaining how this works on the
> inside, rather than how to use the tool. Maybe that's intentional, but
> as someone who didn't work with pg_createsubscriber before I found it
> confusing and not very helpful.
>
> For example, the first half of the page is prerequisities+warning, and
> sure those are useful details, but prerequisities are checked by the
> tool (so I can't really miss this) and warnings go into a lot of details
> about different places where things may go wrong. Sure, worth knowing
> and including in the docs, but maybe not right at the beginning, before
> I learn how to even run the tool?

Hmm, right. I considered the improvements below. Tomas and Euler, what do you think?

* Add more description to the "Description" section.
* Move the prerequisites and warnings to the "Notes" section.
* Add a "Usage" section which describes usage starting from a single node.

> I'm not sure FOR ALL TABLES is a good idea. Or said differently, I'm
> sure it won't work for a number of use cases. I know large databases
> it's common to create "work tables" (not necessarily temporary) as part
> of a batch job, but there's no need to replicate those tables.

Indeed, the documentation does not mention that all tables in the database are included in the publication.

> I do understand that FOR ALL TABLES is the simplest approach, and for v1
> it may be an acceptable limitation, but maybe it'd be good to also
> support restricting which tables should be replicated (e.g. blacklist or
> whitelist based on table/schema name?).

This may not be directly related, but we considered accepting such options to be a next step [1].

> Note: I now realize this might fall under the warning about DDL, which
> says this:
>
> Executing DDL commands on the source server while running
> pg_createsubscriber is not recommended. If the target server has
> already been converted to logical replica, the DDL commands must
> not be replicated so an error would occur.

Yeah, they would not be replicated, but that would not lead to an ERROR. So should we say something like "Creating tables on the source server..."?

> 5) slot / publication / subscription name
>
> I find it somewhat annoying it's not possible to specify names for
> objects created by the tool - replication slots, publication and
> subscriptions. If this is meant to be a replica running for a while,
> after a while I'll have no idea what pg_createsubscriber_569853 or
> pg_createsubscriber_459548_2348239 was meant for.
>
> This is particularly annoying because renaming these objects later is
> either not supported at all (e.g. for replication slots), or may be
> quite difficult (e.g. publications).
>
> I do realize there are challenges with custom names (say, if there are
> multiple databases to replicate), but can't we support some simple
> formatting with basic placeholders? So we could specify
>
> --slot-name "myslot_%d_%p"
>
> or something like that?

I'm not sure we can do this in the first version, but it looks nice. One concern is that I cannot find any applications that accept escape strings the way log_line_prefix does. (That may just be because there has been no use case.) Do you know of any examples?

> BTW what will happen if we convert multiple standbys? Can't they all get
> the same slot name (they all have the same database OID, and I'm not
> sure how much entropy the PID has)?

I tested this, and the second attempt did not work. The primary reason was the publication name, pg_createsubscriber_%u (oid). FYI, previously the same publication could be reused, but that feature was removed based on my comment [2]. That may have been too optimistic.

[1]: https://www.postgresql.org/message-id/TY3PR01MB9889CCBD4D9DAF8BD2F18541F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/message-id/TYCPR01MB12077756323B79042F29DDAEDF54C2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/