On Tue, Mar 21, 2023 at 8:18 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Mar 21, 2023 at 7:32 AM Euler Taveira <eu...@eulerto.com> wrote:
> >
> > On Mon, Mar 20, 2023, at 10:10 PM, Kumar, Sachin wrote:
> > >
> > > From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
> > > Subject: RE: [EXTERNAL]Initial Schema Sync for Logical Replication
> > > On 2023-Mar-15, Kumar, Sachin wrote:
> > >
> > > > 1. In CreateSubscription(), when we create the replication
> > > > slot (walrcv_create_slot()), we should use CRS_EXPORT_SNAPSHOT, so
> > > > that we can use this snapshot later in the pg_dump.
> > > >
> > > > 2. Now we can call pg_dump with the above snapshot from CreateSubscription.
> > >
> > > Overall I'm not on board with the idea that logical replication would
> > > depend on pg_dump; that seems like it could run into all sorts of
> > > trouble (what if calling external binaries requires additional security
> > > setup? what about pg_hba connection requirements? what about
> > > max_connections in tight circumstances?).
> >
> > > what if calling external binaries requires additional security setup
> > I am not sure what kind of security restriction would apply in this
> > case; maybe the pg_dump binary can be changed?
> >
> > Using pg_dump as part of this implementation is not acceptable because we
> > expect the backend to be decoupled from the client. Besides that, pg_dump
> > provides all table dependencies (such as tablespaces, privileges, security
> > labels, comments); not all dependencies should be replicated.
>
> I agree that in the initial version we may not support sync of all
> objects, but why shouldn't that be possible in later versions?
>
> > You should exclude them by removing these objects from the TOC before
> > running pg_restore, or by adding a few pg_dump options to exclude these
> > objects. Another issue is related to different versions.
> > Let's say the publisher has a version ahead of the subscriber version;
> > a new table syntax can easily break your logical replication setup. IMO
> > pg_dump doesn't seem like a good solution for initial synchronization.
> >
> > Instead, the backend should provide infrastructure to obtain the
> > required DDL commands for the specific (set of) tables. This can work
> > around the issues from the previous paragraph:
> >
> ...
> > * don't need to worry about different versions.
>
> AFAICU, some of the reasons why pg_dump is not allowed to dump from a
> newer version are as follows: (a) there could be more columns in the
> newer version of the system catalog, and then SELECT *-style queries
> won't work because the client has no knowledge of the additional
> columns; (b) the newer version could have new features (represented by,
> say, new columns in existing catalogs or new catalogs) that the older
> version of pg_dump has no knowledge of, so it will fail to get that data
> and hence produce an inconsistent dump. The subscriber will easily get
> out of sync due to that.
>
> Now, how do we avoid these problems even if we have our own version of
> functionality similar to pg_dump for selected objects? I guess we will
> face similar problems.
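(As an aside, for anyone wanting to try the pg_dump-based approach by hand: the two steps proposed upthread amount to roughly the sketch below. The slot, database, and snapshot names are made up for illustration; CREATE_REPLICATION_SLOT ... EXPORT_SNAPSHOT and pg_dump's --snapshot option are existing features.)

```shell
# Rough sketch of the flow proposed upthread (slot/database names are
# hypothetical; the snapshot name is whatever CREATE_REPLICATION_SLOT
# actually returns).

# 1. On a replication connection, create a logical slot and export its
#    snapshot.  The result set includes a snapshot name such as
#    "00000003-00000002-1".
psql "dbname=src replication=database" \
     -c "CREATE_REPLICATION_SLOT sub_slot LOGICAL pgoutput EXPORT_SNAPSHOT;"

# 2. While that replication connection stays open, run a schema-only dump
#    using the exported snapshot, so the dumped definitions are consistent
#    with the slot's start point, and restore on the subscriber side.
pg_dump --schema-only --snapshot=00000003-00000002-1 src | psql dst
```

Note that the exported snapshot is only usable while the originating replication connection remains open, which is one more operational constraint an in-backend mechanism would avoid.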
Right. I think that such functionality needs to return DDL commands that
can be executed on the requested version.

> If so, we may need to deny schema sync in any such case.

Yes. Do we have any concrete use case where the subscriber is an older
version, in the first place?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com