Hi Avinash, Thank you for the detailed explanation.
Indexes were dropped on the destination to increase initial data load speed. We cannot stop the App on source and it is highly transactional. I had thought about this method but I am not sure after the pg_restore from where the logical replication will be started, we cannot afford to lose any data. I will give this method a test though and check how it works. Thanks, Nikhil On Thu, Aug 5, 2021 at 8:42 PM Avinash Kumar <avinash.vallar...@gmail.com> wrote: > Hi, > > On Thu, Aug 5, 2021 at 11:28 AM Vijaykumar Jain < > vijaykumarjain.git...@gmail.com> wrote: > >> On Thu, 5 Aug 2021 at 10:27, Nikhil Shetty <nikhil.db...@gmail.com> >> wrote: >> >>> Hi, >>> >>> Thank you for the suggestion. >>> >>> We tried by dropping indexes and it worked faster compared to what we >>> saw earlier. We wanted to know if anybody has done any other changes that >>> helps speed-up initial data load without dropping indexes. >>> >>> >> You could leverage pg_basbeackup or pg_dump with parallel jobs > taken from a Standby (preferably replication paused if pg_dump, anyways > pg_basebackup should be straight-forward) or taken even from > Primary, for the purpose of initial data load. > > As you are able to drop indexes and make some schema changes, I would > assume that you could pause your app temporarily. If that's the case > you may look into the simple steps i am posting here that demonstrates > pg_dump/pg_restore instead. > > If you cannot pause the app, then, you could look into how you > could use pg_replication_origin_advance > <https://www.postgresql.org/docs/13/functions-admin.html#PG-REPLICATION-ORIGIN-ADVANCE> > > > Step 1 : Pause App > Step 2 : Create Publication on the Primary CREATE PUBLICATION > <some_pub_name> FOR ALL TABLES; > Step 3 : Create Logical Replication Slot on the Primary SELECT * FROM > pg_create_logical_replication_slot('<some_slot_name>', 'pgoutput'); Step > 4 : Create Subscription but do not enable the Subscription > CREATE SUBSCRIPTION <some_sub_name> CONNECTION > 'host=<some_host> dbname=<some_db> user=postgres > password=secret port=5432' PUBLICATION <some_pub_name> > WITH (copy_data = false, create_slot=false, enabled=false, > slot_name=<some_slot_name>); > > Step 5 : Initiate pg_dump. We can take a parallel backup for a faster > restore. > > $ pg_dump -d <some_db> -Fd -j 4 -n <some_schema> -f <some_unique_directory> > -- If its several hundreds of GBs or TBs, you may rather utilize one of > your Standby that has been paused from replication using -> select > pg_wal_replay_pause(); > > Step 6 : Don't need to wait until pg_dump completes, you may start the > App. > -- Hope the app does not perform changes that impact the pg_dump or > gets blocked due to pg_dump. > Step 7 : Restore the dump if you used pg_dump. > pg_restore -d <some_db> -j <some_numer_of_parallel_jobs> <some_directory> Step > 8 : Enable subscription. > ALTER SUBSCRIPTION <some_sub_name> ENABLE; > > If you have not stopped your app then you must advance the lsn using > pg_replication_origin_advance > <https://www.postgresql.org/docs/13/functions-admin.html#PG-REPLICATION-ORIGIN-ADVANCE> > > > These are all hand-written steps while drafting this email, so, > please test it on your end as some typos or adjustments are definitely > expected. > > PS: i have not tested this in production level loads, it was just some exp >> i did on my laptop. >> >> one option would be to use pglogical extension (this was shared by >> Dharmendra in one the previous mails, sharing the same), >> and then use pglogical_create_subscriber cli to create the initial copy >> via pgbasebackup and then carry on from there. >> I ran the test case similar to one below in my local env, and it seems to >> work fine. of course i do not have TB worth of load to test, but it looks >> promising, >> especially since they introduced it to the core. >> pglogical/010_pglogical_create_subscriber.pl at REL2_x_STABLE · >> 2ndQuadrant/pglogical (github.com) >> <https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/t/010_pglogical_create_subscriber.pl> >> Once you attain some reasonable sync state, you can drop the pglogical >> extension, and check if things continue fine. >> I have done something similar when upgrading from 9.6 to 11 using >> pglogical and then dropping the extension and it was smooth, >> maybe you need to try this out and share if things works fine. >> and >> The 1-2-3 for PostgreSQL Logical Replication Using an RDS Snapshot - >> Percona Database Performance Blog >> <https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/> >> >> > > -- > Regards, > Avinash Vallarapu (Avi) > CEO, > MigOps, Inc. >