On Mon, Apr 29, 2024 at 5:28 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote: > > On Monday, April 29, 2024 5:11 PM shveta malik <shveta.ma...@gmail.com> wrote: > > > > On Mon, Apr 29, 2024 at 11:38 AM shveta malik <shveta.ma...@gmail.com> > > wrote: > > > > > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > > > <houzj.f...@fujitsu.com> wrote: > > > > > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > > <bertranddrouvot...@gmail.com> wrote: > > > > > > > > > > Hi, > > > > > > > > > > On Thu, Mar 14, 2024 at 02:22:44AM +0000, Zhijie Hou (Fujitsu) wrote: > > > > > > Hi, > > > > > > > > > > > > Since the standby_slot_names patch has been committed, I am > > > > > > attaching the last doc patch for review. > > > > > > > > > > > > > > > > Thanks! > > > > > > > > > > 1 === > > > > > > > > > > + continue subscribing to publications now on the new primary > > > > > + server > > > > > without > > > > > + any data loss. > > > > > > > > > > I think "without any data loss" should be re-worded in this > > > > > context. Data loss in the sense "data committed on the primary and > > > > > not visible on the subscriber in case of failover" can still occurs > > > > > (in case > > synchronous replication is not used). > > > > > > > > > > 2 === > > > > > > > > > > + If the result (<literal>failover_ready</literal>) of both above > > > > > steps is > > > > > + true, existing subscriptions will be able to continue without data > > loss. > > > > > + </para> > > > > > > > > > > I don't think that's true if synchronous replication is not used. > > > > > Say, > > > > > > > > > > - synchronous replication is not used > > > > > - primary is not able to reach the standby anymore and > > > > > standby_slot_names is set > > > > > - new data is inserted into the primary > > > > > - then not replicated to subscriber (due to standby_slot_names) > > > > > > > > > > Then I think the both above steps will return true but data would > > > > > be lost in case of failover. > > > > > > > > Thanks for the comments, attach the new version patch which reworded > > > > the above places. > > > > > > Thanks for the patch. > > > > > > Few comments: > > > > > > 1) Tested the steps, one of the queries still refers to > > > 'conflict_reason'. I think it should refer 'conflicting'. > > Thanks for catching this. Fixed. > > > > > > > 2) Will it be good to mention that in case of planned promotion, it is > > > recommended to run pg_sync_replication_slots() as last sync attempt > > > before we run failvoer-ready validation steps? This can be mentioned > > > in high-availaibility.sgml of current patch > > > > I recall now that with the latest fix, we cannot run > > pg_sync_replication_slots() unless we disable the slot-sync worker. > > Considering that, I think it will be too many steps just to run the SQL > > function at > > the end without much value added. Thus we can skip this point, we can rely > > on > > slot sync worker completely. > > Agreed. I didn't change this. > > Here is the V3 doc patch.
Thanks for the patch. It will be good if 1a can produce quoted slot-names list as output, which can be used directly in step 1b's query; otherwise, it is little inconvenient to give input to 1b if the number of slots are huge. User needs to manually quote each slot-name. Other than this, the patch looks good to me. thanks Shveta