On Wed, Feb 21, 2024, at 5:00 AM, Shlok Kyal wrote:
> I found some issues and fixed them with top-up patches v23-0012 and
> v23-0013.
> 1.
> Suppose there is a cascade physical replication node1->node2->node3.
> Now if we run pg_createsubscriber with node1 as primary and node2 as
> standby, pg_createsubscriber will be successful but the connection
> between node2 and node3 will not be retained and the log of node3 will
> give error:
> 2024-02-20 12:32:12.340 IST [277664] FATAL:  database system identifier differs between the primary and standby
> 2024-02-20 12:32:12.340 IST [277664] DETAIL:  The primary's identifier is 7337575856950914038, the standby's identifier is 7337575783125171076.
> 2024-02-20 12:32:12.341 IST [277491] LOG:  waiting for WAL to become available at 0/3000F10
> 
> To fix this, I prevent pg_createsubscriber from running if the standby
> node is a primary for any other server.
> I made the change in the v23-0012 patch.

IIRC we already discussed the cascading replication scenario. Of course,
breaking a node is not good; that's why you proposed v23-0012. However,
preventing pg_createsubscriber from running if there are standbys attached to
the target server is also annoying. If you don't have access to these hosts,
you need to (a) kill the walsender (very fragile / unstable), (b) start with
max_wal_senders = 0 or (c) add a firewall rule to prevent these hosts from
establishing a connection to the target server. I wouldn't like to include the
patch as-is. IMO we need at least one message explaining the situation to the
user, I mean, add a hint message. I'm resistant to a new option but probably a
--force option is an answer. There is no test coverage for it. I adjusted this
patch (didn't include the --force option) and added a test case.
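
To make the trade-off concrete, here is a rough, standalone sketch of the
decision being discussed. It is not code from any patch: the function name,
the message wording, and the force flag are all hypothetical, and the number
of attached standbys is assumed to have been obtained elsewhere (for example,
by counting walsender connections on the target server).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): decide whether the target
 * server may be converted, given how many standbys are attached to it.
 * The "force" flag models the --force option that was only floated in
 * the discussion and is not implemented.
 */
bool
can_convert_target(int attached_standbys, bool force)
{
	assert(attached_standbys >= 0);

	if (attached_standbys > 0 && !force)
	{
		/* Illustrative wording only; the real messages may differ. */
		fprintf(stderr,
				"error: target server has %d attached standby server(s)\n",
				attached_standbys);
		fprintf(stderr,
				"hint: detach the standby servers from the target server before converting it\n");
		return false;
	}

	return true;
}
```

With this shape, a target with no attached standbys passes, a target with
attached standbys is refused with a hint, and the hypothetical force flag
overrides the refusal.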

> 2.
> While checking 'max_replication_slots' in 'check_publisher' function,
> we are not considering the temporary slot in the check:
> +   if (max_repslots - cur_repslots < num_dbs)
> +   {
> +       pg_log_error("publisher requires %d replication slots, but only %d remain",
> +                    num_dbs, max_repslots - cur_repslots);
> +       pg_log_error_hint("Consider increasing max_replication_slots to at least %d.",
> +                         cur_repslots + num_dbs);
> +       return false;
> +   }
> Fixed this in v23-0013

Good catch!
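
As an illustration of the point above, a minimal standalone sketch of the
check with the temporary slot taken into account might look like the
following. This is not the v23-0013 change itself: it assumes exactly one
temporary slot is needed on the publisher, uses fprintf in place of
pg_log_error/pg_log_error_hint so it compiles on its own, and the
NUM_TEMP_SLOTS constant and function name are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption for this sketch: one temporary slot besides one per database. */
#define NUM_TEMP_SLOTS 1

/*
 * Return true if the publisher has enough free replication slots for
 * num_dbs databases plus the temporary slot.
 */
bool
has_enough_replication_slots(int max_repslots, int cur_repslots, int num_dbs)
{
	int			required;
	int			remaining;

	assert(max_repslots >= 0 && cur_repslots >= 0 && num_dbs >= 0);

	required = num_dbs + NUM_TEMP_SLOTS;
	remaining = max_repslots - cur_repslots;

	if (remaining < required)
	{
		fprintf(stderr,
				"publisher requires %d replication slots, but only %d remain\n",
				required, remaining);
		fprintf(stderr,
				"Consider increasing max_replication_slots to at least %d.\n",
				cur_repslots + required);
		return false;
	}

	return true;
}
```

The only difference from the quoted hunk is that the comparison and the hint
use num_dbs plus the temporary slot instead of num_dbs alone, so a publisher
with exactly num_dbs free slots is rejected up front.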

Both are included in the next patch.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
