Hackers,

Logical replication apply workers for a subscription can easily get stuck in an 
infinite loop of attempting to apply a change, triggering an error (such as a 
constraint violation), exiting with an error written to the subscription worker 
log, and restarting.

As things currently stand, only superusers can create subscriptions.  Ongoing 
work to delegate superuser tasks to non-superusers creates the potential for 
even more errors, specifically errors where the apply worker does not have 
permission to make changes to the target table.

The attached patch makes it possible to create a subscription using a new 
subscription_parameter, "disable_on_error".  When it is set, the apply worker 
catches errors and automatically disables the subscription rather than going 
into an infinite loop.  The new parameter defaults to false.  When false, the 
PG_TRY/PG_CATCH overhead is avoided, so for subscriptions which do not use the 
feature, there shouldn't be any change in behavior.  After fixing the 
underlying issue, users can manually clear the error by re-enabling the 
subscription with an ALTER SUBSCRIPTION .. ENABLE command.
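
To make the intended control flow concrete, here is a toy, standalone C model. 
It is not code from the patch.  All names (ToySubscription, toy_apply_change, 
toy_worker_run, toy_launcher) are hypothetical, errors are modeled with return 
values rather than PG_TRY/PG_CATCH, and max_starts exists only to bound what 
would otherwise be an endless restart loop.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a subscription; not PostgreSQL's catalog layout. */
typedef struct
{
    bool enabled;
    bool disable_on_error;      /* the proposed parameter, defaults to false */
} ToySubscription;

/* Hypothetical: a negative "change" fails, like a constraint violation. */
static bool
toy_apply_change(int change)
{
    return change >= 0;
}

/*
 * One run of a toy apply worker.  Returns true if every change was applied,
 * false if the worker stopped on an error.
 */
static bool
toy_worker_run(ToySubscription *sub, const int *changes, int nchanges)
{
    for (int i = 0; i < nchanges; i++)
    {
        if (!toy_apply_change(changes[i]))
        {
            /* With the option set, disable the subscription before exiting. */
            if (sub->disable_on_error)
                sub->enabled = false;
            return false;
        }
    }
    return true;
}

/*
 * Toy launcher: keep restarting the worker while the subscription is
 * enabled and the worker keeps failing.  Returns the number of starts.
 */
static int
toy_launcher(ToySubscription *sub, const int *changes, int nchanges,
             int max_starts)
{
    int starts = 0;

    while (sub->enabled && starts < max_starts)
    {
        starts++;
        if (toy_worker_run(sub, changes, nchanges))
            break;              /* caught up, nothing left to retry */
    }
    return starts;
}
```

With a change list containing one failing change and max_starts = 5, the toy 
launcher starts the worker five times when disable_on_error is false, but only 
once when it is true, after which the subscription is left disabled.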
 
In addition to helping on production systems, this makes writing TAP tests 
involving error conditions simpler.  I was originally motivated to write this 
patch by frustration that TAP tests needed to parse the apply worker log file 
to determine whether permission failures were occurring and what they were.  
It was also obnoxiously easy to have a test hang waiting for a permanently 
stuck subscription to catch up.  This helps with both issues.

I don't think this is quite ready for commit, but I'd like feedback on whether 
folks like this idea or want to suggest design changes.

Attachment: v1-0001-Optionally-disabling-subscriptions-on-error.patch


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


