Here are some review comments for v20-0004: (This completes my reviews of the v20* patch set. Sorry, the reviews are time consuming, so I am lagging slightly behind the latest posted version)
====== 1. doc/src/sgml/ref/create_subscription.sgml @@ -245,6 +245,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl also be the unique column on the publisher-side; 2) there cannot be any non-immutable functions used by the subscriber-side replicated table. + When applying a streaming transaction, if either requirement is not + met, the background worker will exit with an error. + The <literal>parallel</literal> mode is disregarded when retrying; + instead the transaction will be applied using <literal>on</literal> + mode. </para> The "on mode" still sounds strange to me. Maybe it's just my personal opinion, but I don’t really consider 'on' and 'off' to be "modes". Anyway I already posted the same comment several times before [1, #4.3]. Let's see what others think. SUGGESTION "using on mode" -> "using streaming = on" ====== 2. src/backend/replication/logical/worker.c - start_table_sync @@ -3902,20 +3925,28 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname) } PG_CATCH(); { + /* + * Emit the error message, and recover from the error state to an idle + * state + */ + HOLD_INTERRUPTS(); + + EmitErrorReport(); + AbortOutOfAnyTransaction(); + FlushErrorState(); + + RESUME_INTERRUPTS(); + + /* Report the worker failed during table synchronization */ + pgstat_report_subscription_error(MySubscription->oid, false); + if (MySubscription->disableonerr) - DisableSubscriptionAndExit(); - else - { - /* - * Report the worker failed during table synchronization. Abort - * the current transaction so that the stats message is sent in an - * idle state. - */ - AbortOutOfAnyTransaction(); - pgstat_report_subscription_error(MySubscription->oid, false); + DisableSubscriptionOnError(); - PG_RE_THROW(); - } + /* Set the retry flag. */ + set_subscription_retry(true); + + proc_exit(0); } PG_END_TRY(); Perhaps current code is OK, but I am not 100% sure if we should set the retry flag when the disable_on_error is set, because the subscription is not going to be retried (because it is disabled). And later, if/when the user does enable the subscription, presumably that will be after they have already addressed the problem that caused the error/disablement in the first place. ~~~ 3. src/backend/replication/logical/worker.c - start_apply PG_CATCH(); { + /* + * Emit the error message, and recover from the error state to an idle + * state + */ + HOLD_INTERRUPTS(); + + EmitErrorReport(); + AbortOutOfAnyTransaction(); + FlushErrorState(); + + RESUME_INTERRUPTS(); + + /* Report the worker failed while applying changes */ + pgstat_report_subscription_error(MySubscription->oid, + !am_tablesync_worker()); + if (MySubscription->disableonerr) - DisableSubscriptionAndExit(); - else - { - /* - * Report the worker failed while applying changes. Abort the - * current transaction so that the stats message is sent in an - * idle state. - */ - AbortOutOfAnyTransaction(); - pgstat_report_subscription_error(MySubscription->oid, !am_tablesync_worker()); + DisableSubscriptionOnError(); - PG_RE_THROW(); - } + /* Set the retry flag. */ + set_subscription_retry(true); } PG_END_TRY(); } 3a. Same comment as #2 3b. This PG_CATCH used to leave by either proc_exit(0) or PG_RE_THROW but what does it do now? My first impression is there is a bug here due to some missing code, because AFAICT the exception is caught and gobbled up and then what...? ~~~ 4. src/backend/replication/logical/worker.c - set_subscription_retry + if (MySubscription->retry == retry || + am_apply_bgworker()) + return; 4a. I this this quick exit can be split and given some appropriate comments SUGGESTION (for example) /* Fast path - if no state change then nothing to do */ if (MySubscription->retry == retry) return; /* Fast path - skip for apply background workers */ if (am_apply_bgworker()) return; ====== 5. .../subscription/t/032_streaming_apply.pl @@ -78,9 +78,13 @@ my $timer = IPC::Run::timeout($PostgreSQL::Test::Utils::timeout_default); my $h = $node_publisher->background_psql('postgres', \$in, \$out, $timer, on_error_stop => 0); +# ============================================================================ All those comment highlighting lines like "# ==============" really belong in the earlier patch (0003 ?) when this TAP test file was introduced. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPvrw%2BtgCEYGxv%2BnKrqg-zbJdYEXee6o4irPAsYoXcuUcw%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia