Hi,
On Friday, February 3, 2023 3:35 PM I wrote: > On Friday, February 3, 2023 2:21 PM Amit Kapila <amit.kapil...@gmail.com> > wrote: > > On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2...@gmail.com> > wrote: > > > On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu) > > > <osumi.takami...@fujitsu.com> wrote: > > > ... > > > > > Besides, I am not sure it's a stable test to check the log. Is > > > > > it possible that there's no such log on a slow machine? I > > > > > modified the code to sleep 1s at the beginning of > > > > > apply_dispatch(), then the new added test failed because the server > log cannot match. > > > > To get the log by itself is necessary to ensure that the delay is > > > > conducted by the apply worker, because we emit the diffms only if > > > > it's bigger than 0 in maybe_apply_delay(). If we omit the step, we > > > > are not sure the delay is caused by other reasons or the > > > > time-delayed > > feature. > > > > > > > > As you mentioned, it's possible that no log is emitted on slow > > > > machine. Then, the idea to make the test safer for such machines > > > > should > > be to make the delayed time longer. > > > > But we shortened the delay time to 1 second to mitigate the long > > > > test > > execution time of this TAP test. > > > > So, I'm not sure if it's a good idea to make it longer again. > > > > > > I think there are a couple of things that can be done about this problem: > > > > > > 1. If you need the code/test to remain as-is then at least the test > > > message could include some comforting text like "(this can fail on > > > slow machines when the delay time is already exceeded)" so then a > > > test failure will not cause undue alarm. > > > > > > 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so > > > that it will *always* log the remaining wait time even if that wait > > > time becomes negative. Then I think the test cases can be made > > > deterministic instead of relying on good luck. This seems like the > > > better option. > > > > > > > I don't understand why we have to do any of this instead of using 3s > > as min_apply_delay similar to what we are doing in > > src/test/recovery/t/005_replay_delay. Also, I think we should use > > exactly the same way to verify the test even though we want to keep > > the log level as > > DEBUG2 to check logs in case of any failures. > OK, will try to make our tests similar to the tests in 005_replay_delay as > much > as possible. I've updated the TAP test and made it aligned with 005_reply_delay.pl. For coverage, I have the stream of in-progress transaction test case and ALTER SUBSCRIPTION DISABLE behavior, which is unique to logical replication. Also, conducted pgindent and pgperltidy. Note that the latter half of the 005_reply_delay.pl doesn't seem to match with the test for time-delayed logical replication (e.g. promotion). So, I don't have those points. Kindly have a look at the attached v27. Best Regards, Takamichi Osumi
v27-0001-Time-delayed-logical-replication-subscriber.patch
Description: v27-0001-Time-delayed-logical-replication-subscriber.patch