The attached proposed patch changes the TAP test infrastructure's
poll_query_until function in two ways:
1. An optional argument is added to allow specifying the query result
value we're waiting for, overriding the normal "t". This allows
folding a handwritten delay loop in 007_sync_rep.pl into the
poll_query_until ecosystem. As far as I've found, there are no other
handwritten delay loops in the TAP tests.
2. poll_query_until is modified to probe 10X per second not once
per second, in keeping with the changes I've been making elsewhere
to remove not-carefully-analyzed 1s delays in the regression tests.
On my workstation, the reduced probe delay shaves a useful amount
of time off the recovery and subscription regression tests. I also
tried it on dromedary, which is about the slowest hardware I'd care
to run the TAP tests on regularly, and it seems to be about a wash
there --- some tests get faster, but some get slower, presumably due
to more overhead from the probe queries.
I notice that buildfarm member skink (which runs with valgrind)
recently failed a test run due to poll_query_until timing out:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-06-30%2000%3A50%3A01
I'm inclined to respond to that either by increasing the fixed
180-second timeout, or by making it configurable from an environment
variable (which Andres would then have to add to skink's configuration).
Thoughts?
regards, tom lane
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index f4fa600..4346423 100644
*** a/src/test/perl/PostgresNode.pm
--- b/src/test/perl/PostgresNode.pm
*** sub psql
*** 1213,1248
=pod
! =item $node->poll_query_until(dbname, query)
! Run a query once a second, until it returns 't' (i.e. SQL boolean true).
! Continues polling if psql returns an error result. Times out after 180 seconds.
=cut
sub poll_query_until
{
! my ($self, $dbname, $query) = @_;
! my $max_attempts = 180;
! my $attempts = 0;
my ($stdout, $stderr);
while ($attempts < $max_attempts)
{
- my $cmd =
- [ 'psql', '-XAt', '-c', $query, '-d', $self->connstr($dbname) ];
my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
chomp($stdout);
$stdout =~ s/\r//g if $TestLib::windows_os;
! if ($stdout eq "t")
{
return 1;
}
! # Wait a second before retrying.
! sleep 1;
$attempts++;
}
--- 1213,1255
=pod
! =item $node->poll_query_until($dbname, $query [, $expected ])
! Run B<$query> repeatedly, until it returns the B<$expected> result
! ('t', or SQL boolean true, by default).
! Continues polling if B returns an error result.
! Times out after 180 seconds.
! Returns 1 if successful, 0 if timed out.
=cut
sub poll_query_until
{
! my ($self, $dbname, $query, $expected) = @_;
! $expected = 't' unless defined($expected); # default value
!
! my $cmd =
! [ 'psql', '-XAt', '-c', $query, '-d', $self->connstr($dbname) ];
my ($stdout, $stderr);
+ my $max_attempts = 180 * 10;
+ my $attempts = 0;
while ($attempts < $max_attempts)
{
my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
chomp($stdout);
$stdout =~ s/\r//g if $TestLib::windows_os;
!
! if ($stdout eq $expected)
{
return 1;
}
! # Wait 0.1 second before retrying.
! select undef, undef, undef, 0.1;
!
$attempts++;
}
diff --git a/src/test/recovery/t/007_sync_rep.pl b/src/test/recovery/t/007_sync_rep.pl
index 8e3cc5e..0f999f0 100644
*** a/src/test/recovery/t/007_sync_rep.pl
--- b/src/test/recovery/t/007_sync_rep.pl
*** use Test::More tests => 11;
*** 9,15
my $check_sql =
"SELECT application_name, sync_priority, sync_state FROM pg_stat_replication ORDER BY application_name;";
! # Check that sync_state of each standby is expected.
# If $setting is given, synchronous_standby_names is set to it and
# the configuration file is reloaded before the test.
sub test_sync_state
--- 9,15
my $check_sql =
"SELECT application_name, sync_priority, sync_state FROM pg_stat_replication ORDER BY application_name;";
! # Check that sync_state of each standby is expected (waiting till it is).
# If $setting is given, synchronous_standby_names is set to it and
# the configuration file is reloaded before the test.
sub test_sync_state
*** sub test_sync_state
*** 23,46
$self->reload;
}
! my $timeout_max = 30;
! my $timeout = 0;
! my $result;
!
! # A reload may take some time to take effect on busy machines,
! # hence use a loop with a timeout to give some room for the test
! # to pass.
! while ($timeout < $timeout_max)
! {
! $result = $self->safe_psql('postgres', $check_sql);
!
! last if ($result eq $expected);
!
! $timeout++;
! sleep 1;
! }
!
! is($result, $expected, $msg);
}
# Initialize master node
--- 23,29
$self->rel