On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger <markdil...@yahoo.com> wrote:
> I am building a regression test system for replication and came across
> this email thread.  I have gotten pretty far into my implementation, but
> would be happy to make modifications if folks have improvements to
> suggest.  If the community likes my design, or a modified version based
> on your feedback, I'd be happy to submit a patch.
Yeah, this would be nice to look at; the core code definitely needs more
infrastructure for such a test suite. I haven't had time to get back to
it since I began this thread though :)

> Currently I am cannibalizing src/test/regress/pg_regress.c, but that could instead
> be copied to src/test/pg_regress_replication.c or whatever.  The
> test creates and configures multiple database clusters, sets up the
> replication configuration for them, runs them each in nonprivileged mode
> and bound to different ports, feeds all the existing 141 regression tests
> into the master database with the usual checking that all the right
> results are obtained, and then checks that the standbys have the expected
> data.  This is possible all on one system because the database clusters
> are chroot'ed to see their own /data directory and not the /data directory
> of the other chroot'ed clusters, although the rest of the system, such
> as /etc and /dev, is bind mounted and visible to each cluster.
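The one-system layout described above, minus the chroot step (which is
the part that needs root), could be sketched like this; the directory
names and port numbers are illustrative, not taken from the actual patch:

```python
import os
import tempfile

def layout_clusters(n, base_port=5432):
    """Create one data directory per cluster under a throwaway root and
    give each cluster its own port, as the one-system design requires."""
    root = tempfile.mkdtemp(prefix="pg_repl_test_")
    clusters = []
    for i in range(1, n + 1):
        datadir = os.path.join(root, "node%d" % i, "data")
        os.makedirs(datadir)
        clusters.append({"name": "node%d" % i,
                         "datadir": datadir,
                         "port": base_port + i})
    return clusters

# A master plus two standbys, all on one machine:
for c in layout_clusters(3):
    print(c["name"], c["port"])
```

Each cluster would then be initdb'ed into its datadir and started
nonprivileged on its own port; only the chroot isolation on top of this
requires root.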
Having vanilla regression tests run in a cluster with multiple nodes and
checking the results on a standby is just the tip of the iceberg though.
What I had in mind when I began this thread was more than a copy/paste of
pg_regress: an infrastructure that people could use to create and
customize tests through an additional control layer on the cluster
itself. For example, testing replication is not only a matter of creating
and setting up the nodes; you might also want to be able to initialize,
add, and remove nodes during the tests. A newly added node would be
either a fresh master (this would be damn useful for a logical
replication test suite, I think) or a slave node with custom recovery
parameters to test streaming replication, as well as PITR, archiving,
etc. Then you need to be able to run SQL commands on top of that to check
whether the results are consistent with what you expect.

A possible input for a test that users could provide would be something
like this:
# Node information for tests
    {node1, postgresql.conf params, recovery.conf params}
    {node2, postgresql.conf params, recovery.conf params, slave of node1}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

The main problem is actually how to do that. A smart shell-based
infrastructure would be simple and would facilitate (?) the maintenance
of the code used to run the tests. A C program, on the contrary, would
make that runner code harder (?) to maintain, in exchange for more
readable test-suite input like the one I wrote above, in the shape of
what is already available in src/test/isolation.

Another possibility would be to integrate a recovery/backup manager
directly into PG core and have some tests for it, or even include those
tests with pg_basebackup or an upper layer on top of it.

> There of course is room to add as many replication tests as you like,
> and the main 141 tests fed into the master could be extended to feed
> more data and such.
> The main drawbacks that I don't care for are:
> 1) 'make check' becomes 'sudo make check' because it needs permission
> to run chroot.
-1 for that; developers should not need to use root to run the regression suite.

> 2) I have no win32 version of the logic
For a first shot I am not sure that it matters much.

> The main advantages that I like about this design are:
> 1) Only one system is required.  The developer does not need network
> access to a second replication system.  Moreover, multiple database
> clusters can be established with interesting replication hierarchies
> among them, and the cost of each additional cluster is just another
> chroot environment.
An assumption of the test suite, I think, is that it should let
developers check for bugs on a local server only. This simplifies how the
test suite is written, and you don't need to get into things like VM
setup or cross-machine tests, which frameworks like Jenkins already
handle nicely. What I think people would like to have is simply:
cd src/test/replication && make check/installcheck
and have the tests run for them.

