I agree that merely setting up masters and slaves is
the tip of the iceberg.  It seems to be what needs
to be tackled first, though, because until we have
a common framework, we cannot all contribute
tests to it.

I imagine setting up a whole hierarchy of master,
hot standbys, warm standbys, etc., and having,
over the course of the test, base backups made,
new clusters spun up from those backups,
masters stopped and standbys promoted to
master, and so on.

But I also imagine there needs to be SQL run
on the master that changes the data, so that
replication of those changes can be confirmed.
There are lots of ways to change data, such as
through the large object interface.  The current
'make check' test suite exercises all those
code paths.  If we incorporate them into our
replication testing suite, then we get the
advantage of knowing that all those paths are
being tested in our suite as well.  And if some
new interface, call it huge object, ever gets
made, then there should be a hugeobject.sql
in src/test/regress/sql, and we automatically
get that in our replication tests.
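To make that automatic pickup concrete: a harness could derive its schedule by globbing src/test/regress/sql at run time rather than hard-coding test names, so a new hugeobject.sql is included with no harness changes. A minimal sketch (the directory layout and file names here are stand-ins, not the real regress tree):

```python
import tempfile
from pathlib import Path

def build_schedule(sql_dir: Path) -> list[str]:
    """Derive the replication test schedule from whatever .sql files
    exist, so a newly added test is picked up automatically."""
    return sorted(p.stem for p in sql_dir.glob("*.sql"))

# Stand-in for src/test/regress/sql: a scratch directory with a few tests.
sql_dir = Path(tempfile.mkdtemp())
for name in ("largeobject", "lo", "hugeobject"):
    (sql_dir / f"{name}.sql").write_text("-- placeholder test\n")

schedule = build_schedule(sql_dir)
print(schedule)  # hugeobject appears without touching the harness
```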


On Sunday, January 5, 2014 6:13 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:

On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger <markdil...@yahoo.com> wrote:
> I am building a regression test system for replication and came across
> this email thread.  I have gotten pretty far into my implementation, but
> would be happy to make modifications if folks have improvements to
> suggest.  If the community likes my design, or a modified version based
> on your feedback, I'd be happy to submit a patch.
Yeah, this would be nice to look at; core code definitely needs more 
infrastructure for such a test suite. I haven't had time to get back to it 
since I began this thread, though :)

> Currently I am cannibalizing src/test/pg_regress.c, but that could instead
> be copied to src/test/pg_regress_replication.c or whatever.  The regression
> test creates and configures multiple database clusters, sets up the
> replication configuration for them, runs them each in nonprivileged mode
> and bound to different ports, feeds all the existing 141 regression tests
> into the master database with the usual checking that all the right results
> are obtained, and then checks that the standbys have the expected
> data.  This is possible all on one system because the database clusters
> are chroot'ed to see their own /data directory and not the /data directory
> of the other chroot'ed clusters, although the rest of the system, like /bin
> and /etc and /dev are all bind mounted and visible to each cluster.
Having the vanilla regression tests run in a cluster with multiple nodes and 
checking the results on a standby is only the tip of the iceberg, though. What 
I had in mind when I began this thread was not just a copy/paste of pg_regress, 
but an infrastructure that people could use to create and customize tests, with 
an additional control layer on the cluster itself. For example, testing 
replication is not only a matter of creating and setting up the nodes: you 
might want to be able to initialize, add, and remove nodes during the tests. An 
added node could be either a fresh new master (this would be damn useful for a 
logical replication test suite, I think) or a slave node with custom recovery 
parameters to test replication, as well as PITR, archiving, etc. Then you need 
to be able to run SQL commands on top of that to check whether the results are 
consistent with what you expect.

A possible input for a test that users could provide would be something like:
# Node information for tests
    {node1, postgresql.conf params, recovery.conf params}
    {node2, postgresql.conf params, recovery.conf params, slave of node1}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

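To give an idea of how light the control layer for such an input could be, here is a sketch of a parser for it. The grammar (`{node, params...}` declarations, `init`, `run_sql`, `#` comments, `slave of` clauses) is just the hypothetical format above, not anything that exists today:

```python
def parse_test(text):
    """Parse the hypothetical test-input format into node definitions
    and an ordered list of commands to run."""
    nodes, steps = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("{") and line.endswith("}"):
            # Node declaration: {name, conf params..., [slave of <master>]}
            parts = [p.strip() for p in line[1:-1].split(",")]
            name, master = parts[0], None
            if parts[-1].startswith("slave of "):
                master = parts[-1][len("slave of "):]
                parts = parts[:-1]
            nodes[name] = {"params": parts[1:], "master": master}
        else:
            # Command step: e.g. "init node1" or "run_sql node1 file1.sql"
            cmd, *args = line.split()
            steps.append((cmd, args))
    return nodes, steps

spec = """
# Node information for tests
    {node1, postgresql.conf params, recovery.conf params}
    {node2, postgresql.conf params, recovery.conf params, slave of node1}
# Run test
init node1
run_sql node1 file1.sql
init node2
run_sql node2 file2.sql
"""

nodes, steps = parse_test(spec)
print(nodes["node2"]["master"])  # node1
print(steps[0])                  # ('init', ['node1'])
```

A real harness would then walk `steps` and drive initdb, pg_ctl, and psql accordingly; the point is only that the input stays as readable as the sketch above.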
The main problem is actually how to do that. Having some smart shell 
infrastructure would be simple and would (?) ease maintenance of the code used 
to run the tests. A C program, on the contrary, would (?) make that code harder 
to maintain, in exchange for more readable test-suite input like the one I 
wrote above, in the shape of what is already available in src/test/isolation.

Another possibility would be to integrate a recovery/backup manager directly 
into PG core and have some tests for it, or even to include those tests 
directly with pg_basebackup or an upper layer of it.

> There of course is room to add as many replication tests as you like,
> and the main 141 tests fed into the master could be extended to feed
> more data and such.
> The main drawbacks that I don't care for are:
> 1) 'make check' becomes 'sudo make check' because it needs permission
> to run chroot.

-1 for that; developers should not need to use root to run the regression suite.

> 2) I have no win32 version of the logic

For a first shot I am not sure that it matters much.

> The main advantages that I like about this design are:
> 1) Only one system is required.  The developer does not need network
> access to a second replication system.  Moreover, multiple database
> clusters can be established with interesting replication hierarchies between
> them, and the cost of each additional cluster is just another chroot
> environment.

An assumption of the test suite is, I think, that it lets developers check for 
bugs on a local server only. This simplifies how the test suite is written: you 
don't need to get into things like VM settings or cross-environment tests, 
which can already be done nicely by frameworks of the Jenkins type. What I 
think people would like to have is this:

cd src/test/replication && make check/installcheck

And have the tests run for them.


