I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch.
Currently I am cannibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs each of them in nonprivileged mode and bound to a different port, feeds all 141 existing regression tests into the master database with the usual checking that the right results are obtained, and then checks that the standbys have the expected data.

This is all possible on one system because the database clusters are chroot'ed: each sees its own /data directory and not the /data directories of the other chroot'ed clusters, while the rest of the system, like /bin, /etc and /dev, is bind mounted and visible to each cluster. There is of course room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such.

The main drawbacks that I don't care for are:

1) 'make check' becomes 'sudo make check' because it needs permission to run chroot.
2) I have no win32 version of the logic.
3) Bind mounts either have to be created by the privileged pg_regress process or have to be pre-existing on the system.

#1 would not be as bad if pg_regress became pg_regress_replication, as we could make the mantra into 'sudo make replicationcheck' or similar. Splitting it from 'make check' also means IMHO that it could have heavier tests that take longer to run, since people merely interested in building and installing postgres would not be impacted by this.

#2 might be fixed by someone more familiar with win32 programming than I am.

#3 cannot be avoided as far as I can tell, but we could choose between the two options. So far, I have chosen to set up the directory structure and add the bind mounts to my /etc/fstab only once, rather than having this get recreated every time I invoke 'sudo make check'.
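
To make the layout concrete, here is a rough sketch of the per-cluster plan described above: one chroot root and one port per cluster, plus the /etc/fstab bind mount lines for the shared directories. The base directory /srv/pg_replcheck, the starting port 54320, the cluster names and the helper names are all made up for illustration; none of them come from the actual implementation.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical values, chosen only for illustration.
BASE_DIR = PurePosixPath("/srv/pg_replcheck")
BASE_PORT = 54320
SHARED_DIRS = ("/bin", "/etc", "/dev")  # the shared directories named in the text


@dataclass(frozen=True)
class Cluster:
    name: str
    root: PurePosixPath  # chroot root for this cluster
    port: int            # port this cluster is bound to

    @property
    def data_dir(self) -> PurePosixPath:
        # The private data directory, as seen from outside the chroot.
        return self.root / "data"


def plan_clusters(names):
    """Give each cluster its own chroot root and its own port."""
    return [Cluster(name, BASE_DIR / name, BASE_PORT + i)
            for i, name in enumerate(names)]


def fstab_lines(cluster):
    """Return /etc/fstab bind mount entries that expose the shared
    directories inside the cluster's chroot. /data is deliberately absent."""
    return [f"{src} {cluster.root}{src} none bind 0 0" for src in SHARED_DIRS]


if __name__ == "__main__":
    for c in plan_clusters(["master", "standby1", "standby2"]):
        print(f"# {c.name}: port {c.port}, data in {c.data_dir}")
        print("\n".join(fstab_lines(c)))
```

A real setup would presumably bind mount more of the system than these three directories; the list here only mirrors the examples named above.
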
The community might prefer to go the other way, and have the directories and bind mounts get set up on each invocation; I have avoided that thus far as I don't want 'sudo make check' (or 'sudo make replicationcheck') to abuse its raised privileges and muck with the filesystem in a way that could cause the user unexpected problems.

The main advantages that I like about this design are:

1) Only one system is required. The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment.
2) Checking out the sources from git and then running './configure && make && sudo make replicationcheck' is not particularly difficult, assuming the directories and mounts are in place, or alternatively assuming that 'sudo make replicationcheck' creates them for you if they don't already exist.

Comments and advice sincerely solicited,

mark