Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
> I thought the goal here was to have a testing framework that (a) is portable to every platform we support and (b) doesn't require root privileges to run. None of those options sound like they'll help meet those requirements.

FWIW, I hacked up a Perl-based testing system as a proof of concept some time ago. I can dust it off if anyone is interested. Perl has a very nice testing ecosystem and is probably the most portable language we support, other than C. My quick goals for the project were:

* allow granular testing (a la Andrew's recent email, which reminded me of this)
* allow stackable methods and dependencies
* make it very easy to write new tests
* test various features that are way too difficult in our existing system (e.g. PITR, fdws)
* get some automated code coverage metrics (this one was tricky)
* allow future git integration based on subsystems

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201401261211
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
Michael Paquier wrote:
> A possible input for a test that users could provide would be something like this:
>
>     # Node information for tests
>     nodes
>     {
>         {node1, postgresql.conf params, recovery.conf params}
>         {node2, postgresql.conf params, recovery.conf params, slave of node1}
>     }
>     # Run test
>     init node1
>     run_sql node1 file1.sql
>     # Check output
>     init node2
>     run_sql node2 file2.sql
>     # Check that results are fine
>     # Process
>
> The main problem is actually how to do that. Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of the code used to run the tests. Conversely, a C program would make the code that runs the tests more difficult (?) to maintain, in exchange for more readable test suite input like the example above, along the lines of what is already available in src/test/isolation.

I like the idea of making this part of src/test/isolation, if folks do not object. The core infrastructure in src/test/isolation seems applicable to replication testing, and I'd hate to duplicate that code.

As for the node setup in your example above, I don't think it can be as simple as defining nodes first, then running tests. The configurations themselves may need to be changed during the execution of a test, and services stopped and started, all under test control and specified in the same easy format.

I have started working on this, and will post WIP patches from time to time, unless you all feel the need to point me in a different direction.

mark
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On Thu, Jan 9, 2014 at 12:34 PM, Mark Dilger markdil...@yahoo.com wrote:
> Michael Paquier wrote:
>> A possible input for a test that users could provide would be something like this:
>>
>>     # Node information for tests
>>     nodes
>>     {
>>         {node1, postgresql.conf params, recovery.conf params}
>>         {node2, postgresql.conf params, recovery.conf params, slave of node1}
>>     }
>>     # Run test
>>     init node1
>>     run_sql node1 file1.sql
>>     # Check output
>>     init node2
>>     run_sql node2 file2.sql
>>     # Check that results are fine
>>     # Process
>>
>> The main problem is actually how to do that. Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of the code used to run the tests. Conversely, a C program would make the code that runs the tests more difficult (?) to maintain, in exchange for more readable test suite input like the example above, along the lines of what is already available in src/test/isolation.
>
> I like the idea of making this part of src/test/isolation, if folks do not object. The core infrastructure in src/test/isolation seems applicable to replication testing, and I'd hate to duplicate that code.
>
> As for the node setup in your example above, I don't think it can be as simple as defining nodes first, then running tests. The configurations themselves may need to be changed during the execution of a test, and services stopped and started, all under test control and specified in the same easy format.

Yes, my example was very basic :). What you actually need is the ability to perform actions on nodes during a test run, basically: stop, start, init, reload, run SQL, change params/create new conf files (for example, putting a node in recovery could be: create recovery.conf + restart). The location of the code does not matter much, but I don't think it should be part of isolation, as clustering and isolation are too different as test suites. I would have seen it, for example, as src/test/cluster, with src/test/common for things that are shared between test infrastructures. As mentioned by Steve, the test suite of Slony might be interesting to look at to get some ideas.

Regards,
-- 
Michael
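To make the list of per-node actions above more concrete, here is a minimal sketch of how a line-oriented test script could be parsed into steps. Nothing here comes from an actual patch: the action names mirror the list in the message, while `Step`, `parse_script`, `set_param` and the script syntax itself are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical action names, mirroring the list in the message above.
ACTIONS = {"init", "start", "stop", "reload", "run_sql", "set_param"}


@dataclass(frozen=True)
class Step:
    action: str
    node: str
    args: tuple


def parse_script(text):
    """Parse one '<action> <node> [args...]' step per line; '#' lines are comments."""
    steps = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if len(words) < 2:
            raise ValueError(f"line {lineno}: expected '<action> <node> [args]'")
        action, node, *args = words
        if action not in ACTIONS:
            raise ValueError(f"line {lineno}: unknown action {action!r}")
        steps.append(Step(action, node, tuple(args)))
    return steps


# Illustrative script only: put node2 in recovery, then restart it.
SCRIPT = """\
# Run test
init node1
run_sql node1 file1.sql
init node2
set_param node2 recovery.conf standby_mode=on
stop node2
start node2
run_sql node2 file2.sql
"""

if __name__ == "__main__":
    for step in parse_script(SCRIPT):
        print(step.action, step.node, *step.args)
```

Keeping the parsed steps as plain data would let a driver (shell, C, or otherwise) decide separately how each action is actually executed against a node.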
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 01/06/2014 07:12 PM, Mark Dilger wrote:
> The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces.

You can remove and recreate the symlink in the pg_tblspc directory, after creating the cluster, to point it to a different location. It might be a bit tricky to do that if you have two clusters running at the same time, but it's probably easier than chrooting anyway. For example:

1. stop the standby
2. create the tablespace in master
3. stop master
4. mv the tablespace directory, and modify the symlink in master to point to the new location
5. start standby. It will replay the tablespace creation in the original location
6. restart master.

You now have the same tablespace in master and standby, but they point to different locations. This doesn't allow dynamically creating and dropping tablespaces during tests, but at least it gives you one tablespace to use.

Another idea would be to do something like chroot, but more lightweight, using FUSE, private mount namespaces, or cgroups.

- Heikki
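Step 4 of the procedure above (move the tablespace directory and repoint the symlink) can be sketched with plain filesystem calls. This is only an illustration under stated assumptions: `relocate_tablespace` is a hypothetical helper, the link name `16384` is a made-up tablespace OID, and the demo uses temporary stand-in directories rather than a real cluster. The server that owns the pg_tblspc directory is assumed to be stopped.

```python
import os
import shutil
import tempfile


def relocate_tablespace(pg_tblspc, link_name, new_location):
    """Move the directory a tablespace symlink points to, then repoint the link.

    Hypothetical helper for step 4; the owning server must be stopped.
    """
    link = os.path.join(pg_tblspc, link_name)
    old_location = os.path.realpath(link)
    shutil.move(old_location, new_location)
    os.remove(link)                  # removes the symlink itself, not its target
    os.symlink(new_location, link)
    return old_location


if __name__ == "__main__":
    # Stand-in directories only; no PostgreSQL server is involved here.
    with tempfile.TemporaryDirectory() as root:
        pg_tblspc = os.path.join(root, "master", "pg_tblspc")
        original = os.path.join(root, "tblspc_original")
        os.makedirs(pg_tblspc)
        os.makedirs(original)
        os.symlink(original, os.path.join(pg_tblspc, "16384"))
        relocate_tablespace(pg_tblspc, "16384", os.path.join(root, "tblspc_master"))
        print(os.readlink(os.path.join(pg_tblspc, "16384")))
```

After this step the standby can be started and will replay the tablespace creation at the original path, as described in step 5.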
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
Heikki Linnakangas hlinnakan...@vmware.com writes:
> Another idea would be to do something like chroot, but more lightweight, using FUSE, private mount namespaces, or cgroups.

I thought the goal here was to have a testing framework that (a) is portable to every platform we support and (b) doesn't require root privileges to run. None of those options sound like they'll help meet those requirements.

regards, tom lane
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 2014-01-07 10:27:14 -0500, Tom Lane wrote:
> Heikki Linnakangas hlinnakan...@vmware.com writes:
>> Another idea would be to do something like chroot, but more lightweight, using FUSE, private mount namespaces, or cgroups.
>
> I thought the goal here was to have a testing framework that (a) is portable to every platform we support and (b) doesn't require root privileges to run. None of those options sound like they'll help meet those requirements.

Seconded. Perhaps the solution is to simply introduce tablespaces located relative to PGDATA? That'd be fracking useful anyway.

Greetings,
Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On Tuesday, January 7, 2014 7:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Heikki Linnakangas hlinnakan...@vmware.com writes:
>> Another idea would be to do something like chroot, but more lightweight, using FUSE, private mount namespaces, or cgroups.
>
> I thought the goal here was to have a testing framework that (a) is portable to every platform we support and (b) doesn't require root privileges to run. None of those options sound like they'll help meet those requirements.
>
> regards, tom lane

If I drop the idea of sudo/chroot and punt for now on testing tablespaces under replication, it should be possible to test the rest of the replication system in a way that meets (a) and (b). Perhaps Andres' idea of tablespaces relative to the data directory will get implemented some day, at which point we wouldn't be punting quite so much. But until then, punt.

Would it make sense for this to just be part of 'make check'? That would require creating multiple database clusters under multiple data directories, and having them bind to multiple ports or unix domain sockets. Is that a problem?

What's the logic of having replication testing separated from the other pg_regress tests? Granted, not every user of postgres uses replication, but that's true for lots of features, and we don't split things like json into separate test suites. Vendors who run 'make check' as part of their packaging of postgresql would probably benefit from knowing if replication doesn't work on their distro, and they may not change their packaging systems to include a second 'make replicationcheck' step.

mark
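On the question of binding several clusters to different ports on one machine, one common approach (not something proposed in this thread) is to ask the kernel for unused ports by binding to port 0. The sketch below assumes TCP on the loopback interface; `pick_free_ports` is a hypothetical helper name. Note the inherent race: a port can be taken by another process between this call and the moment a server actually binds it, so a real harness would need to retry on failure.

```python
import socket


def pick_free_ports(count, host="127.0.0.1"):
    """Return `count` distinct TCP ports that were unused at the time of the call."""
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((host, 0))        # port 0: the kernel assigns an unused port
            socks.append(s)
        # All sockets are still open here, so the ports are guaranteed distinct.
        return [s.getsockname()[1] for s in socks]
    finally:
        for s in socks:
            s.close()


if __name__ == "__main__":
    master_port, standby_port = pick_free_ports(2)
    print("master:", master_port, "standby:", standby_port)
```

Unix domain sockets in per-cluster directories would avoid the race entirely on platforms that support them.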
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
Mark Dilger markdil...@yahoo.com writes:
> Would it make sense for this to just be part of 'make check'?

Probably not, as (I imagine) it will take quite a bit longer than make check does today. People who are not working on replication related features will be annoyed if a test cycle starts taking 10X longer than it used to, for tests of no value to them.

It's already not the case that make check runs every available automated test; the isolation tests, the PL tests, the contrib tests are all separate. There is a make check-world, which I think should reasonably run all of these.

regards, tom lane
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 01/05/2014 09:13 PM, Michael Paquier wrote:
> Having vanilla regressions run in a cluster with multiple nodes and checking the results on a standby is only the tip of the iceberg though. What I had in mind when I began this thread was to have more than a copy/paste of pg_regress: an infrastructure that people could use to create and customize tests by having an additional control layer on the cluster itself. For example, testing replication is not only a matter of creating and setting up the nodes; you might also want to be able to initialize, add, and remove nodes during the tests. Node addition would be either a new fresh master (this would be damn useful for a test suite for logical replication I think), or a slave node with custom recovery parameters to test replication, as well as PITR, archiving, etc. Then you need to be able to run SQL commands on top of that to check if the results are consistent with what you want.

I'd encourage anyone looking at implementing a testing suite for replication to look at the stuff we did for Slony, at least to get some ideas.

We wrote a test driver framework (clustertest - https://github.com/clustertest/clustertest-framework), then some JavaScript base classes for common types of operations. An individual test is then written in JavaScript and invokes methods either in the framework or in a base class to do most of the interesting work. See

http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blob;f=clustertest/disorder/tests/EmptySet.js;h=7b4850c1d24036067f5a659b990c7f05415ed967;hb=HEAD

as an example.
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage.

I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other.

So (the '-' means "is bind mounted to"):

/master/bin - /bin
/master/dev - /dev
/master/etc - /etc
/master/lib - /lib
/master/usr - /usr
/master/data
/master/tablespace

/hotstandby/bin - /bin
/hotstandby/dev - /dev
/hotstandby/etc - /etc
/hotstandby/lib - /lib
/hotstandby/usr - /usr
/hotstandby/data
/hotstandby/tablespace

So from inside the master chroot, you see the system's /bin as /bin, the system's /dev as /dev, etc., but what you see as /data and /tablespace are your own private ones. Likewise from the hotstandby chroot. But since the binaries are in something like

/home/myuser/postgresql/src/test/regress/tmp_check/install/usr/local/pgsql/bin

each cluster uses the same binaries, referred to by the same path.

On Sunday, January 5, 2014 5:25 PM, Greg Stark st...@mit.edu wrote:
> On 5 Jan 2014 14:54, Mark Dilger markdil...@yahoo.com wrote:
>> I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch.
>
> This sounds pretty cool. The real trick will be in testing concurrent behaviour -- i.e. queries on the slave when it's replaying logs at a certain point. But right now we have nothing, so anything would be an improvement.
>
>> This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster.
>
> This isn't necessary. You can use the same binaries and run initdb with a different location just fine. Then start up the database with -D to specify the directory.
>
> -- 
> greg
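Greg's suggestion (one set of binaries, one data directory per cluster, selected with -D) can be illustrated by building the command lines without running them. This is a sketch under assumptions: `cluster_commands`, the directory layout, the install prefix and the port numbers are all hypothetical; the `initdb -D`, `pg_ctl -D`, `-o` and `-w` options are the standard ones.

```python
import os
import shlex


def cluster_commands(bindir, base_dir, name, port):
    """Build argv lists to create and start one cluster under base_dir/name/data.

    The commands are only constructed here, not executed.
    """
    datadir = os.path.join(base_dir, name, "data")
    initdb = [os.path.join(bindir, "initdb"), "-D", datadir]
    start = [
        os.path.join(bindir, "pg_ctl"),
        "-D", datadir,
        "-o", f"-p {port}",   # passed through to the postgres server
        "-w",                 # wait until startup has completed
        "start",
    ]
    return initdb, start


if __name__ == "__main__":
    # Illustrative paths and ports only.
    bindir = "/usr/local/pgsql/bin"
    for name, port in [("master", 5433), ("hotstandby", 5434)]:
        for argv in cluster_commands(bindir, "/tmp/tmp_check", name, port):
            print(shlex.join(argv))
```

Each argv list could be handed to `subprocess.run` unchanged; no root privileges or chroot are involved, at the cost of not covering tablespaces as discussed above.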
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
I agree that merely setting up masters and slaves is the tip of the iceberg. It seems to be what needs to be tackled first, though, because until we have a common framework, we cannot all contribute tests to it.

I imagine setting up a whole hierarchy of master, hot standbys, warm standbys, etc., and, over the course of the test, having base backups made, new clusters spun up from those backups, masters stopped and standbys promoted to master, etc.

But I also imagine there needs to be SQL run on the master that changes the data, so that replication of those changes can be confirmed. There are lots of ways to change data, such as through the large object interface. The current 'make check' test suite exercises all those code paths. If we incorporate them into our replication testing suite, then we get the advantage of knowing that all those paths are being tested in our suite as well. And if some new interface, call it huge object, ever gets made, then there should be a hugeobject.sql in src/test/regress/sql, and we automatically get that in our replication tests.

mark
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 2014-01-06 09:12:03 -0800, Mark Dilger wrote:
> The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage.
>
> I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other.

I don't think the benefit of being able to test tablespaces without restarts comes even close to offsetting the cost of requiring sudo permissions and introducing OS dependencies. E.g. there's pretty much no hope of making this work sensibly on windows.

So I'd just leave out that part.

Greetings,
Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch.

Currently I am cannibalizing src/test/regress/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev, is bind mounted and visible to each cluster.

There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such.

The main drawbacks that I don't care for are:

1) 'make check' becomes 'sudo make check' because it needs permission to run chroot.
2) I have no win32 version of the logic
3) Bind mounts either have to be created by the privileged pg_regress process or have to be pre-existing on the system

#1 would not be as bad if pg_regress became pg_regress_replication, as we could make the mantra into 'sudo make replicationcheck' or similar. Splitting it from 'make check' also means IMHO that it could have heavier tests that take longer to run, since people merely interested in building and installing postgres would not be impacted by this.

#2 might be fixed by someone more familiar with win32 programming than I am.

#3 cannot be avoided as far as I can tell, but we could choose between the two options. So far, I have chosen to set up the directory structure and add the bind mount logic to my /etc/fstab only once, rather than having this get recreated every time I invoke 'sudo make check'. The community might prefer to go the other way, and have the directories and bind mounts get set up on each invocation; I have avoided that thus far as I don't want 'sudo make check' (or 'sudo make replicationcheck') to abuse its raised privileges and muck with the filesystem in a way that could cause the user unexpected problems.

The main advantages that I like about this design are:

1) Only one system is required. The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment
2) Checking out the sources from git and then running

    ./configure
    make
    sudo make replicationcheck

is not particularly difficult, assuming the directories and mounts are in place, or alternatively assuming that 'sudo make replicationcheck' creates them for you if they don't already exist.

Comments and advice sincerely solicited,

mark
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. Yeah, this would be nice to look at, core code definitely needs to have some more infrastructure for such a test suite. I didn't get the time to go back to it since I began this thread though :) Currently I am canibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. Having vanilla regressions run in a cluster with multiple nodes and check the results on a standby is the top of the iceberg though. What I had in mind when I began this thread was to have more than a copy/paste of pg_regress, but an infrastructure that people could use to create and customize tests by having an additional control layer on the cluster itself. For example, testing replication is not only a matter of creating and setting up the nodes, but you might want to be able to initialize, add, remove nodes during the tests. 
Node addition would be either a fresh new master (this would be damn useful for a test suite for logical replication, I think), or a slave node with custom recovery parameters to test replication, as well as PITR, archiving, etc. Then you need to be able to run SQL commands on top of that to check that the results are consistent with what you expect.

A possible input for a test that users could provide would be something like this:

    # Node information for tests
    nodes
    {
        {node1, postgresql.conf params, recovery.conf params}
        {node2, postgresql.conf params, recovery.conf params, slave of node1}
    }
    # Run test
    init node1
    run_sql node1 file1.sql
    # Check output
    init node2
    run_sql node2 file2.sql
    # Check that results are fine
    # Process

The main problem is actually how to do that. Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of the code used to run the tests. On the contrary, a C program would make the code that runs the tests more difficult (?) to maintain, in exchange for more readable test suite input like the one I wrote above. That input could take the shape of what is already available in src/test/isolation.

Another possibility would be to integrate a recovery/backup manager directly into PG core and have some tests for it, or even to include those tests directly with pg_basebackup or an upper layer of it.

> There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such.
>
> The main drawbacks that I don't care for are:
>
> 1) 'make check' becomes 'sudo make check' because it needs permission to run chroot.

-1 for that: developers should not need to use root to run the regression suite.

> 2) I have no win32 version of the logic

For a first shot I am not sure that it matters much.

> The main advantages that I like about this design are:
>
> 1) Only one system is required.
> The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment.

One assumption of the test suite is, I think, that it lets developers check for bugs on a local server only. This simplifies how the test suite is written, and you don't need to get into things like VM settings or cross-environment tests, which frameworks such as Jenkins can already handle nicely. What I think people would like to have is this:

    cd src/test/replication
    make check/installcheck

and have the tests run for them.

Regards,
--
Michael
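
To make the test input format sketched earlier in this message a bit more concrete, here is a minimal Python sketch of a parser for it. This is only an illustration under assumptions, not a proposal for the actual implementation: the keywords (nodes, init, run_sql, "slave of") come from the example above, but the exact grammar (one node entry per line, "#" starting a comment) and the names SPEC, Node and parse_spec are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Sample input, copied from the example in the message; the layout
# (one entry per line, '#' comments) is an assumed grammar.
SPEC = """\
# Node information for tests
nodes
{
    {node1, postgresql.conf params, recovery.conf params}
    {node2, postgresql.conf params, recovery.conf params, slave of node1}
}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
"""


@dataclass
class Node:
    """One cluster in the test (hypothetical structure)."""
    name: str
    params: List[str] = field(default_factory=list)
    upstream: Optional[str] = None  # node this one replicates from, if any


def parse_spec(text: str) -> Tuple[Dict[str, Node], List[Tuple[str, List[str]]]]:
    """Split a spec into declared nodes and an ordered list of steps."""
    nodes: Dict[str, Node] = {}
    steps: List[Tuple[str, List[str]]] = []
    in_nodes = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line == "nodes":
            in_nodes = True
        elif in_nodes and line == "{":
            continue  # opening brace of the nodes block
        elif in_nodes and line == "}":
            in_nodes = False  # closing brace of the nodes block
        elif in_nodes:
            # Entry such as {node2, param, param, slave of node1}
            fields = [f.strip() for f in line.strip("{}").split(",")]
            node = Node(name=fields[0])
            for f in fields[1:]:
                m = re.fullmatch(r"slave of (\w+)", f)
                if m:
                    node.upstream = m.group(1)
                else:
                    node.params.append(f)
            if node.upstream is not None and node.upstream not in nodes:
                raise ValueError(f"unknown upstream node: {node.upstream}")
            nodes[node.name] = node
        else:
            # Step such as "run_sql node1 file1.sql"
            command, *args = line.split()
            if command not in ("init", "run_sql"):
                raise ValueError(f"unknown command: {command}")
            if not args or args[0] not in nodes:
                raise ValueError(f"step refers to an undeclared node: {line}")
            steps.append((command, args))
    return nodes, steps
```

Calling parse_spec(SPEC) yields two nodes, with node2 recorded as replicating from node1, and four steps in file order. A real driver would still have to map each step onto initdb, pg_ctl, psql and so on, and, as noted earlier in the thread, would need a way to change configuration and stop or start nodes in the middle of a test, which this sketch does not attempt.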