On 04/11/2020 11:23, Heikki Linnakangas wrote:
> I read through the patches one more time, fixed a bunch of typos and such, and pushed patches 1-4. I'm going to spend some more time on testing the last patch. It allows using a standby server as the source, and we don't have any tests for that yet. Thanks for the review!
Did some more testing, fixed one bug, and pushed.
To test this, I set up a cluster with one primary, a standby, and a cascaded standby. I launched a test workload against the primary that creates tables, inserts rows, and drops tables continuously. In another shell, I promoted the cascaded standby, ran some updates on the promoted server, and finally ran pg_rewind pointed at the standby and started the rewound server again as a cascaded standby. Repeat.
Attached are the scripts I used. I edited them between test runs to test slightly different scenarios. I don't expect them to be very useful to anyone else, but the Internet is my backup.
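For readers without the attachment, one iteration of the promote / update / rewind / restart cycle described above might look roughly like the following. This is only a sketch, not the attached scripts: the data directory, the connection strings, and the update SQL are hypothetical parameters, and the function names are made up for illustration. It assumes a pg_rewind new enough to have --write-recovery-conf, and it stops the promoted server before rewinding because pg_rewind needs the target to be shut down cleanly.

```python
import subprocess
from typing import Callable, List


def rewind_cycle_commands(cascade_datadir: str,
                          cascade_connstr: str,
                          standby_connstr: str,
                          update_sql: str) -> List[List[str]]:
    """Build the commands for one test iteration (all arguments are
    hypothetical placeholders, not values from the original scripts)."""
    return [
        # 1. Promote the cascaded standby so it diverges from its upstream.
        ["pg_ctl", "promote", "-D", cascade_datadir, "-w"],
        # 2. Run some updates on the freshly promoted server.
        ["psql", "-X", "-d", cascade_connstr, "-c", update_sql],
        # 3. Shut it down cleanly; pg_rewind works on a stopped target.
        ["pg_ctl", "stop", "-D", cascade_datadir, "-m", "fast", "-w"],
        # 4. Rewind it, using the (still running) standby as the source.
        ["pg_rewind",
         "--target-pgdata", cascade_datadir,
         "--source-server", standby_connstr,
         "--write-recovery-conf"],
        # 5. Start it again; with the recovery settings written in step 4
        #    it should come back up following the standby.
        ["pg_ctl", "start", "-D", cascade_datadir, "-w"],
    ]


def run_cycle(commands: List[List[str]],
              runner: Callable = subprocess.run) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

Passing a different runner makes it possible to dry-run the sequence, for example to print the commands instead of executing them. The continuous create/insert/drop workload against the primary would run separately, in another shell.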
I did find one bug in the patch with that, so the time was well spent: the code in process_queued_fetch_requests() got confused and errored out if a file was removed in the source system while pg_rewind was running. There was code to deal with that, but it was broken. Fixed that.
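To illustrate the kind of race involved, here is a small file-copy analogue in Python. It is not the pg_rewind code (which is C and fetches data over a libpq connection); fetch_queued_files and its arguments are hypothetical names. The idea is simply that a file queued for fetching may have vanished from the source by the time it is read, and that case should be tolerated rather than treated as an error. Dropping any stale copy in the target is a choice made for this sketch.

```python
import os
import shutil
from typing import List


def fetch_queued_files(source_dir: str, target_dir: str,
                       relpaths: List[str]) -> List[str]:
    """Copy each queued file from source to target.

    Returns the relative paths that had disappeared from the source
    between queueing and fetching.
    """
    vanished = []
    for rel in relpaths:
        src = os.path.join(source_dir, rel)
        dst = os.path.join(target_dir, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        try:
            shutil.copyfile(src, dst)
        except FileNotFoundError:
            # The source dropped the file while we were running (for
            # example, a table was dropped). Not an error: remember it
            # and remove any stale copy on the target side.
            vanished.append(rel)
            try:
                os.remove(dst)
            except FileNotFoundError:
                pass
    return vanished
```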
- Heikki
rewind-cascading-test.tar.gz
Description: application/gzip