Re: [HACKERS] pg_upgrade and rsync
On Fri, Mar 6, 2015 at 12:19:36PM -0500, Bruce Momjian wrote: On Fri, Mar 6, 2015 at 10:50:27AM +0300, Vladimir Borodin wrote: What I have done is to update the pg_upgrade instructions to add this required step. Updated doc patch attached. (I also added the --delete flag to rsync.) Thanks so much for your detailed report. It seems to work fine now. The only thing that would be nice to change is to explicitly write that these instructions are correct for hot standby installations too. + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para Actually, I’ve entered this thread because it is not obvious from the paragraph above or any other places. Oh, very good point. I was trying to match the wording we use in the docs, but forgot that log shipping and streaming replication are specified separately. Updated patch attached. You can view the output at: http://momjian.us/tmp/pgsql/pgupgrade.html Thanks much! Patch applied to head. While it will work all the way back to 9.0, the instructions need a shaking out during beta. I will blog that people can try it. Thanks to Stephen Frost for coming up with this solution. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Fri, Mar 6, 2015 at 10:50:27AM +0300, Vladimir Borodin wrote: What I have done is to update the pg_upgrade instructions to add this required step. Updated doc patch attached. (I also added the --delete flag to rsync.) Thanks so much for your detailed report. It seems to work fine now. The only thing that would be nice to change is to explicitly write that these instructions are correct for hot standby installations too. + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para Actually, I’ve entered this thread because it is not obvious from the paragraph above or any other places. Oh, very good point. I was trying to match the wording we use in the docs, but forgot that log shipping and streaming replication are specified separately. Updated patch attached. You can view the output at: http://momjian.us/tmp/pgsql/pgupgrade.html Thanks much! -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server just long enough to do a second applicationrsync/. The !second applicationrsync/ will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server long enough to do an commandrsync --checksum/. !(option--checksum/ is necessary because commandrsync/ only !has file modification-time granularity of one second.) The !second applicationrsync/ will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..0d79fb5 *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** NET STOP postgresql-8.4 *** 315,320 --- 315,325 NET STOP postgresql-9.0 /programlisting /para + + para + Streaming replication and log-shipping standby servers can remain running until + a later step. + /para /step step *** pg_upgrade.exe *** 399,404 --- 404,539 /step step + titleUpgrade Streaming Replication and Log-Shipping standby + servers/title + + para + If you have Streaming Replication (xref + linkend=streaming-replication) or Log-Shipping (xref + linkend=warm-standby) standby servers, follow these steps to + upgrade them (before starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed on all +standby servers. + /para + /step + + step + titleMake sure the new standby data directories do emphasisnot/ + exist/title + + para +Make sure the new standby data directories do emphasisnot/ +exist or are empty. If applicationinitdb/ was run, delete +the standby server data directories. + /para + /step + + step + titleInstall custom shared object files/title + + para +Install the same custom shared object files on the new standbys +that you installed in the new master cluster. + /para + /step + + step + titleStop standby servers/title + + para +If the standby servers are still running, stop them now using the +above instructions. + /para + /step + + step + titleVerify standby servers/title + + para +To prevent old standby servers from being modified, run +applicationpg_controldata/ against the primary and standby +clusters and verify that the quoteLatest checkpoint location/ +values match in all clusters. (This requires the standbys to be +shut down after the
Re: [HACKERS] pg_upgrade and rsync
On Thu, Mar 5, 2015 at 10:55:28AM +0300, Vladimir Borodin wrote: You are correct that a pg_controldata file is copied over that has wal_level=minimal, but that should not be a problem. I suppose, this is the root cause of why replica does not start as hot standby. It it enough to start it as warm standby, but not hot standby. See CheckRequiredParameterValues function in xlog.c which is called inside of StartupXLOG function. Yes, you are correct. I spent all day building a test harness so I could automate this setup and test various failures. I was able to reproduce your failure, and you are correct that the proper fix is to set wal_level=hot_standby on the new master, and then start and stop the new cluster just before rsync. The root cause is that pg_upgrade calls pg_resetxlog -o on the new cluster _after_ the new cluster stopped for the final time, so rsync is copying the incorrect pg_controldata wal_level value. Also, even if pg_resetxlog preserved wal_level in the control file, there is no guarantee that the user configured the new cluster's wal_level for hot_standby anyway. What I have done is to update the pg_upgrade instructions to add this required step. Updated doc patch attached. (I also added the --delete flag to rsync.) Thanks so much for your detailed report. But it could not be done with --size-only key, because control-file is of fixed size and rsync would skip it. Or may be everything should be copied with --size-only and control-file should be copied without this option. Well, what happens is that there is no _new_ standby pg_controldata file, so it is copied fully from the new master. Are you running initdb to create the new standby --- you shouldn't be doing that as the rsync will do that for you. No, I don’t. The scenario of the problem with copying control-file was in case when I: 1. ran pg_upgrade on master and got control-file with wal_level = minimal, 2. did rsync --size-only to replica (and it got this control-file with wal_level = minimal), 3. started and stopped postgres on master to get «good» control-file with wal_level = hot_standby», 4. did rsync --size-only to replica one more time. And this time control-file is not copied because of the same size of control-file. Actually, if you don’t do step 2, everything works as expected. Sorry for bothering you. Ah, yes, I think doing rsync twice is never a good suggestion. It can lead to too many failures. Doing the start/stop before rsync seems like the best solution. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server just long enough to do a second applicationrsync/. The !second applicationrsync/ will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server long enough to do an commandrsync --checksum/. !(option--checksum/ is necessary because commandrsync/ only !has file modification-time granularity of one second.) The !second applicationrsync/ will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..56cb45d *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** NET STOP postgresql-8.4 *** 315,320 --- 315,324 NET STOP postgresql-9.0 /programlisting /para + + para + Log-shipping standby servers can remain running until a later step. + /para /step step *** pg_upgrade.exe *** 399,404 --- 403,535 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before +
Re: [HACKERS] pg_upgrade and rsync
6 марта 2015 г., в 6:11, Bruce Momjian br...@momjian.us написал(а): On Thu, Mar 5, 2015 at 10:55:28AM +0300, Vladimir Borodin wrote: You are correct that a pg_controldata file is copied over that has wal_level=minimal, but that should not be a problem. I suppose, this is the root cause of why replica does not start as hot standby. It it enough to start it as warm standby, but not hot standby. See CheckRequiredParameterValues function in xlog.c which is called inside of StartupXLOG function. Yes, you are correct. I spent all day building a test harness so I could automate this setup and test various failures. I was able to reproduce your failure, and you are correct that the proper fix is to set wal_level=hot_standby on the new master, and then start and stop the new cluster just before rsync. The root cause is that pg_upgrade calls pg_resetxlog -o on the new cluster _after_ the new cluster stopped for the final time, so rsync is copying the incorrect pg_controldata wal_level value. Also, even if pg_resetxlog preserved wal_level in the control file, there is no guarantee that the user configured the new cluster's wal_level for hot_standby anyway. What I have done is to update the pg_upgrade instructions to add this required step. Updated doc patch attached. (I also added the --delete flag to rsync.) Thanks so much for your detailed report. It seems to work fine now. The only thing that would be nice to change is to explicitly write that these instructions are correct for hot standby installations too. + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para Actually, I’ve entered this thread because it is not obvious from the paragraph above or any other places. But it could not be done with --size-only key, because control-file is of fixed size and rsync would skip it. Or may be everything should be copied with --size-only and control-file should be copied without this option. Well, what happens is that there is no _new_ standby pg_controldata file, so it is copied fully from the new master. Are you running initdb to create the new standby --- you shouldn't be doing that as the rsync will do that for you. No, I don’t. The scenario of the problem with copying control-file was in case when I: 1. ran pg_upgrade on master and got control-file with wal_level = minimal, 2. did rsync --size-only to replica (and it got this control-file with wal_level = minimal), 3. started and stopped postgres on master to get «good» control-file with wal_level = hot_standby», 4. did rsync --size-only to replica one more time. And this time control-file is not copied because of the same size of control-file. Actually, if you don’t do step 2, everything works as expected. Sorry for bothering you. Ah, yes, I think doing rsync twice is never a good suggestion. It can lead to too many failures. Doing the start/stop before rsync seems like the best solution. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + rsync.diff -- May the force be with you… https://simply.name
Re: [HACKERS] pg_upgrade and rsync
On Wed, Mar 4, 2015 at 01:53:47PM +0300, Vladimir Borodin wrote: After running initdb to create the new master, but before running pg_upgrade, modify the new master's postgresql.conf and change wal_level = hot_standby. (Don't modify pg_hba.conf at this stage.) That does not help. The reason is that pg_upgrade sets 'Current wal_level setting: minimal' in control-file, and it does not depend on what is set in postgresql.conf before running pg_upgrade. Details could be seen here - http:// pastie.org/9998671. Well, what is happening is that the pg_resetxlog commands we run inside pg_upgrade set the pg_controldata file's wal_level to minimal, but as you saw, starting the server sets the pg_controldata properly. pg_resetxlog is not changing the WAL files at all, just the control file. The workaround for this is to start and cleanly shut down postgres on master after running pg_upgrade but before running rsync. After that there would be a good control-file for streaming replication and rsync to replica can be done. You are correct that a pg_controldata file is copied over that has wal_level=minimal, but that should not be a problem. But it could not be done with --size-only key, because control-file is of fixed size and rsync would skip it. Or may be everything should be copied with --size-only and control-file should be copied without this option. Well, what happens is that there is no _new_ standby pg_controldata file, so it is copied fully from the new master. Are you running initdb to create the new standby --- you shouldn't be doing that as the rsync will do that for you. Also, are you cleanly shutting down all the servers, or using pg_ctl -m immediate? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
4 марта 2015 г., в 19:28, Bruce Momjian br...@momjian.us написал(а): On Wed, Mar 4, 2015 at 01:53:47PM +0300, Vladimir Borodin wrote: After running initdb to create the new master, but before running pg_upgrade, modify the new master's postgresql.conf and change wal_level = hot_standby. (Don't modify pg_hba.conf at this stage.) That does not help. The reason is that pg_upgrade sets 'Current wal_level setting: minimal' in control-file, and it does not depend on what is set in postgresql.conf before running pg_upgrade. Details could be seen here - http:// pastie.org/9998671. Well, what is happening is that the pg_resetxlog commands we run inside pg_upgrade set the pg_controldata file's wal_level to minimal, but as you saw, starting the server sets the pg_controldata properly. pg_resetxlog is not changing the WAL files at all, just the control file. The workaround for this is to start and cleanly shut down postgres on master after running pg_upgrade but before running rsync. After that there would be a good control-file for streaming replication and rsync to replica can be done. You are correct that a pg_controldata file is copied over that has wal_level=minimal, but that should not be a problem. I suppose, this is the root cause of why replica does not start as hot standby. It it enough to start it as warm standby, but not hot standby. See CheckRequiredParameterValues function in xlog.c which is called inside of StartupXLOG function. But it could not be done with --size-only key, because control-file is of fixed size and rsync would skip it. Or may be everything should be copied with --size-only and control-file should be copied without this option. Well, what happens is that there is no _new_ standby pg_controldata file, so it is copied fully from the new master. Are you running initdb to create the new standby --- you shouldn't be doing that as the rsync will do that for you. No, I don’t. The scenario of the problem with copying control-file was in case when I: 1. ran pg_upgrade on master and got control-file with wal_level = minimal, 2. did rsync --size-only to replica (and it got this control-file with wal_level = minimal), 3. started and stopped postgres on master to get «good» control-file with wal_level = hot_standby», 4. did rsync --size-only to replica one more time. And this time control-file is not copied because of the same size of control-file. Actually, if you don’t do step 2, everything works as expected. Sorry for bothering you. Also, are you cleanly shutting down all the servers, or using pg_ctl -m immediate? I use init-script, it shuts down cleanly with -m fast. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- May the force be with you… https://simply.name
Re: [HACKERS] pg_upgrade and rsync
3 марта 2015 г., в 18:01, Bruce Momjian br...@momjian.us написал(а): On Tue, Mar 3, 2015 at 04:55:56PM +0300, Vladimir Borodin wrote: OK, hmmm. Thanks for testing. It feels like you didn't have your new master set up for streaming replication when you ran pg_upgrade. Is that correct? Should I specify that specifically in the instructions? After running pg_upgrade I do put in new PGDATA on master old pg_hba.conf and postgresql.conf with wal_level = hot_standby. The full content of postgresql.conf could be seen here - http://pastie.org/9995902. Then I do rsync to replica, put recovery.conf and try to start both - first master, then replica. If I turn off hot_standby in replica configuration, it starts. What am I doing wrong? After running initdb to create the new master, but before running pg_upgrade, modify the new master's postgresql.conf and change wal_level = hot_standby. (Don't modify pg_hba.conf at this stage.) That does not help. The reason is that pg_upgrade sets 'Current wal_level setting: minimal' in control-file, and it does not depend on what is set in postgresql.conf before running pg_upgrade. Details could be seen here - http://pastie.org/9998671 http://pastie.org/9998671. The workaround for this is to start and cleanly shut down postgres on master after running pg_upgrade but before running rsync. After that there would be a good control-file for streaming replication and rsync to replica can be done. But it could not be done with --size-only key, because control-file is of fixed size and rsync would skip it. Or may be everything should be copied with --size-only and control-file should be copied without this option. I didn't think that was necessary, but this might be some 9.3-specific problem, but let's get it working first. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Да пребудет с вами сила… https://simply.name/ru
Re: [HACKERS] pg_upgrade and rsync
On Tue, Mar 3, 2015 at 04:55:56PM +0300, Vladimir Borodin wrote: OK, hmmm. Thanks for testing. It feels like you didn't have your new master set up for streaming replication when you ran pg_upgrade. Is that correct? Should I specify that specifically in the instructions? After running pg_upgrade I do put in new PGDATA on master old pg_hba.conf and postgresql.conf with wal_level = hot_standby. The full content of postgresql.conf could be seen here - http://pastie.org/9995902. Then I do rsync to replica, put recovery.conf and try to start both - first master, then replica. If I turn off hot_standby in replica configuration, it starts. What am I doing wrong? After running initdb to create the new master, but before running pg_upgrade, modify the new master's postgresql.conf and change wal_level = hot_standby. (Don't modify pg_hba.conf at this stage.) I didn't think that was necessary, but this might be some 9.3-specific problem, but let's get it working first. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
2 марта 2015 г., в 21:28, Bruce Momjian br...@momjian.us написал(а): On Tue, Feb 24, 2015 at 12:13:17PM +0300, Vladimir Borodin wrote: 20 февр. 2015 г., в 18:21, Bruce Momjian br...@momjian.us написал(а): On Fri, Feb 20, 2015 at 09:45:08AM -0500, Bruce Momjian wrote: #3 bothered me as well because it was not specific enough. I like what you've added to clarify the procedure. Good. It took me a while to understand why they have to be in sync --- because we are using rsync in size-only-comparison mode, if they are not in sync we might update some files whose sizes changed, but not others, and the old slave would be broken. The new slave is going to get all new files or hard links for user files, so it would be fine, but we should be able to fall back to the old slaves, and having them in sync allows that. Also, since there was concern about the instructions, I am thinking of applying the patch only to head for 9.5, and then blog about it if people want to test it. Am I right that if you are using hot standby with both streaming replication and WAL shipping you do still need to take full backup of master after using pg_upgrade? No, you would not need to take a full backup if you use these instructions. Although it would be applied to documentation for 9.5 only, are these instructions applicable for upgrading from 9.3.6 to 9.4.1? Following the instructions from patch I’ve got following errors in postgresql.log of replica after trying to start it with hot_standby = on: 2015-02-24 11:47:22.861 MSK WARNING: WAL was generated with wal_level=minimal, data may be missing 2015-02-24 11:47:22.861 MSK HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup. 2015-02-24 11:47:22.861 MSK FATAL: hot standby is not possible because wal_level was not set to hot_standby or higher on the master server 2015-02-24 11:47:22.861 MSK HINT: Either set wal_level to hot_standby on the master, or turn off hot_standby here. 2015-02-24 11:47:22.862 MSK LOG: startup process (PID 28093) exited with exit code 1 2015-02-24 11:47:22.862 MSK LOG: aborting startup due to startup process failure -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Да пребудет с вами сила… https://simply.name/ru
Re: [HACKERS] pg_upgrade and rsync
On Tue, Mar 3, 2015 at 08:38:50AM -0500, Bruce Momjian wrote: 2015-02-24 11:47:22.861 MSK WARNING: WAL was generated with wal_level= minimal, data may be missing 2015-02-24 11:47:22.861 MSK HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup. 2015-02-24 11:47:22.861 MSK FATAL: hot standby is not possible because wal_level was not set to hot_standby or higher on the master server 2015-02-24 11:47:22.861 MSK HINT: Either set wal_level to hot_standby on the master, or turn off hot_standby here. 2015-02-24 11:47:22.862 MSK LOG: startup process (PID 28093) exited with exit code 1 2015-02-24 11:47:22.862 MSK LOG: aborting startup due to startup process failure OK, hmmm. Thanks for testing. It feels like you didn't have your new master set up for streaming replication when you ran pg_upgrade. Is that correct? Should I specify that specifically in the instructions? Actually, I think you are on to something that needs to be documented. Because the old and new clusters might be using the same port number, you can't configure the new master to use streaming replication because you can't be shipping those logs to the old standby. Yikes. OK, I think we need to document that you need to set wal_level=hot_standby on the new master, but not set up streaming. Once you are done the upgrade, you should configure streaming. If this fixes the problem, I will generate an updated documentation patch. Please let me know. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Tue, Mar 3, 2015 at 11:38:58AM +0300, Vladimir Borodin wrote: No, you would not need to take a full backup if you use these instructions. Although it would be applied to documentation for 9.5 only, are these instructions applicable for upgrading from 9.3.6 to 9.4.1? Yes. They work all the way back to 9.0. Following the instructions from patch I’ve got following errors in postgresql.log of replica after trying to start it with hot_standby = on: 2015-02-24 11:47:22.861 MSK WARNING: WAL was generated with wal_level= minimal, data may be missing 2015-02-24 11:47:22.861 MSK HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup. 2015-02-24 11:47:22.861 MSK FATAL: hot standby is not possible because wal_level was not set to hot_standby or higher on the master server 2015-02-24 11:47:22.861 MSK HINT: Either set wal_level to hot_standby on the master, or turn off hot_standby here. 2015-02-24 11:47:22.862 MSK LOG: startup process (PID 28093) exited with exit code 1 2015-02-24 11:47:22.862 MSK LOG: aborting startup due to startup process failure OK, hmmm. Thanks for testing. It feels like you didn't have your new master set up for streaming replication when you ran pg_upgrade. Is that correct? Should I specify that specifically in the instructions? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
3 марта 2015 г., в 16:38, Bruce Momjian br...@momjian.us написал(а): On Tue, Mar 3, 2015 at 11:38:58AM +0300, Vladimir Borodin wrote: No, you would not need to take a full backup if you use these instructions. Although it would be applied to documentation for 9.5 only, are these instructions applicable for upgrading from 9.3.6 to 9.4.1? Yes. They work all the way back to 9.0. Following the instructions from patch I’ve got following errors in postgresql.log of replica after trying to start it with hot_standby = on: 2015-02-24 11:47:22.861 MSK WARNING: WAL was generated with wal_level= minimal, data may be missing 2015-02-24 11:47:22.861 MSK HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup. 2015-02-24 11:47:22.861 MSK FATAL: hot standby is not possible because wal_level was not set to hot_standby or higher on the master server 2015-02-24 11:47:22.861 MSK HINT: Either set wal_level to hot_standby on the master, or turn off hot_standby here. 2015-02-24 11:47:22.862 MSK LOG: startup process (PID 28093) exited with exit code 1 2015-02-24 11:47:22.862 MSK LOG: aborting startup due to startup process failure OK, hmmm. Thanks for testing. It feels like you didn't have your new master set up for streaming replication when you ran pg_upgrade. Is that correct? Should I specify that specifically in the instructions? After running pg_upgrade I do put in new PGDATA on master old pg_hba.conf and postgresql.conf with wal_level = hot_standby. The full content of postgresql.conf could be seen here - http://pastie.org/9995902 http://pastie.org/9995902. Then I do rsync to replica, put recovery.conf and try to start both - first master, then replica. If I turn off hot_standby in replica configuration, it starts. What am I doing wrong? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Да пребудет с вами сила… https://simply.name/ru
Re: [HACKERS] pg_upgrade and rsync
On Tue, Feb 24, 2015 at 12:13:17PM +0300, Vladimir Borodin wrote: 20 февр. 2015 г., в 18:21, Bruce Momjian br...@momjian.us написал(а): On Fri, Feb 20, 2015 at 09:45:08AM -0500, Bruce Momjian wrote: #3 bothered me as well because it was not specific enough. I like what you've added to clarify the procedure. Good. It took me a while to understand why they have to be in sync --- because we are using rsync in size-only-comparison mode, if they are not in sync we might update some files whose sizes changed, but not others, and the old slave would be broken. The new slave is going to get all new files or hard links for user files, so it would be fine, but we should be able to fall back to the old slaves, and having them in sync allows that. Also, since there was concern about the instructions, I am thinking of applying the patch only to head for 9.5, and then blog about it if people want to test it. Am I right that if you are using hot standby with both streaming replication and WAL shipping you do still need to take full backup of master after using pg_upgrade? No, you would not need to take a full backup if you use these instructions. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Thu, Feb 19, 2015 at 09:35:02PM -0500, David Steele wrote: On 2/19/15 11:57 AM, Bruce Momjian wrote: On Wed, Jan 28, 2015 at 09:26:11PM -0800, Josh Berkus wrote: 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. Now that 9.4.1 is released, I would like to get this doc patch applied --- it will close the often-requested feature of how to pg_upgrade slave clusters. I wasn't happy with Josh's specification above that the replica is not very lagged, so I added a bullet point to check the pg_controldata output to verify that the primary and standby servers are synchronized. Yes, this adds even more complication to the pg_upgrade instructions, but it is really more of the same complexity. pg_upgrade really needs an install-aware and OS-aware tool on top of it to automate much of this. #3 bothered me as well because it was not specific enough. I like what you've added to clarify the procedure. Good. It took me a while to understand why they have to be in sync --- because we are using rsync in size-only-comparison mode, if they are not in sync we might update some files whose sizes changed, but not others, and the old slave would be broken. The new slave is going to get all new files or hard links for user files, so it would be fine, but we should be able to fall back to the old slaves, and having them in sync allows that. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Fri, Feb 20, 2015 at 09:45:08AM -0500, Bruce Momjian wrote: #3 bothered me as well because it was not specific enough. I like what you've added to clarify the procedure. Good. It took me a while to understand why they have to be in sync --- because we are using rsync in size-only-comparison mode, if they are not in sync we might update some files whose sizes changed, but not others, and the old slave would be broken. The new slave is going to get all new files or hard links for user files, so it would be fine, but we should be able to fall back to the old slaves, and having them in sync allows that. Also, since there was concern about the instructions, I am thinking of applying the patch only to head for 9.5, and then blog about it if people want to test it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 2/19/15 11:57 AM, Bruce Momjian wrote: On Wed, Jan 28, 2015 at 09:26:11PM -0800, Josh Berkus wrote: 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. Now that 9.4.1 is released, I would like to get this doc patch applied --- it will close the often-requested feature of how to pg_upgrade slave clusters. I wasn't happy with Josh's specification above that the replica is not very lagged, so I added a bullet point to check the pg_controldata output to verify that the primary and standby servers are synchronized. Yes, this adds even more complication to the pg_upgrade instructions, but it is really more of the same complexity. pg_upgrade really needs an install-aware and OS-aware tool on top of it to automate much of this. #3 bothered me as well because it was not specific enough. I like what you've added to clarify the procedure. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On Wed, Jan 28, 2015 at 09:26:11PM -0800, Josh Berkus wrote: So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. So, here's the correct steps for Bruce, because his current doc does not cover all of these. I really think this should go in as a numbered set of steps; the current doc has some steps as steps, and other stuff buried in paragraphs. 1. Install the new version binaries on both servers, alongside the old version. 2. If not done by the package install, initdb the new version's data directory. 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. Now that 9.4.1 is released, I would like to get this doc patch applied --- it will close the often-requested feature of how to pg_upgrade slave clusters. I wasn't happy with Josh's specification above that the replica is not very lagged, so I added a bullet point to check the pg_controldata output to verify that the primary and standby servers are synchronized. Yes, this adds even more complication to the pg_upgrade instructions, but it is really more of the same complexity. pg_upgrade really needs an install-aware and OS-aware tool on top of it to automate much of this. Patch attached. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server just long enough to do a second applicationrsync/. The !second applicationrsync/ will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server long enough to do an commandrsync --checksum/. !(option--checksum/ is necessary because commandrsync/ only !has file modification-time granularity of one second.) The !second applicationrsync/ will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..d1c26df *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** NET STOP postgresql-8.4 *** 315,320 --- 315,324 NET STOP postgresql-9.0 /programlisting /para + + para + Log-shipping standby servers can remain running until a later step. + /para /step step *** pg_upgrade.exe *** 399,404 --- 403,525 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed on all +standby servers. + /para + /step + + step + titleMake sure the new standby data directories do emphasisnot/ + exist/title + + para +Make sure the new standby data directories do emphasisnot/ +exist or are empty. If applicationinitdb/ was run, delete +the standby server data directories. + /para + /step + + step + titleInstall custom shared object files/title + + para +Install the same custom shared object files on the new standbys +that you installed in the new master cluster. + /para + /step + + step + titleStop standby servers/title + + para +If the standby servers are still running, stop them now using the +above instructions. + /para + /step + + step +
Re: [HACKERS] pg_upgrade and rsync
On Wed, Jan 28, 2015 at 09:26:11PM -0800, Josh Berkus wrote: 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. Is this necessary. It seems quite imprecise too. 4. Shut down the master using -m fast or -m smart for a clean shutdown. It is not necessary to shut down the replicas yet. We already give instructions on how to shut down the server in the pg_ugprade docs. 5. pg_upgrade the master using the --link option. Do not start the new version yet. Stephen mentioned that --link is not clear in the old docs --- I fixed that. 6. create a data directory for the new version on the replica. This directory should be empty; if it was initdb'd by the installation package, then delete its contents. rsync will create this for you. 10. Start the master, then the replica I have incorporated all your suggestions in the attached patch. I also split items into separate sections as you suggested. You can read the end result here: http://momjian.us/tmp/pgsql/pgupgrade.html -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server just long enough to do a second applicationrsync/. The !second applicationrsync/ will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server long enough to do an commandrsync --checksum/. !(option--checksum/ is necessary because commandrsync/ only !has file modification-time granularity of one second.) The !second applicationrsync/ will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..a97a393 *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** NET STOP postgresql-8.4 *** 315,320 --- 315,324 NET STOP postgresql-9.0 /programlisting /para + + para + Log-shipping standby servers can remain running until a later step. + /para /step step *** pg_upgrade.exe *** 399,404 --- 403,513 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed on all +standby servers. + /para + /step + + step + titleMake sure the new standby data directories do emphasisnot/ + exist/title + + para +Make sure the new standby data directories do emphasisnot/ +exist or are empty. If applicationinitdb/ was run, delete +the standby server data directories. + /para + /step + + step + titleInstall custom shared object files/title + + para +Install the same custom shared object files on the new standbys +that you installed in the new master cluster. + /para + /step + + step + titleStop standby servers/title + + para +If the standby servers are still running, stop them now using the +above instructions. + /para + /step + + step + titleSave configuration files/title + + para +Save any configuration files from the standbys you need to keep, +e.g. filenamepostgresql.conf/, literalrecovery.conf/, +as these will be overwritten or removed in the next step. + /para + /step + + step + titleRun applicationrsync//title + + para +From a directory that is above the old and new database cluster +directories, run this
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 29, 2015 at 10:21:30AM -0500, Andrew Dunstan wrote: On 01/29/2015 12:26 AM, Josh Berkus wrote: So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. So, here's the correct steps for Bruce, because his current doc does not cover all of these. I really think this should go in as a numbered set of steps; the current doc has some steps as steps, and other stuff buried in paragraphs. 1. Install the new version binaries on both servers, alongside the old version. 2. If not done by the package install, initdb the new version's data directory. 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. 4. Shut down the master using -m fast or -m smart for a clean shutdown. It is not necessary to shut down the replicas yet. 5. pg_upgrade the master using the --link option. Do not start the new version yet. 6. create a data directory for the new version on the replica. This directory should be empty; if it was initdb'd by the installation package, then delete its contents. 7. shut down postgres on the replica. 8. rsync both the old and new data directories from the master to the replica, using the --size-only and -H hard links options. For example, if both 9.3 and 9.4 are in /var/lib/postgresql, do: rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/ replica-host:/var/lib/postgresql/ 9. Create a recovery.conf file in the replica's data directory with the appropriate parameters. 10. Start the master, then the replica I find steps 2 and 6 confusing. For number 2, he is creating a new cluster on the master server. For #6, he is just creating an empty data directory, though this is not required as rsync will create the directory for you. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Tue, Jan 27, 2015 at 10:16:48PM -0500, David Steele wrote: This is definitely an edge case. Not only does the file have to be modified in the same second *after* rsync has done the copy, but the file also has to not be modified in *any other subsequent second* before the next incremental backup. If the file is busy enough to have a collision with rsync in that second, then it is very likely to be modified before the next incremental backup which is generally a day or so later. And, of course, the backup where the issue occurs is fine - it's the next backup that is invalid. However, the hot/cold backup scheme as documented does make the race condition more likely since the two backups are done in close proximity temporally. Ultimately, the most reliable method is to use checksums. For me the biggest issue is that there is no way to discover if a db in consistent no matter how much time/resources you are willing to spend. I could live with the idea of the occasional bad backup (since I keep as many as possible), but having no way to know whether it is good or not is very frustrating. I know data checksums are a step in that direction, but they are a long way from providing the optimal solution. I've implemented rigorous checksums in PgBackRest but something closer to the source would be even better. Agreed. I have update the two mentions of rsync in our docs to clarify this. Thank you. The patch also has pg_upgrade doc improvements suggested by comments from Josh Berkus. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server just long enough to do a second applicationrsync/. The !second applicationrsync/ will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 Another option is to use applicationrsync/ to perform a file system backup. This is done by first running applicationrsync/ while the database server is running, then shutting down the database !server long enough to do an commandrsync --checksum/. !(option--checksum/ is necessary because commandrsync/ only !has file modification-time granularity of one second.) The !second applicationrsync/ will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..ed65def *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** pg_upgrade.exe *** 409,414 --- 409,504 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed +on all the standby servers. Do emphasisnot/ run +applicationinitdb/. If applicationinitdb/ was run, delete +the standby server data directories. Also, install any custom +shared object files on the new standbys that you installed in the +new master cluster. + /para + /step + + step + titleShutdown the Standby Servers/title + + para +If the standby servers are still running, shut them down. Save any +configuration files from the standbys you need to keep, e.g. +filenamepostgresql.conf/, literalrecovery.conf/, as these +will be overwritten or removed in the next step. + /para + /step + + step + titleRun applicationrsync//title + + para +From a directory that is above the old and new database cluster +directories, run this for each slave: + + programlisting +rsync --archive --hard-links --size-only old_pgdata new_pgdata remote_dir + /programlisting + +where optionold_pgdata/ and optionnew_pgdata/ are relative +to
Re: [HACKERS] pg_upgrade and rsync
On 01/29/2015 12:26 AM, Josh Berkus wrote: So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. So, here's the correct steps for Bruce, because his current doc does not cover all of these. I really think this should go in as a numbered set of steps; the current doc has some steps as steps, and other stuff buried in paragraphs. 1. Install the new version binaries on both servers, alongside the old version. 2. If not done by the package install, initdb the new version's data directory. 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. 4. Shut down the master using -m fast or -m smart for a clean shutdown. It is not necessary to shut down the replicas yet. 5. pg_upgrade the master using the --link option. Do not start the new version yet. 6. create a data directory for the new version on the replica. This directory should be empty; if it was initdb'd by the installation package, then delete its contents. 7. shut down postgres on the replica. 8. rsync both the old and new data directories from the master to the replica, using the --size-only and -H hard links options. For example, if both 9.3 and 9.4 are in /var/lib/postgresql, do: rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/ replica-host:/var/lib/postgresql/ 9. Create a recovery.conf file in the replica's data directory with the appropriate parameters. 10. Start the master, then the replica I find steps 2 and 6 confusing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/29/2015 11:34 AM, Bruce Momjian wrote: On Thu, Jan 29, 2015 at 10:21:30AM -0500, Andrew Dunstan wrote: On 01/29/2015 12:26 AM, Josh Berkus wrote: So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. So, here's the correct steps for Bruce, because his current doc does not cover all of these. I really think this should go in as a numbered set of steps; the current doc has some steps as steps, and other stuff buried in paragraphs. 1. Install the new version binaries on both servers, alongside the old version. 2. If not done by the package install, initdb the new version's data directory. 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. 4. Shut down the master using -m fast or -m smart for a clean shutdown. It is not necessary to shut down the replicas yet. 5. pg_upgrade the master using the --link option. Do not start the new version yet. 6. create a data directory for the new version on the replica. This directory should be empty; if it was initdb'd by the installation package, then delete its contents. 7. shut down postgres on the replica. 8. rsync both the old and new data directories from the master to the replica, using the --size-only and -H hard links options. For example, if both 9.3 and 9.4 are in /var/lib/postgresql, do: rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/ replica-host:/var/lib/postgresql/ 9. Create a recovery.conf file in the replica's data directory with the appropriate parameters. 10. Start the master, then the replica I find steps 2 and 6 confusing. For number 2, he is creating a new cluster on the master server. For #6, he is just creating an empty data directory, though this is not required as rsync will create the directory for you. Then step 2 should specify that it's for the master. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote: 7. shut down postgres on the replica. 8. rsync both the old and new data directories from the master to the replica, using the --size-only and -H hard links options. For example, if both 9.3 and 9.4 are in /var/lib/postgresql, do: rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/ replica-host:/var/lib/postgresql/ 9. Create a recovery.conf file in the replica's data directory with the appropriate parameters. 10. Start the master, then the replica I find steps 2 and 6 confusing. For number 2, he is creating a new cluster on the master server. For #6, he is just creating an empty data directory, though this is not required as rsync will create the directory for you. Then step 2 should specify that it's for the master. Right. Josh is just listing all the steps --- the pg_upgrade docs already have that spelled out in detail. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/29/2015 09:11 AM, Bruce Momjian wrote: On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote: Then step 2 should specify that it's for the master. Right. Josh is just listing all the steps --- the pg_upgrade docs already have that spelled out in detail. What I'm also saying is that, if we expect anyone to be able to follow all of these steps, it has to be very explicit; just saying Follow the pg_upgrade docs but don't start the master yet isn't clear enough, because the pg_upgrade docs have a few alternative paths. On the whole, I personally would never follow this procedure at a production site. It's way too fragile and easy to screw up. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 12:42 PM, Josh Berkus wrote: On 01/29/2015 09:11 AM, Bruce Momjian wrote: On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote: Then step 2 should specify that it's for the master. Right. Josh is just listing all the steps --- the pg_upgrade docs already have that spelled out in detail. What I'm also saying is that, if we expect anyone to be able to follow all of these steps, it has to be very explicit; just saying Follow the pg_upgrade docs but don't start the master yet isn't clear enough, because the pg_upgrade docs have a few alternative paths. On the whole, I personally would never follow this procedure at a production site. It's way too fragile and easy to screw up. I'm in agreement with Josh - I would not use this method. I may be wrong, but it makes me extremely nervous. I prefer to upgrade the primary and get it back up as soon as possible, then take a backup and restore it to the replicas. If the replicas are being used for read-only queries instead of just redundancy then I redirect that traffic to the primary while the replicas are being upgraded and restored. This method has the least downtime for the primary. If you want less downtime overall then it's best to use the hot rsync / cold rsync with checksums method, though this depends a lot on the size of your database. Ultimately, there is no single best method. It depends a lot on your environment. I would prefer the official documents to contain very safe methods. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 7:02 PM, David Steele wrote: On 1/29/15 7:55 PM, Jim Nasby wrote: On 1/29/15 6:25 PM, David Steele wrote: Safe backups can be done without LSNs provided you are willing to trust your timestamps. Which AFAICT simply isn't safe to do at all... except maybe with the manifest stuff you've talked about? Yes - that's what I'm talking about. I had hoped to speak about this at PgConfNYC, but perhaps I can do it in a lightning talk instead. Sounds like maybe it should be part of our documentation too... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 5:53 PM, David Steele wrote: On 1/29/15 12:42 PM, Josh Berkus wrote: On 01/29/2015 09:11 AM, Bruce Momjian wrote: On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote: Then step 2 should specify that it's for the master. Right. Josh is just listing all the steps --- the pg_upgrade docs already have that spelled out in detail. What I'm also saying is that, if we expect anyone to be able to follow all of these steps, it has to be very explicit; just saying Follow the pg_upgrade docs but don't start the master yet isn't clear enough, because the pg_upgrade docs have a few alternative paths. On the whole, I personally would never follow this procedure at a production site. It's way too fragile and easy to screw up. I'm in agreement with Josh - I would not use this method. I may be wrong, but it makes me extremely nervous. I prefer to upgrade the primary and get it back up as soon as possible, then take a backup and restore it to the replicas. If the replicas are being used for read-only queries instead of just redundancy then I redirect that traffic to the primary while the replicas are being upgraded and restored. This method has the least downtime for the primary. If you want less downtime overall then it's best to use the hot rsync / cold rsync with checksums method, though this depends a lot on the size of your database. Ultimately, there is no single best method. It depends a lot on your environment. I would prefer the official documents to contain very safe methods. How do we define safe though? Your method leaves you without a backup server until your base backup completes and the replica catches up. I think we do a dis-service to our users by not pointing that out and providing a potential alternate *so long as we spell out the tradeoffs/risks*. Ultimately, I think this thread really shows the very large need for a tool that understands things like LSNs to provide rsync-ish behavior that's actually safe. FWIW, I personally am very leery of relying on pg_upgrade. It's too easy to introduce bugs, doesn't handle all cases, and provides no option for going back to your previous version without losing data. I much prefer old_version -- londiste -- new_version, and then doing the upgrade by reversing the direction of replication. I also don't entirely trust PITR backups. It's too easy to accidentally break them in subtle ways. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 6:25 PM, David Steele wrote: Safe backups can be done without LSNs provided you are willing to trust your timestamps. Which AFAICT simply isn't safe to do at all... except maybe with the manifest stuff you've talked about? FWIW, I personally am very leery of relying on pg_upgrade. It's too easy to introduce bugs, doesn't handle all cases, and provides no option for going back to your previous version without losing data. I much prefer old_version -- londiste -- new_version, and then doing the upgrade by reversing the direction of replication. I think the official docs need to stick with options that are core? I don't think we have any such requirement. IIRC the docs used to talk about using logical replication before we had pg_upgrade (and may have actually called out Slony). I avoid pg_upgrade wherever it is practical. However, sometimes it really is the best option. Certainly. I think what we should be doing is spelling out the available options (with pros/cons) so that users can decide what's best. I also don't entirely trust PITR backups. It's too easy to accidentally break them in subtle ways. Agreed in general, but I've been doing a lot of work to make this not be true anymore. :) I'd love to see all this stuff Just Work (tm), but I don't think we're there yet, and I'm not really sure how we can get there. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 11:34 AM, Bruce Momjian wrote: 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. I think I'd want a something a bit more specific here. When the primary shuts down it will kick out one last WAL. The filename should be recorded. 7. shut down postgres on the replica. Before the shutdown make sure that the replicas are waiting on the subsequent log file to appear (note that versions prior to 9.3 skip 00). That means all WAL has been consumed and the primary and replica(s) are in the same state. This is a bit more complex if streaming replication is being used *without* good old fashioned log shipping to a backup server and I'm not sure exactly how to go about it. I suppose you could start Postgres in single user mode, commit a transaction, and make sure that transaction gets to the replicas. OTOH, streaming replication (unless it is synchronous) would be crazy without doing WAL backup. Maybe that's just me. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 7:55 PM, Jim Nasby wrote: On 1/29/15 6:25 PM, David Steele wrote: Safe backups can be done without LSNs provided you are willing to trust your timestamps. Which AFAICT simply isn't safe to do at all... except maybe with the manifest stuff you've talked about? Yes - that's what I'm talking about. I had hoped to speak about this at PgConfNYC, but perhaps I can do it in a lightning talk instead. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 10:13 AM, Bruce Momjian wrote: Agreed. I have update the two mentions of rsync in our docs to clarify this. Thank you. The patch also has pg_upgrade doc improvements suggested by comments from Josh Berkus. It's very good to see this. Mentions of this rsync vulnerability are few and far between. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 7:07 PM, Jim Nasby wrote: Ultimately, there is no single best method. It depends a lot on your environment. I would prefer the official documents to contain very safe methods. How do we define safe though? Your method leaves you without a backup server until your base backup completes and the replica catches up. I think we do a dis-service to our users by not pointing that out and providing a potential alternate *so long as we spell out the tradeoffs/risks*. My method leaves you without a replica, but not without a *backup* as long as you are shipping WAL somewhere safe. You can set archive_timeout to something small if you want to make this safer. This is more practical in 9.4 since unused WAL space is zeroed. OK, I'm willing to admit it would be better to have the option with all caveats, so long as they are strongly worded. Ultimately, I think this thread really shows the very large need for a tool that understands things like LSNs to provide rsync-ish behavior that's actually safe. Safe backups can be done without LSNs provided you are willing to trust your timestamps. FWIW, I personally am very leery of relying on pg_upgrade. It's too easy to introduce bugs, doesn't handle all cases, and provides no option for going back to your previous version without losing data. I much prefer old_version -- londiste -- new_version, and then doing the upgrade by reversing the direction of replication. I think the official docs need to stick with options that are core? I avoid pg_upgrade wherever it is practical. However, sometimes it really is the best option. I also don't entirely trust PITR backups. It's too easy to accidentally break them in subtle ways. Agreed in general, but I've been doing a lot of work to make this not be true anymore. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/29/15 8:09 PM, Jim Nasby wrote: On 1/29/15 7:02 PM, David Steele wrote: On 1/29/15 7:55 PM, Jim Nasby wrote: On 1/29/15 6:25 PM, David Steele wrote: Safe backups can be done without LSNs provided you are willing to trust your timestamps. Which AFAICT simply isn't safe to do at all... except maybe with the manifest stuff you've talked about? Yes - that's what I'm talking about. I had hoped to speak about this at PgConfNYC, but perhaps I can do it in a lightning talk instead. Sounds like maybe it should be part of our documentation too... I think the warnings Bruce has added to the documentation about using checksums are sufficient for now. The manifest build and delay methodology are part of PgBackRest, the backup solution I'm working on as an alternative to barman, etc. It's not something that can be implemented trivially. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
* Jim Nasby (jim.na...@bluetreble.com) wrote: On 1/27/15 9:29 AM, Stephen Frost wrote: My point is that Bruce's patch suggests looking for remote_dir in the rsync documentation, but no such term appears there. Ah, well, perhaps we could simply add a bit of clarification to this: for details on specifying optionremote_dir/ The whole remote_dir discussion made me think of something... would --link-dest be any help here? No. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
Bruce, * Bruce Momjian (br...@momjian.us) wrote: On Tue, Jan 27, 2015 at 09:36:58AM -0500, Stephen Frost wrote: The example listed works, but only when it's a local rsync: rsync --archive --hard-links --size-only old_dir new_dir remote_dir Perhaps a better example (or additional one) would be with a remote rsync, including clarification of old and new dir, like so: (run in /var/lib/postgresql) rsync --archive --hard-links --size-only \ 9.3/main \ 9.4/main \ server:/var/lib/postgresql/ Note that 9.3/main and 9.4/main are two source directories for rsync to copy over, while server:/var/lib/postgresql/ is a remote destination directory. The above directories match a default Debian/Ubuntu install. OK, sorry everyone was confused by 'remote_dir'. Does this new patch help? Looks better, but --links is not the same as --hard-links. The example is right, the but documentation below it mentions option--link/ which is for symlinks, not hard links. This also should really include a discussion about dealing with tablespaces, since the example command won't deal with them. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
* Bruce Momjian (br...@momjian.us) wrote: Interesting problem, but doesn't rsync use sub-second accuracy? No. Simple test will show: touch xx/aa ; rsync -avv xx yy ; sleep 0.5 ; touch xx/aa ; rsync -avv xx yy Run that a few times and you'll see it report xx/aa is uptodate sometimes, depending on when exactly where the sleep falls during the second. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
Bruce, Stephen, etc.: So, I did a test trial of this and it seems like it didn't solve the issue of huge rsyncs. That is, the only reason to do this whole business via rsync, instead of doing a new basebackup of each replica, is to cut down on data transfer time by not resyncing the data from the old base directory. But in practice, the majority of the database files seem like they get transmitted anyway. Maybe I'm misreading the rsync ouput? Here's the setup: 3 Ubuntu 14.04 servers on AWS (tiny instance) Running PostgreSQL 9.3.5 Set up in cascading replication 108 -- 107 -- 109 The goal was to test this with cascading, but I didn't get that far. I set up a pgbench workload, read-write on the master and read-only on the two replicas, to simulate a load-balanced workload. I was *not* logging hint bits. I then followed this sequence: 1) Install 9.4 packages on all servers. 2) Shut down the master. 3) pg_upgrade the master using --link 4) shut down replica 107 5) rsync the master's $PGDATA from the replica: rsync -aHv --size-only -e ssh --itemize-changes 172.31.4.108:/var/lib/postgresql/ /var/lib/postgresql/ ... and got: .d..t.. 9.4/main/pg_xlog/ f+ 9.4/main/pg_xlog/0007000100CB .d..t.. 9.4/main/pg_xlog/archive_status/ sent 126892 bytes received 408645000 bytes 7640596.11 bytes/sec total size is 671135675 speedup is 1.64 So that's 390MB of data transfer. If I look at the original directory: postgres@paul: du --max-depth=1 -h 4.0K./.cache 20K ./.ssh 424M./9.3 4.0K./.emacs.d 51M ./9.4 56K ./bench 474M. So 390MB were transferred out of a possible 474MB. That certainly seems like we're still transferring the majority of the data, even though I verified that the hard links are being sent as hard links. No? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/28/2015 02:10 PM, Josh Berkus wrote: So 390MB were transferred out of a possible 474MB. That certainly seems like we're still transferring the majority of the data, even though I verified that the hard links are being sent as hard links. No? Looks like the majority of that was pg_xlog. Going to tear this down and start over, and --exclude pg_xlog. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/28/2015 02:28 PM, Josh Berkus wrote: On 01/28/2015 02:10 PM, Josh Berkus wrote: So 390MB were transferred out of a possible 474MB. That certainly seems like we're still transferring the majority of the data, even though I verified that the hard links are being sent as hard links. No? Looks like the majority of that was pg_xlog. Going to tear this down and start over, and --exclude pg_xlog. So, having redone this without the pg_xlog lag, this appears to work in terms of cutting down the rsync volume. I'm concerned about putting this in the main docs, though. This is a complex, and fragile procedure, which is very easy to get wrong, and hard to explain for a generic case. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Josh Berkus (j...@agliodbs.com) wrote: On 01/28/2015 02:28 PM, Josh Berkus wrote: On 01/28/2015 02:10 PM, Josh Berkus wrote: So 390MB were transferred out of a possible 474MB. That certainly seems like we're still transferring the majority of the data, even though I verified that the hard links are being sent as hard links. No? Looks like the majority of that was pg_xlog. Going to tear this down and start over, and --exclude pg_xlog. So, having redone this without the pg_xlog lag, this appears to work in terms of cutting down the rsync volume. I'm concerned about putting this in the main docs, though. This is a complex, and fragile procedure, which is very easy to get wrong, and hard to explain for a generic case. So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
So, for my 2c, I'm on the fence about it. On the one hand, I agree, it's a bit of a complex process to get right. On the other hand, it's far better if we put something out there along the lines of if you really want to, this is how to do it than having folks try to fumble through to find the correct steps themselves. So, here's the correct steps for Bruce, because his current doc does not cover all of these. I really think this should go in as a numbered set of steps; the current doc has some steps as steps, and other stuff buried in paragraphs. 1. Install the new version binaries on both servers, alongside the old version. 2. If not done by the package install, initdb the new version's data directory. 3. Check that the replica is not very lagged. If it is, wait for traffic to die down and for it to catch up. 4. Shut down the master using -m fast or -m smart for a clean shutdown. It is not necessary to shut down the replicas yet. 5. pg_upgrade the master using the --link option. Do not start the new version yet. 6. create a data directory for the new version on the replica. This directory should be empty; if it was initdb'd by the installation package, then delete its contents. 7. shut down postgres on the replica. 8. rsync both the old and new data directories from the master to the replica, using the --size-only and -H hard links options. For example, if both 9.3 and 9.4 are in /var/lib/postgresql, do: rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/ replica-host:/var/lib/postgresql/ 9. Create a recovery.conf file in the replica's data directory with the appropriate parameters. 10. Start the master, then the replica -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Sat, Jan 24, 2015 at 10:04 PM, Bruce Momjian br...@momjian.us wrote: On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote: You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. That still doesn't address the need to use --size-only, it would just mean that you don't need to use -H. If anything the -H part is the aspect which worries me the least about this approach. I can now confirm that it works, just as Stephen said. I was able to upgrade a standby cluster that contained the regression database, and the pg_dump output was perfect. I am attaching doc instruction that I will add to all branches as soon as someone else confirms my results. You will need to use rsync --itemize-changes to see the hard links being created, e.g.: hf+ pgsql/data/base/16415/28188 = pgsql.old/data/base/16384/28188 My rsync manual page (on two different systems) mentions nothing about remote_dir, so I'd be quite unable to follow your proposed directions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/27/15 6:09 PM, Jim Nasby wrote: The whole remote_dir discussion made me think of something... would --link-dest be any help here? I'm pretty sure --link-dest would not be effective in this case. The problem exists on the source side and --link-dest only operates on the destination. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
Robert Haas robertmh...@gmail.com writes: On Fri, Jan 23, 2015 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote: I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. +1. That's certainly impossible for the system catalogs, which means you have to be able to deal with relfilenode discrepancies for them, which means that maintaining the same relfilenodes for user tables is of dubious value. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Tue, Jan 27, 2015 at 9:36 AM, Stephen Frost sfr...@snowman.net wrote: The example listed works, but only when it's a local rsync: rsync --archive --hard-links --size-only old_dir new_dir remote_dir Perhaps a better example (or additional one) would be with a remote rsync, including clarification of old and new dir, like so: (run in /var/lib/postgresql) rsync --archive --hard-links --size-only \ 9.3/main \ 9.4/main \ server:/var/lib/postgresql/ Note that 9.3/main and 9.4/main are two source directories for rsync to copy over, while server:/var/lib/postgresql/ is a remote destination directory. The above directories match a default Debian/Ubuntu install. My point is that Bruce's patch suggests looking for remote_dir in the rsync documentation, but no such term appears there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Tue, Jan 27, 2015 at 9:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jan 23, 2015 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote: I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. +1. That's certainly impossible for the system catalogs, which means you have to be able to deal with relfilenode discrepancies for them, which means that maintaining the same relfilenodes for user tables is of dubious value. Why is that impossible for the system catalogs? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Robert Haas (robertmh...@gmail.com) wrote: On Sat, Jan 24, 2015 at 10:04 PM, Bruce Momjian br...@momjian.us wrote: On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote: You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. That still doesn't address the need to use --size-only, it would just mean that you don't need to use -H. If anything the -H part is the aspect which worries me the least about this approach. I can now confirm that it works, just as Stephen said. I was able to upgrade a standby cluster that contained the regression database, and the pg_dump output was perfect. I am attaching doc instruction that I will add to all branches as soon as someone else confirms my results. You will need to use rsync --itemize-changes to see the hard links being created, e.g.: hf+ pgsql/data/base/16415/28188 = pgsql.old/data/base/16384/28188 My rsync manual page (on two different systems) mentions nothing about remote_dir, so I'd be quite unable to follow your proposed directions. The example listed works, but only when it's a local rsync: rsync --archive --hard-links --size-only old_dir new_dir remote_dir Perhaps a better example (or additional one) would be with a remote rsync, including clarification of old and new dir, like so: (run in /var/lib/postgresql) rsync --archive --hard-links --size-only \ 9.3/main \ 9.4/main \ server:/var/lib/postgresql/ Note that 9.3/main and 9.4/main are two source directories for rsync to copy over, while server:/var/lib/postgresql/ is a remote destination directory. The above directories match a default Debian/Ubuntu install. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/27/15 9:29 AM, Stephen Frost wrote: My point is that Bruce's patch suggests looking for remote_dir in the rsync documentation, but no such term appears there. Ah, well, perhaps we could simply add a bit of clarification to this: for details on specifying optionremote_dir/ The whole remote_dir discussion made me think of something... would --link-dest be any help here? --link-dest=DIR This option behaves like --copy-dest, but unchanged files are hard linked from DIR to the des- tination directory. The files must be identical in all preserved attributes (e.g. permis- sions, possibly ownership) in order for the files to be linked together. An example: rsync -av --link-dest=$PWD/prior_dir host:src_dir/ new_dir/ -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Mon, Jan 26, 2015 at 05:41:59PM -0500, Stephen Frost wrote: I've thought about it a fair bit actually and I agree that there is some risk to using rsync for *incremental* base backups. That is, you have a setup where you loop with: pg_start_backup rsync - dest pg_stop_backup without using -I, changing what 'dest' is, or making sure it's empty every time. The problem is the 1s-level granularity used on the timestamp. A possible set of operations, all within 1s, is: file changed rsync starts copying the file file changed again (somewhere prior to where rsync is at) rsync finishes the file copy Now, this isn't actually a problem for the first time that file is backed up- the issue is if that file isn't changed again. rsync won't re-copy it, but that change that rsync missed won't be in the WAL history for the *second* backup that's done (only the first), leading to a case where that file would end up corrupted. Interesting problem, but doesn't rsync use sub-second accuracy? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/27/15 9:51 PM, Bruce Momjian wrote: According to my empirical testing on Linux and OSX the answer is no: rsync does not use sub-second accuracy. This seems to be true even on file systems like ext4 that support millisecond mod times, at least it was true on Ubuntu 12.04 running ext4. Even on my laptop there is a full half-second of vulnerability for rsync. Faster systems may have a larger window. OK, bummer. Well, I don't think we ever recommend to run rsync without checksums, but the big problem is that rsync doesn't do checksums by default. :-( pg_upgrade recommends using two rsyncs: To make a valid copy of the old cluster, use commandrsync/ to create a dirty copy of the old cluster while the server is running, then shut down the old server and run commandrsync/ again to update the copy with any changes to make it consistent. You might want to exclude some I am afraid that will not work as it could miss changes, right? When would the default mod-time checking every be safe? According to my testing the default mod-time checking is never completely safe in rsync. I've worked around this in PgBackRest by building the manifest and then waiting until the start of the next second before starting to copy. It was the only way I could make the incremental backups reliable without requiring checksums (which are optional as in rsync for performance). Of course, you still have to trust the clock for this to work. This is definitely an edge case. Not only does the file have to be modified in the same second *after* rsync has done the copy, but the file also has to not be modified in *any other subsequent second* before the next incremental backup. If the file is busy enough to have a collision with rsync in that second, then it is very likely to be modified before the next incremental backup which is generally a day or so later. And, of course, the backup where the issue occurs is fine - it's the next backup that is invalid. However, the hot/cold backup scheme as documented does make the race condition more likely since the two backups are done in close proximity temporally. Ultimately, the most reliable method is to use checksums. For me the biggest issue is that there is no way to discover if a db in consistent no matter how much time/resources you are willing to spend. I could live with the idea of the occasional bad backup (since I keep as many as possible), but having no way to know whether it is good or not is very frustrating. I know data checksums are a step in that direction, but they are a long way from providing the optimal solution. I've implemented rigorous checksums in PgBackRest but something closer to the source would be even better. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On Fri, Jan 23, 2015 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-01-22 20:54:47 -0500, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
Robert Haas robertmh...@gmail.com writes: On Tue, Jan 27, 2015 at 9:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: That's certainly impossible for the system catalogs, which means you have to be able to deal with relfilenode discrepancies for them, which means that maintaining the same relfilenodes for user tables is of dubious value. Why is that impossible for the system catalogs? New versions aren't guaranteed to have the same system catalogs, let alone the same relfilenodes for them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-27 10:20:48 -0500, Robert Haas wrote: On Tue, Jan 27, 2015 at 9:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jan 23, 2015 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote: I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. +1. That's certainly impossible for the system catalogs, which means you have to be able to deal with relfilenode discrepancies for them, which means that maintaining the same relfilenodes for user tables is of dubious value. Why is that impossible for the system catalogs? Maybe it's not impossible for existing catalogs, but it's certainly complicated. But I don't think it's all that desirable anyway - they're not the same relation after the pg_upgrade anyway (initdb/pg_dump filled them). That's different for the user defined relations. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Tom Lane (t...@sss.pgh.pa.us) wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jan 27, 2015 at 9:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: That's certainly impossible for the system catalogs, which means you have to be able to deal with relfilenode discrepancies for them, which means that maintaining the same relfilenodes for user tables is of dubious value. Why is that impossible for the system catalogs? New versions aren't guaranteed to have the same system catalogs, let alone the same relfilenodes for them. Indeed, new versions almost certainly have wholly new system catalogs. While there might be a reason to keep the relfilenodes the same, it doesn't actually help with the pg_upgrade use-case we're currently discussing (at least, not without additional help). The problem is that we certainly must transfer all the new catalogs, but how would rsync know that those catalog files have to be transferred but not the user relations? Using --size-only would mean that system catalogs whose sizes happen to match after the upgrade wouldn't be transferred and that would certainly lead to a corrupt situation. Andres proposed a helper script which would go through the entire tree on the remote side and set all the timestamps on the remote side to match those on the local side (prior to the pg_upgrade). If all the relfilenodes remained the same and the timestamps on the catalog tables all changed then it might work to do (without using --size-only): stop-cluster set-timestamp-script pg_upgrade rsync new_data_dir - remote:existing_cluster This would mean that any other files which happened to be changed by pg_upgrade beyond the catalog tables would also get copied across. The issue that I see with that is that if the pg_upgrade process does touch anything outside of the system catalogs, then its documented revert mechanism (rename the control file and start the old cluster back up, prior to having started the new cluster) wouldn't be valid. Requiring an extra script which runs around changing timestamps on files is a bit awkward too, though I suppose possible, and then we'd also have to document that this process only works with $version of pg_upgrade that does the preservation of the relfilenodes. I suppose there's also technically a race condition to consider, if the whole thing is scripted and pg_upgrade manages to change an existing file in the same second that the old cluster did then that file wouldn't be recognized by the rsync as having been updated. That's not too hard to address though- just wait a second somewhere in there. Still, I'm not really sure that this approach really gains us much over the approach that Bruce is proposing. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
* Robert Haas (robertmh...@gmail.com) wrote: On Tue, Jan 27, 2015 at 9:36 AM, Stephen Frost sfr...@snowman.net wrote: The example listed works, but only when it's a local rsync: rsync --archive --hard-links --size-only old_dir new_dir remote_dir Perhaps a better example (or additional one) would be with a remote rsync, including clarification of old and new dir, like so: (run in /var/lib/postgresql) rsync --archive --hard-links --size-only \ 9.3/main \ 9.4/main \ server:/var/lib/postgresql/ Note that 9.3/main and 9.4/main are two source directories for rsync to copy over, while server:/var/lib/postgresql/ is a remote destination directory. The above directories match a default Debian/Ubuntu install. My point is that Bruce's patch suggests looking for remote_dir in the rsync documentation, but no such term appears there. Ah, well, perhaps we could simply add a bit of clarification to this: for details on specifying optionremote_dir/ like so: for details on specifying the destination optionremote_dir/ ? On my system, the rsync man page has '[DEST]' in the synopsis, but it doesn't actually go on to specifically define what 'DEST' is, rather referring to it later as 'destination' or 'remote directory'. I'm sure other suggestions would be welcome if they'd help clarify. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On Tue, Jan 27, 2015 at 09:36:58AM -0500, Stephen Frost wrote: The example listed works, but only when it's a local rsync: rsync --archive --hard-links --size-only old_dir new_dir remote_dir Perhaps a better example (or additional one) would be with a remote rsync, including clarification of old and new dir, like so: (run in /var/lib/postgresql) rsync --archive --hard-links --size-only \ 9.3/main \ 9.4/main \ server:/var/lib/postgresql/ Note that 9.3/main and 9.4/main are two source directories for rsync to copy over, while server:/var/lib/postgresql/ is a remote destination directory. The above directories match a default Debian/Ubuntu install. OK, sorry everyone was confused by 'remote_dir'. Does this new patch help? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..4e4fe64 *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** pg_upgrade.exe *** 409,414 --- 409,486 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed +on all the standby servers. Do emphasisnot/ run +applicationinitdb/. If applicationinitdb/ was run, delete +the standby server data directories. Also, install any custom +shared object files on the new standbys that you installed in the +new master cluster. + /para + /step + + step + titleRun applicationrsync//title + + para +From a directory that is above the old and new database cluster +directories, run this for each slave: + + programlisting +rsync --archive --hard-links --size-only old_dir new_dir remote_dir + /programlisting + +where optionold_dir/ and optionnew_dir/ are relative +to the current directory, and optionremote_dir/ is +emphasisabove/ the old and new cluster directories on +the standby server. The old and new relative cluster paths +must match on the master and standby server. Consult the +applicationrsync/ manual page for details on specifying the +remote directory, e.g. literalslavehost:/var/lib/postgresql//. +applicationrsync/ will be fast when option--link/ mode is +used because it will create hard links on the remote server rather +than transfering user data. + /para + /step + + step + titleConfigure log-shipping to standby servers/title + + para +Configure the servers for log shipping. (You do not need to run +functionpg_start_backup()/ and functionpg_stop_backup()/ +or take a file system backup as the slaves are still sychronized +with the master.) + /para + /step + + /procedure + +/step + +step + titleStart the new server/title + + para + The new server and any applicationrsync/'ed standby servers can + now be safely started. + /para +/step + +step titlePost-Upgrade processing/title para *** psql --username postgres --file script.s *** 548,562 /para para -A Log-Shipping Standby Server (xref linkend=warm-standby) cannot -be upgraded because the server must allow writes. The simplest way -is to upgrade the primary and use commandrsync/ to rebuild the -standbys. You can run commandrsync/ while the primary is down, -or as part of a base backup (xref linkend=backup-base-backup) -which overwrites the old standby cluster. - /para - - para If you want to use link mode and you do not want your old cluster to be modified when the new cluster is started, make a copy of the old cluster and upgrade that in link mode. To make a valid copy --- 620,625 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Tue, Jan 27, 2015 at 09:44:51PM -0500, David Steele wrote: On 1/27/15 9:32 PM, Bruce Momjian wrote Now, this isn't actually a problem for the first time that file is backed up- the issue is if that file isn't changed again. rsync won't re-copy it, but that change that rsync missed won't be in the WAL history for the *second* backup that's done (only the first), leading to a case where that file would end up corrupted. Interesting problem, but doesn't rsync use sub-second accuracy? According to my empirical testing on Linux and OSX the answer is no: rsync does not use sub-second accuracy. This seems to be true even on file systems like ext4 that support millisecond mod times, at least it was true on Ubuntu 12.04 running ext4. Even on my laptop there is a full half-second of vulnerability for rsync. Faster systems may have a larger window. OK, bummer. Well, I don't think we ever recommend to run rsync without checksums, but the big problem is that rsync doesn't do checksums by default. :-( pg_upgrade recommends using two rsyncs: To make a valid copy of the old cluster, use commandrsync/ to create a dirty copy of the old cluster while the server is running, then shut down the old server and run commandrsync/ again to update the copy with any changes to make it consistent. You might want to exclude some I am afraid that will not work as it could miss changes, right? When would the default mod-time checking every be safe? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/27/15 9:32 PM, Bruce Momjian wrote Now, this isn't actually a problem for the first time that file is backed up- the issue is if that file isn't changed again. rsync won't re-copy it, but that change that rsync missed won't be in the WAL history for the *second* backup that's done (only the first), leading to a case where that file would end up corrupted. Interesting problem, but doesn't rsync use sub-second accuracy? According to my empirical testing on Linux and OSX the answer is no: rsync does not use sub-second accuracy. This seems to be true even on file systems like ext4 that support millisecond mod times, at least it was true on Ubuntu 12.04 running ext4. Even on my laptop there is a full half-second of vulnerability for rsync. Faster systems may have a larger window. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/23/15 12:40 PM, Stephen Frost wrote: That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not that big a deal, but when dealing with fixed-size data (ie: Postgres blocks)? Eww. The race condition is a problem for pg_start/stop_backup and friends. In this instance, everything will be shut down when the rsync is running, so there isn't a timestamp race condition to worry about. Yeah, I'm more concerned about people that use rsync to take base backups. Do we need to explicitly advise against that? Is there a way to work around this with a sleep after pg_start_backup to make sure all timestamps must be different? (Admittedly I haven't fully wrapped my head around this yet.) How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell script instead of directly modifying things itself? Or perhaps some custom command file that could then be replayed by pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to work... Yeah, I had suggested that to Bruce also, but it's not clear why that would be any different from an rsync --size-only in the end, presuming everything went according to plan. Yeah, if everything is shut down maybe we're OK. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/26/15 5:11 PM, Jim Nasby wrote: The race condition is a problem for pg_start/stop_backup and friends. In this instance, everything will be shut down when the rsync is running, so there isn't a timestamp race condition to worry about. Yeah, I'm more concerned about people that use rsync to take base backups. Do we need to explicitly advise against that? Is there a way to work around this with a sleep after pg_start_backup to make sure all timestamps must be different? (Admittedly I haven't fully wrapped my head around this yet.) A sleep in pg_start_backup() won't work. The race condition is in rsync if the file is modified in the same second after it is copied. Waiting until the beginning of the next second in pg_start_backup() would actually make a bigger window where the issue can occur. I solved this problem in PgBackRest (an alternative to barman, etc.) by waiting the remainder of the second after the manifest is built before copying. That way, if a file is modified in the second after the manifest is built that later version will still be copied. Any mods after that will be copied in the next backup (as they should be). PgBackRest does not use rsync, tar, etc.) so I was able to code around the issue. The interesting thing about this race condition is that it does not affect the backup where it occurs. It affects the next backup when the modified file does not get copied because the timestamp is the same as the previous backup. Of course using checksums will solve the problem in rsync but that's expensive. Thus my comment earlier that the hot rsync / cold rsync method is not absolutely safe. If you do checksums on the cold rsync then you might as well just use them the first time - you'll have the same downtime either way. I've written tests to show the rsync vulnerability and another to show that this can affect a running database. However, to reproduce it reliably you need to force a checkpoint or have them happening pretty close together. -- - David Steele da...@pgmasters.ne signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
Jim, * Jim Nasby (jim.na...@bluetreble.com) wrote: On 1/23/15 12:40 PM, Stephen Frost wrote: That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not that big a deal, but when dealing with fixed-size data (ie: Postgres blocks)? Eww. The race condition is a problem for pg_start/stop_backup and friends. In this instance, everything will be shut down when the rsync is running, so there isn't a timestamp race condition to worry about. Yeah, I'm more concerned about people that use rsync to take base backups. Do we need to explicitly advise against that? Is there a way to work around this with a sleep after pg_start_backup to make sure all timestamps must be different? (Admittedly I haven't fully wrapped my head around this yet.) I've thought about it a fair bit actually and I agree that there is some risk to using rsync for *incremental* base backups. That is, you have a setup where you loop with: pg_start_backup rsync - dest pg_stop_backup without using -I, changing what 'dest' is, or making sure it's empty every time. The problem is the 1s-level granularity used on the timestamp. A possible set of operations, all within 1s, is: file changed rsync starts copying the file file changed again (somewhere prior to where rsync is at) rsync finishes the file copy Now, this isn't actually a problem for the first time that file is backed up- the issue is if that file isn't changed again. rsync won't re-copy it, but that change that rsync missed won't be in the WAL history for the *second* backup that's done (only the first), leading to a case where that file would end up corrupted. This is a pretty darn narrow situation and one that I doubt many people will hit, but I do think it's possible. A way to address this would be to grab all timestamps for all files at the start of the backup and re-copy any files whose times are changed after that point (or which were being changed at the time the check was done, or perhaps simply any file which has a timestamp after the starting timestamp of the backup). How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell script instead of directly modifying things itself? Or perhaps some custom command file that could then be replayed by pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to work... Yeah, I had suggested that to Bruce also, but it's not clear why that would be any different from an rsync --size-only in the end, presuming everything went according to plan. Yeah, if everything is shut down maybe we're OK. Regarding this, yes, I think it 'should' work, but it would definitely be good to test it quite a bit before relying on it.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/26/15 5:08 PM, David Steele wrote: I've written tests to show the rsync vulnerability and another to show that this can affect a running database. However, to reproduce it reliably you need to force a checkpoint or have them happening pretty close together. Related to this and Stephen's comment about testing... ISTM it would be very useful to have a published suite of tests for PITR backups, perhaps even utilizing special techniques in Postgres to expose potential failure conditions. Similarly, it'd also be nice to have a suite of tests you could run to validate a backup that you've restored. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote: You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. That still doesn't address the need to use --size-only, it would just mean that you don't need to use -H. If anything the -H part is the aspect which worries me the least about this approach. I can now confirm that it works, just as Stephen said. I was able to upgrade a standby cluster that contained the regression database, and the pg_dump output was perfect. I am attaching doc instruction that I will add to all branches as soon as someone else confirms my results. You will need to use rsync --itemize-changes to see the hard links being created, e.g.: hf+ pgsql/data/base/16415/28188 = pgsql.old/data/base/16384/28188 -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..91f40ce *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** pg_upgrade.exe *** 409,414 --- 409,484 /step step + titleUpgrade any Log-Shipping Standby Servers/title + + para + If you have Log-Shipping Standby Servers (xref + linkend=warm-standby), follow these steps to upgrade them (before + starting any servers): + /para + + procedure + + step + titleInstall the new PostgreSQL binaries on standby servers/title + + para +Make sure the new binaries and support files are installed +on all the standby servers. Do emphasisnot/ run +applicationinitdb/. If applicationinitdb/ was run, delete +the standby server data directories. Also, install any custom +shared object files on the new standbys that you installed in the +new master cluster. + /para + /step + + step + titleRun applicationrsync//title + + para +From a directory that is above the old and new database cluster +directories, run this for each slave: + + programlisting +rsync --archive --hard-links --size-only old_dir new_dir remote_dir + /programlisting + +where optionold_dir/ and optionnew_dir/ are relative to the +current directory, and optionremote_dir/ is emphasisabove/ +the old and new cluster directories on the standby server. The old +and new relative cluster paths must match on the master and standby +server. Consult the applicationrsync/ manual page for details +on specifying optionremote_dir/. applicationrsync/ will +be fast when option--link/ mode is used because it will create +hard links on the remote server rather than transfering user data. + /para + /step + + step + titleConfigure log-shipping to standby servers/title + + para +Configure the servers for log shipping. (You do not need to run +functionpg_start_backup()/ and functionpg_stop_backup()/ +or take a file system backup as the slaves are still sychronized +with the master.) + /para + /step + + /procedure + +/step + +step + titleStart the new server/title + + para + The new server and any applicationrsync/'ed standby servers can + now be safely started. + /para +/step + +step titlePost-Upgrade processing/title para *** psql --username postgres --file script.s *** 548,562 /para para -A Log-Shipping Standby Server (xref linkend=warm-standby) cannot -be upgraded because the server must allow writes. The simplest way -is to upgrade the primary and use commandrsync/ to rebuild the -standbys. You can run commandrsync/ while the primary is down, -or as part of a base backup (xref linkend=backup-base-backup) -which overwrites the old standby cluster. - /para - - para If you want to use link mode and you do not want your old cluster to be modified when the new cluster is started, make a copy of the old cluster and upgrade that in link mode. To make a valid copy --- 618,623 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 7:54 PM, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. What if instead of trying to handle that on the rsync side, we changed pg_upgrade so that it created hardlinks that had the same timestamp as the original file? That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not that big a deal, but when dealing with fixed-size data (ie: Postgres blocks)? Eww. How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell script instead of directly modifying things itself? Or perhaps some custom command file that could then be replayed by pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to work... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 13:52:54 -0500, Stephen Frost wrote: That wouldn't actually help with what Bruce is trying to do, which is to duplicate the results of the pg_upgrade from the master over to the standby. Well, it'd pretty much obliviate the need to run pg_upgrade on the standby. As there's no renamed files you don't need to muck around with leaving hardlinks in place and such just so that rsync recognizes unchanged files. Uh, pg_upgrade always either creates a hard link tree or copies everything over. If I follow what you're suggesting, pg_upgrade would need a new 'in-place' mode that removes all of the catalog tables from the old cluster and puts the new catalog tables into place and leaves everything else alone. I don't really think I'd want to go there either.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
* Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 14:27:51 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 14:05:10 -0500, Stephen Frost wrote: If I follow what you're suggesting, pg_upgrade would need a new 'in-place' mode that removes all of the catalog tables from the old cluster and puts the new catalog tables into place and leaves everything else alone. No. Except that it'd preserve the relfilenodes (i.e. the filenames of relations) it'd work exactly the same as today. The standby is simply updated by rsyncing the new data directory of the primary to the standby. You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. That still doesn't address the need to use --size-only, it would just mean that you don't need to use -H. If anything the -H part is the aspect which worries me the least about this approach. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
* Jim Nasby (jim.na...@bluetreble.com) wrote: On 1/22/15 7:54 PM, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. What if instead of trying to handle that on the rsync side, we changed pg_upgrade so that it created hardlinks that had the same timestamp as the original file? So, two things, I chatted w/ Bruce and he was less concerned about the lack of being able to match up the timestamps than I was. He has a point though- the catalog tables are going to get copied anyway since they won't be hard links and checking that all the other files match in size and that both the master and the standby are at the same xlog position should give you a pretty good feeling that everything matches up sufficiently. Second, I don't follow what you mean by having pg_upgrade change the hardlinks to have the same timestamp- for starters, the timestamp is in the inode and not the actual hard link (two files hard linked together won't have different timestamps..) and second, the problem isn't on the master side- it's on the standby side. The standby's files will have timestamps different from the master and there really isn't much to be done about that. That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not that big a deal, but when dealing with fixed-size data (ie: Postgres blocks)? Eww. The race condition is a problem for pg_start/stop_backup and friends. In this instance, everything will be shut down when the rsync is running, so there isn't a timestamp race condition to worry about. How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell script instead of directly modifying things itself? Or perhaps some custom command file that could then be replayed by pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to work... Yeah, I had suggested that to Bruce also, but it's not clear why that would be any different from an rsync --size-only in the end, presuming everything went according to plan. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-22 20:54:47 -0500, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-23 13:52:54 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-22 20:54:47 -0500, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. That's an entirely orthogonal discussion from the original one though, no? Don't think so. That wouldn't actually help with what Bruce is trying to do, which is to duplicate the results of the pg_upgrade from the master over to the standby. Well, it'd pretty much obliviate the need to run pg_upgrade on the standby. As there's no renamed files you don't need to muck around with leaving hardlinks in place and such just so that rsync recognizes unchanged files. Trying to pg_upgrade both the master and the standby, to me at least, seems like an even *worse* approach than trusting rsync with -H and --size-only.. I think running pg_upgrade on the standby is a dangerous folly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-23 14:05:10 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 13:52:54 -0500, Stephen Frost wrote: That wouldn't actually help with what Bruce is trying to do, which is to duplicate the results of the pg_upgrade from the master over to the standby. Well, it'd pretty much obliviate the need to run pg_upgrade on the standby. As there's no renamed files you don't need to muck around with leaving hardlinks in place and such just so that rsync recognizes unchanged files. Uh, pg_upgrade always either creates a hard link tree or copies everything over. Yes. The problem is that the filenames after pg_upgrade aren't the same as before. Which means that a simple rsync call won't be able to save anything because the standby's filenames differ. What you can do is rsync both cluster directories (i.e. the old and the post pg_upgrade ones) and use rsync -H, right? Without transferring both -H won't detect the hardlinks as they need to be in the synced set. That's pretty cumbersome/complicated, and far from cheap. If I follow what you're suggesting, pg_upgrade would need a new 'in-place' mode that removes all of the catalog tables from the old cluster and puts the new catalog tables into place and leaves everything else alone. No. Except that it'd preserve the relfilenodes (i.e. the filenames of relations) it'd work exactly the same as today. The standby is simply updated by rsyncing the new data directory of the primary to the standby. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-22 20:54:47 -0500, Stephen Frost wrote: * Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). I don't understand why that'd be better than simply fixing (yes, that's imo the correct term) pg_upgrade to retain relfilenodes across the upgrade. Afaics there's no conflict risk and it'd make the clusters much more similar, which would be good; independent of rsyncing standbys. That's an entirely orthogonal discussion from the original one though, no? That wouldn't actually help with what Bruce is trying to do, which is to duplicate the results of the pg_upgrade from the master over to the standby. Even if the relfilenodes were the same across the upgrade, I don't think it'd be a good idea to run pg_upgrade on the standby and hope the results match close enough to the master that you can trust updates to the catalog tables on the standby from the master going forward to work.. Trying to pg_upgrade both the master and the standby, to me at least, seems like an even *worse* approach than trusting rsync with -H and --size-only.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
* Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 14:05:10 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 13:52:54 -0500, Stephen Frost wrote: That wouldn't actually help with what Bruce is trying to do, which is to duplicate the results of the pg_upgrade from the master over to the standby. Well, it'd pretty much obliviate the need to run pg_upgrade on the standby. As there's no renamed files you don't need to muck around with leaving hardlinks in place and such just so that rsync recognizes unchanged files. Uh, pg_upgrade always either creates a hard link tree or copies everything over. Yes. The problem is that the filenames after pg_upgrade aren't the same as before. Which means that a simple rsync call won't be able to save anything because the standby's filenames differ. What you can do is rsync both cluster directories (i.e. the old and the post pg_upgrade ones) and use rsync -H, right? Without transferring both -H won't detect the hardlinks as they need to be in the synced set. That's pretty cumbersome/complicated, and far from cheap. The filenames don't need to be the same for rsync -H to work. You specifically do *not* want to independently rsync the old and new clusters- you need to run a single rsync (and one for each tablespace) with -H and then it'll realize that the old cluster on both systems is identical and will just recreate the hard links, and copy the completely new files (the catalog tables). If I follow what you're suggesting, pg_upgrade would need a new 'in-place' mode that removes all of the catalog tables from the old cluster and puts the new catalog tables into place and leaves everything else alone. No. Except that it'd preserve the relfilenodes (i.e. the filenames of relations) it'd work exactly the same as today. The standby is simply updated by rsyncing the new data directory of the primary to the standby. You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. You'd still have to deal with the tablespace directories being renamed also, since we include the major version and catalog build in the directory name.. This whole process really isn't all that complicated in the end.. my_data_dir/old_cluster my_data_dir/new_cluster pg_upgrade rsync -H --size-only my_data_dir/ standby:/path/to/my_data_dir start the clusters remove the old cluster on the master and standby. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-23 14:27:51 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2015-01-23 14:05:10 -0500, Stephen Frost wrote: If I follow what you're suggesting, pg_upgrade would need a new 'in-place' mode that removes all of the catalog tables from the old cluster and puts the new catalog tables into place and leaves everything else alone. No. Except that it'd preserve the relfilenodes (i.e. the filenames of relations) it'd work exactly the same as today. The standby is simply updated by rsyncing the new data directory of the primary to the standby. You'd have to replace the existing data directory on the master to do that, which pg_upgrade was designed specifically to not do, in case things went poorly. Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. You'd still have to deal with the tablespace directories being renamed also, since we include the major version and catalog build in the directory name.. True. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote: Why? Just rsync the new data directory onto the old directory on the standbys. That's fine and simple. That still doesn't address the need to use --size-only, it would just mean that you don't need to use -H. If anything the -H part is the aspect which worries me the least about this approach. It took me a while to understand what Stephen was saying, so let me explain the details so everyone can get on the same page. First, let's look at the downsides of using non-hardlink rsync against a slave cluster, whether we run pg_upgrade on the slave or not: o must preserve db directory and relfilenodes (4 new things for pg_upgrade to preserve) o must checksum files because there is no way to distinguish user tables/indexes (which don't need to be copied) from system tables/indexes (which must be copied so it is in sync with the master) o must use log_wal_hints when the slave is installed so the checksums match So, even if if all the checksums work, it will be slow/expensive. Stephen's idea is quite interesting. You run pg_upgrade on the master, then, before you start the new server, you run rsync with special flags and sync the old _and_ new clusters on the master with just the old cluster on the standby (yeah, odd). Yes, this is very odd, and where I got lost too. First, this only works when pg_upgrade is run in --link mode. What rsync --hard-links --size-only is going to do is record which files have hard links, remember their inode numbers, and cross-reference the hard-linked files. When doing the rsync remote comparisons, the master's old relfilenodes will match the standby's old relfilenodes, and because we are using --size-only, they will be considered identical and not copied, or even checksumed. When it goes to do the standby's new cluster, none of the directories will exist, so they will all be copied along with the system objects (they are small), but the user tables/indexes will be identified as already existing in the slave's old cluster so it will hard-link to those standby's old cluster files. Once rsync is complete, you can delete the old cluster on master and standby. This is effectively duplicating the way pg_upgrade works. What is interesting is that this will work on any version of pg_upgrade, with no modifications, as long as link mode is used. You _cannot_ run initdb on the standby, as this will create system files that would prevent the master's system files from being copied. This is also going to remove your recovery.conf on the standby, and replace your postgresql.conf with the master's, so any modifications you made to the standby will have to be saved and restored in to the new cluster before starting. I plan to run some tests soon to verify this method works, and if so, I can document it in the pg_upgrade manual page. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables). The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 8:54 PM, Stephen Frost wrote: The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source. I would definitely not feel comfortable using --size-only. In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 That means the rsync hot, then rsync cold method of updating a standby is not *guaranteed* to work unless checksums are used. This may seem like an edge case, but for a small, active database it looks like it could be a real issue. -- - David Steele da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* David Steele (da...@pgmasters.net) wrote: On 1/22/15 8:54 PM, Stephen Frost wrote: The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source. I would definitely not feel comfortable using --size-only. Yeah, it also occurs to me that if any of the catalog tables end up being the same size between the master and the replica that they wouldn't get copied and that'd make for one very interesting result, and not a good one. In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 Right, though that isn't really an issue in this specific case- we're talking about post-pg_upgrade but before the upgraded cluster has actually been started, so nothing should be modifying these files. That means the rsync hot, then rsync cold method of updating a standby is not *guaranteed* to work unless checksums are used. This may seem like an edge case, but for a small, active database it looks like it could be a real issue. That's certainly a good point though. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 10:05 PM, Stephen Frost wrote: In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 Right, though that isn't really an issue in this specific case- we're talking about post-pg_upgrade but before the upgraded cluster has actually been started, so nothing should be modifying these files. Indeed. This was really directed more at what Bruce said: I am thinking the fix for standys would be similar to what we recommand for upgrades with link mode using a rsync-created copy, e.g. use rsync while the master is running to create a copy of the standby, then shut down the master and run rsync again. However, at that point, you might as well just take a base backup and be done with it. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 22, 2015 at 10:48:37PM +0200, Heikki Linnakangas wrote: * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() (bufHdr-flags BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; What if XLogHintBitIsNeeded is false? That would be the case if we're not wall logging hints *on the standby*. Then the page will be updated without writing a WAL record. Just like in the master, if wal_log_hints is off. wal_log_hints works the same on the master or the standby. [ see below for why this entire idea might not work ] OK, I was confused by your previous no. It means we do update hint bits on read-only slave queries --- we just don't WAL log them. In fact, we can't update hint bits on the standby if we are wal logging them is that right? My text was saying: these differences can be reduced by using a fresh standby and by enabling xref linkend=guc-wal-log-hints. (While varnamewal_log_hints/ transfers hint bits from the primary to standbys, additional hint bits are still set on the standbys by read-only queries.) meaning if you don't run any read-only queries on the standby, the files will be same on master/standby because the hint bits will be the same, and rsync will not copy the files. This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I would really like to get a way to pg_upgrade the standbys but we have never really be able to get a solution. Ideally we would update just the system table files, and if the order of pg_upgrade file renames is exactly the same, everything else would match, but I can't imagine what such an API would look like. Have pg_upgrade spit out a list of files to be copied? In fact, these are the relfilenodes pg_upgrade preserves: * While pg_class.oid and pg_class.relfilenode are initially the same * in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM * FULL. In the new cluster, pg_class.oid and pg_class.relfilenode will * be the same and will match the old pg_class.oid value. Because of * this, old/new pg_class.relfilenode values will not match if CLUSTER, * REINDEX, or VACUUM FULL have been performed in the old cluster. * * We control all assignments of pg_type.oid because these oids are stored * in user composite type values. * * We control all assignments of pg_enum.oid because these oids are stored * in user tables as enum values. * * We control all assignments of pg_authid.oid because these oids are stored * in pg_largeobject_metadata. so if the table/index relfilenodes no longer match the oid on the old cluster, due to CLUSTER, REINDEX, or VACUUM FULL, the file name will not match on the new cluster and rsync will copy the entire file. In fact, rsync is going to copy it to the wrong file name, and delete the right file. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 5:43 PM, Bruce Momjian wrote: This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I don't think so. AIUI if the timestamps are different the very next thing it does is run the checksum (which is expensive). So --checksum is just going to hurt. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. We've previously talked about required steps before an upgrade; perhaps we need a way to force an OID/relfilenode change on the old server prior to upgrade. Or, thinking outside the box here... could this type of stuff be done in postgres itself so we could generate wal that's shipped to standby's? That would allow doing this as part of the formal upgrade process without the need for preliminary steps. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-22 14:20:51 -0500, Bruce Momjian wrote: It is possible to upgrade on pg_upgrade on streaming standby servers by making them master servers, running pg_upgrade on them, then shuting down all servers and using rsync to make the standby servers match the real master. Isn't that a pretty crazy procedure? If you need to shut down all servers anyway, you can just rsync after having run pg_upgrade on the master, no? Rsync won't really transfer less just because you ran a similar thing on the standby. Even if this would allow to avoid some traffic for fsync: There's absolutely no guarantee that the standby's pg_upgrade results in a all that similar data directory. Far from everything in postgres is deterministic - it's easy to hit timing differences that result in noticeable differences. Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 22, 2015 at 06:04:24PM -0600, Jim Nasby wrote: On 1/22/15 5:43 PM, Bruce Momjian wrote: This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I don't think so. AIUI if the timestamps are different the very next thing it does is run the checksum (which is expensive). So --checksum is just going to hurt. Oh, OK, good. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. We've previously talked about required steps before an upgrade; perhaps we need a way to force an OID/relfilenode change on the old server prior to upgrade. Actually, the idea I had forgotten is that we are not rsyncing between old and new clusters here, but between two servers who are both new after running pg_upgrade. Their relfilenodes match their oid, and the oids match, so we should be fine. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: On 2015-01-22 14:20:51 -0500, Bruce Momjian wrote: It is possible to upgrade on pg_upgrade on streaming standby servers by making them master servers, running pg_upgrade on them, then shuting down all servers and using rsync to make the standby servers match the real master. Isn't that a pretty crazy procedure? If you need to shut down all Yes, it is crazy, but so is pg_upgrade. ;-) servers anyway, you can just rsync after having run pg_upgrade on the master, no? Rsync won't really transfer less just because you ran a similar thing on the standby. Uh, yeah, it will, because the files can get renamed as part of the upgrade (relfilenode now matches oid), so by running the upgrade, file names are going to match up. I didn't think rsync could handle renaming of files without recopying the entire file. Even if this would allow to avoid some traffic for fsync: There's absolutely no guarantee that the standby's pg_upgrade results in a all that similar data directory. Far from everything in postgres is deterministic - it's easy to hit timing differences that result in noticeable differences. Right, some non-deterministic things would change, but I thought runnning upgrade on the standby would help. However, now that I think of it, we don't preserver the database directory name and assume dbs will will get the same oid and therefore same database directory name on both, but if you use -j, things are going to happen in random order. Oops. Oh well. Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I am thinking the fix for standys would be similar to what we recommand for upgrades with link mode using a rsync-created copy, e.g. use rsync while the master is running to create a copy of the standby, then shut down the master and run rsync again. However, at that point, you might as well just take a base backup and be done with it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() (bufHdr-flags BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 2:19 PM, Heikki Linnakangas wrote: On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() (bufHdr-flags BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; What if XLogHintBitIsNeeded is false? That would be the case if we're not wall logging hints *on the standby*. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/22/2015 10:34 PM, Jim Nasby wrote: On 1/22/15 2:19 PM, Heikki Linnakangas wrote: On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() (bufHdr-flags BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; What if XLogHintBitIsNeeded is false? That would be the case if we're not wall logging hints *on the standby*. Then the page will be updated without writing a WAL record. Just like in the master, if wal_log_hints is off. wal_log_hints works the same on the master or the standby. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers