Re: [HACKERS] log messages for archive recovery progress
On Wed, Jan 11, 2012 at 9:01 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Jan 11, 2012 at 1:54 PM, Satoshi Nagayasu sn...@uptime.jp wrote: However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. Agreed. I can change restored to using, so we have two message types LOG: restored log file 00080047 from archive LOG: using pre-existing log file 00080047 from pg_xlog using seems pretty fuzzy to me. replaying? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
On Thu, Jan 12, 2012 at 2:13 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 11, 2012 at 9:01 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Jan 11, 2012 at 1:54 PM, Satoshi Nagayasu sn...@uptime.jp wrote: However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. Agreed. I can change restored to using, so we have two message types LOG: restored log file 00080047 from archive LOG: using pre-existing log file 00080047 from pg_xlog using seems pretty fuzzy to me. replaying? That was my first thought, but the message relates to which file has been selected, and how. Once it has been selected it will be replayed. The idea is to have the two messages look similar. The original message was restored log file... and says nothing about replaying. We could change the old message (ugh! backwards compatibility alert) LOG: replaying log file 00080047 after restore from archive LOG: replaying log file 00080047 already in pg_xlog which doesn't sound much stronger to me... not sure. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
On Thu, Jan 12, 2012 at 10:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Jan 12, 2012 at 2:13 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 11, 2012 at 9:01 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Jan 11, 2012 at 1:54 PM, Satoshi Nagayasu sn...@uptime.jp wrote: However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. Agreed. I can change restored to using, so we have two message types LOG: restored log file 00080047 from archive LOG: using pre-existing log file 00080047 from pg_xlog using seems pretty fuzzy to me. replaying? That was my first thought, but the message relates to which file has been selected, and how. Once it has been selected it will be replayed. The idea is to have the two messages look similar. The original message was restored log file... and says nothing about replaying. We could change the old message (ugh! backwards compatibility alert) LOG: replaying log file 00080047 after restore from archive LOG: replaying log file 00080047 already in pg_xlog which doesn't sound much stronger to me... not sure. Hmm, I don't know. But that phrasing does at least have the advantage of being clearly parallel, which I like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
2012/01/13 0:13, Robert Haas wrote: On Thu, Jan 12, 2012 at 10:04 AM, Simon Riggssi...@2ndquadrant.com wrote: On Thu, Jan 12, 2012 at 2:13 PM, Robert Haasrobertmh...@gmail.com wrote: On Wed, Jan 11, 2012 at 9:01 AM, Simon Riggssi...@2ndquadrant.com wrote: On Wed, Jan 11, 2012 at 1:54 PM, Satoshi Nagayasusn...@uptime.jp wrote: However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. Agreed. I can change restored to using, so we have two message types LOG: restored log file 00080047 from archive LOG: using pre-existing log file 00080047 from pg_xlog using seems pretty fuzzy to me. replaying? That was my first thought, but the message relates to which file has been selected, and how. Once it has been selected it will be replayed. The idea is to have the two messages look similar. The original message was restored log file... and says nothing about replaying. We could change the old message (ugh! backwards compatibility alert) LOG: replaying log file 00080047 after restore from archive LOG: replaying log file 00080047 already in pg_xlog which doesn't sound much stronger to me... not sure. Hmm, I don't know. But that phrasing does at least have the advantage of being clearly parallel, which I like. It seems difficult to keep backward compatibility. :) Anyway, how about this one? If we have 47 in archive, and 48 in pg_xlog, (1) LOG: restored log file 00080047 from archive (2) LOG: replaying log file 00080047 (3) LOG: could not restore file 00080048 from archive (4) LOG: replaying log file 00080048 In this case, (4) replying after (3) could not restore from archive would means that it has started replaying a WAL from pg_xlog. And if we have 47 in archive, and do not have 48 in pg_xlog, (5) LOG: restored log file 00080047 from archive (6) LOG: replaying log file 00080047 (7) LOG: could not restore file 00080048 from archive (8) LOG: could not replay file 00080048 In this case, (8) could not replay file after (7) could not restore from archive would means that 48 is not found both archive and pg_xlog, so that the latest WAL would gone. I just got another option in my mind. Telling both two numbers of WAL files, from archive and pg_xlog directory, those have been applied during archive recovery would make sense? How about this one? LOG: XXX file(s) from archive, YYY file(s) from pg_xlog successfully applied. Thanks, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
On Fri, Jan 13, 2012 at 12:58 AM, Satoshi Nagayasu sn...@uptime.jp wrote: Anyway, how about this one? If we have 47 in archive, and 48 in pg_xlog, (1) LOG: restored log file 00080047 from archive (2) LOG: replaying log file 00080047 (3) LOG: could not restore file 00080048 from archive (4) LOG: replaying log file 00080048 In this case, (4) replying after (3) could not restore from archive would means that it has started replaying a WAL from pg_xlog. Looks confusing. In this way, you always need to look at more than one messages to understand the meaning of one message. Which seems not good design. What if lots of log files exist in pg_xlog and you got the following message? To figure out where the log file 98 comes from, you need to find (3) from lots of log messages. LOG: replaying log file 00080098 What about the following simpler way? (1) LOG: restored log file 00080047 from archive (2) LOG: replaying log file 00080047 from archive (4) LOG: replaying log file 00080048 from pg_xlog In file-base log-shipping with standby_mode=on case, restoring from archive is repeated periodically, which would flood the log file with the messages like (3). So I don't like emitting the message like (3) with LOG level. I just got another option in my mind. Telling both two numbers of WAL files, from archive and pg_xlog directory, those have been applied during archive recovery would make sense? How about this one? LOG: XXX file(s) from archive, YYY file(s) from pg_xlog successfully applied. What's the use case of this message? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
2012/1/11 Euler Taveira de Oliveira eu...@timbira.com: On 08-01-2012 11:59, Satoshi Nagayasu / Uptime Technologies, LLC. wrote: [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080046 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080046 [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080047 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080047 cp: cannot stat `/backups/archlog/00080048': No such file or directory [2011-12-08 15:14:37 JST] 16758: LOG: could not restore file 00080048 from archive [2011-12-08 15:14:37 JST] 16758: LOG: attempting to look into pg_xlog [2011-12-08 15:14:37 JST] 16758: LOG: recoverying 00080048 What about just 'restored log file 00080048 from pg_xlog' instead of the last two messages? If you can't read from pg_xlog emit 'could not restore file 00080048 from pg_xlog'. Yes, simple is better. We already have a message if the file is not present, we just need one if we switch to using pg_xlog. Please look at this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 8e65962..055ae6d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -2817,6 +2817,16 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli, xlogfname); set_ps_display(activitymsg, false); + /* + * If we're in archive recovery and the archive runs dry, + * we switch to reading from the pg_xlog directly. In that case + * we want to output a message that shows what just happened. + */ + if (InArchiveRecovery source == XLOG_FROM_PG_XLOG) + ereport(LOG, + (errmsg(restored log file \%s\ from pg_xlog, + xlogfname))); + /* Track source of data in assorted state variables */ readSource = source; XLogReceiptSource = source; @@ -2867,10 +2877,7 @@ XLogFileReadAnyTLI(uint32 log, uint32 seg, int emode, int sources) { fd = XLogFileRead(log, seg, emode, tli, XLOG_FROM_ARCHIVE, true); if (fd != -1) - { -elog(DEBUG1, got WAL segment from archive); return fd; - } } if (sources XLOG_FROM_PG_XLOG) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
2012/01/11 19:56, Simon Riggs wrote: 2012/1/11 Euler Taveira de Oliveiraeu...@timbira.com: On 08-01-2012 11:59, Satoshi Nagayasu / Uptime Technologies, LLC. wrote: [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080046 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080046 [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080047 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080047 cp: cannot stat `/backups/archlog/00080048': No such file or directory [2011-12-08 15:14:37 JST] 16758: LOG: could not restore file 00080048 from archive [2011-12-08 15:14:37 JST] 16758: LOG: attempting to look into pg_xlog [2011-12-08 15:14:37 JST] 16758: LOG: recoverying 00080048 What about just 'restored log file 00080048 from pg_xlog' instead of the last two messages? If you can't read from pg_xlog emit 'could not restore file 00080048 from pg_xlog'. Yes, simple is better. We already have a message if the file is not present, we just need one if we switch to using pg_xlog. Please look at this. Thanks for a patch. I agree that simple is better. However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. I'll try to revise my proposed log messages again. Thanks, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
On Wed, Jan 11, 2012 at 1:54 PM, Satoshi Nagayasu sn...@uptime.jp wrote: However, I'm a bit afraid that it will confuse DBA if we use restored under the pg_xlog replay context, because we have already used restored that means a WAL file as successfully copied (not replayed) from archive directory into pg_xlog directory under the archive recovery context. So, to determine the status of copying WAL files from archive directory, I think we can use restored, or could not restore on failure. And to determine the status of replaying WAL files in pg_xlog directory (even if a WAL is copied from archive), we have to use recover or replay. Agreed. I can change restored to using, so we have two message types LOG: restored log file 00080047 from archive LOG: using pre-existing log file 00080047 from pg_xlog -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] log messages for archive recovery progress
On 08-01-2012 11:59, Satoshi Nagayasu / Uptime Technologies, LLC. wrote: [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080046 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080046 [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080047 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080047 cp: cannot stat `/backups/archlog/00080048': No such file or directory [2011-12-08 15:14:37 JST] 16758: LOG: could not restore file 00080048 from archive [2011-12-08 15:14:37 JST] 16758: LOG: attempting to look into pg_xlog [2011-12-08 15:14:37 JST] 16758: LOG: recoverying 00080048 What about just 'restored log file 00080048 from pg_xlog' instead of the last two messages? If you can't read from pg_xlog emit 'could not restore file 00080048 from pg_xlog'. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] log messages for archive recovery progress
Hi, When I look into archive recovery deeply as DBA point of view, I found that it's difficult to know (1) when the recovery process switched reading WAL segments files from archive directory to pg_xlog directory, and (2) whether it succeeded applying the latest WAL segments in the pg_xlog directory. For example, when I had 47 WAL segment in the archive directory and 48 WAL segment in the pg_xlog directory, PostgreSQL said: [2011-12-08 05:59:03 JST] 9003: LOG: restored log file 00080046 from archive [2011-12-08 05:59:03 JST] 9003: LOG: restored log file 00080047 from archive cp: cannot stat `/backups/archlog/00080048': No such file or directory [2011-12-08 05:59:03 JST] 9003: LOG: record with zero length at 0/489F8B74 [2011-12-08 05:59:03 JST] 9003: LOG: redo done at 0/489F8B38 [2011-12-08 05:59:03 JST] 9003: LOG: last completed transaction was at log time 2011-12-08 05:52:01.507063+09 cp: cannot stat `/backups/archlog/0009.history': No such file or directory [2011-12-08 05:59:03 JST] 9003: LOG: selected new timeline ID: 9 [2011-12-08 05:59:03 JST] 9003: LOG: restored log file 0008.history from archive [2011-12-08 05:59:04 JST] 9003: LOG: archive recovery complete I felt it's difficult to determine whether PostgreSQL applied 48 WAL segment in the pg_xlog, or not. So, I want to propose new log messages to show archive log progress as reading WAL segments. With applying my tiny modification, the recovery process reports its progress for each WAL segment file, and also tells switching reading archive directory to pg_xlog directory explicitly as shown below. [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080046 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080046 [2011-12-08 15:14:36 JST] 16758: LOG: restored log file 00080047 from archive [2011-12-08 15:14:36 JST] 16758: LOG: recoverying 00080047 cp: cannot stat `/backups/archlog/00080048': No such file or directory [2011-12-08 15:14:37 JST] 16758: LOG: could not restore file 00080048 from archive [2011-12-08 15:14:37 JST] 16758: LOG: attempting to look into pg_xlog [2011-12-08 15:14:37 JST] 16758: LOG: recoverying 00080048 [2011-12-08 15:14:37 JST] 16758: LOG: record with zero length at 0/489F8B74 [2011-12-08 15:14:37 JST] 16758: LOG: redo done at 0/489F8B38 [2011-12-08 15:14:37 JST] 16758: LOG: last completed transaction was at log time 2011-12-08 05:52:01.507063+09 cp: cannot stat `/backups/archlog/0009.history': No such file or directory [2011-12-08 15:14:37 JST] 16758: LOG: selected new timeline ID: 9 [2011-12-08 15:14:37 JST] 16758: LOG: restored log file 0008.history from archive [2011-12-08 15:14:38 JST] 16758: LOG: archive recovery complete Do you think this is useful? Thanks, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers