Re: archive_timeout behaviour when archive_mode is off (was Re: [HACKERS] Too frequent checkpoints ?)
On Fri, Feb 15, 2013 at 3:12 PM, Pavan Deolasee pavan.deola...@gmail.com wrote: (changing subject) On Thu, Feb 14, 2013 at 11:48 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Feb 11, 2013 at 5:46 PM, Pavan Deolasee I also noticed that the WAL file switch happens after archive_timeout seconds irrespective of whether archive_mode is turned ON or not. This happens because we don't check if XLogArchivingActive() in CheckArchiveTimeout() function. It looks wrong to me. +1 to fix this. I've not heard the use case where archive_timeout needs to be used even in not archive mode... Ok, I will write a patch to fix this. I wonder if this is worth backpatching though. The code is like that for a long time and the fact that we haven't heard any complaints about it, may be its not worth fixing in the stable branches. But I wonder if anyone else thinks otherwise. On second thought, there might be the use case. Imagine the case where you disable archive_mode and instead use pg_receivexlog for WAL archiving purpose. WAL records are streamed from the server to pg_receivexlog in realtime, so basically archive_timeout is not required to be set in this case. But you might need to periodically verify the correctness of the connection between the server and pg_receivexlog. You can do that by checking whether new WAL file has been streamed since last check. But if there is no activity in the server, you cannot use that verification method. In this case, archive_timeout is useful even when archive_mode is not enabled. This is very narrow case, but as you said, we haven't heard any complaints about it, so I'm inclined not to fix it Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
On Mon, Feb 11, 2013 at 5:46 PM, Pavan Deolasee pavan.deola...@gmail.com wrote: On Fri, Feb 8, 2013 at 10:25 PM, Fujii Masao masao.fu...@gmail.com wrote: . BTW, the cause of the problem is that the following sequences happens. 1. archive_timeout switches WAL file because checkpoint WAL record has has been written since last switch Thank you for explaining that. I also noticed that the WAL file switch happens after archive_timeout seconds irrespective of whether archive_mode is turned ON or not. This happens because we don't check if XLogArchivingActive() in CheckArchiveTimeout() function. It looks wrong to me. +1 to fix this. I've not heard the use case where archive_timeout needs to be used even in not archive mode... Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
archive_timeout behaviour when archive_mode is off (was Re: [HACKERS] Too frequent checkpoints ?)
(changing subject) On Thu, Feb 14, 2013 at 11:48 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Feb 11, 2013 at 5:46 PM, Pavan Deolasee I also noticed that the WAL file switch happens after archive_timeout seconds irrespective of whether archive_mode is turned ON or not. This happens because we don't check if XLogArchivingActive() in CheckArchiveTimeout() function. It looks wrong to me. +1 to fix this. I've not heard the use case where archive_timeout needs to be used even in not archive mode... Ok, I will write a patch to fix this. I wonder if this is worth backpatching though. The code is like that for a long time and the fact that we haven't heard any complaints about it, may be its not worth fixing in the stable branches. But I wonder if anyone else thinks otherwise. I think we should also update the documentation to be clear that archive_timeout and archive_command are used only when archive_mode is turned on. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
On Fri, Feb 8, 2013 at 10:25 PM, Fujii Masao masao.fu...@gmail.com wrote: . BTW, the cause of the problem is that the following sequences happens. 1. archive_timeout switches WAL file because checkpoint WAL record has has been written since last switch Thank you for explaining that. I also noticed that the WAL file switch happens after archive_timeout seconds irrespective of whether archive_mode is turned ON or not. This happens because we don't check if XLogArchivingActive() in CheckArchiveTimeout() function. It looks wrong to me. But if it's that way for a good reason, shouldn't we at least document this behaviour ? Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Too frequent checkpoints ?
I was looking at the vacuum/visibility bug that Jeff Janes reported and brought up the server with the data directory he has shared. With his configuration, # - Checkpoints - checkpoint_segments = 1 # in logfile segments, min 1, 16MB each checkpoint_timeout = 30s# range 30s-1h # - Archiving - archive_mode = on # allows archiving to be done # (change requires restart) archive_command = 'echo archive_command %p %f `date`' # command to use to archive a logfile segment # placeholders: %p = path of file to archive # %f = file name only # e.g. 'test ! -f /mnt/server/archivedir/%f cp %p /mnt/server/archivedir/%f' archive_timeout = 30# force a logfile segment switch after this # number of seconds; 0 disables I see the following messages coming on the console: 3092 2013-02-08 02:29:01.239 PST:LOG: checkpoints are occurring too frequently (0 seconds apart) 3092 2013-02-08 02:29:01.239 PST:HINT: Consider increasing the configuration parameter checkpoint_segments. 3092 2013-02-08 02:29:01.239 PST:LOG: checkpoint starting: xlog 3092 2013-02-08 02:29:01.246 PST:LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.007 s; sync files=0, longest=0.000 s, average=0.000 s archive_command pg_xlog/0001005700F5 0001005700F5 Fri Feb 8 15:59:01 IST 2013 3092 2013-02-08 02:29:31.278 PST:LOG: checkpoints are occurring too frequently (0 seconds apart) 3092 2013-02-08 02:29:31.278 PST:HINT: Consider increasing the configuration parameter checkpoint_segments. 3092 2013-02-08 02:29:31.278 PST:LOG: checkpoint starting: xlog 3092 2013-02-08 02:29:31.285 PST:LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.006 s; sync files=0, longest=0.000 s, average=0.000 s archive_command pg_xlog/0001005700F6 0001005700F6 Fri Feb 8 15:59:31 IST 2013 3092 2013-02-08 02:30:01.316 PST:LOG: checkpoints are occurring too frequently (0 seconds apart) 3092 2013-02-08 02:30:01.316 PST:HINT: Consider increasing the configuration parameter checkpoint_segments. 3092 2013-02-08 02:30:01.316 PST:LOG: checkpoint starting: xlog archive_command pg_xlog/0001005700F7 0001005700F7 Fri Feb 8 16:00:01 IST 2013 3092 2013-02-08 02:30:01.323 PST:LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.006 s; sync files=0, longest=0.000 s, average=0.000 s 3092 2013-02-08 02:30:31.327 PST:LOG: checkpoints are occurring too frequently (0 seconds apart) 3092 2013-02-08 02:30:31.327 PST:HINT: Consider increasing the configuration parameter checkpoint_segments. 3092 2013-02-08 02:30:31.327 PST:LOG: checkpoint starting: xlog archive_command pg_xlog/0001005700F8 0001005700F8 Fri Feb 8 16:00:31 IST 2013 3092 2013-02-08 02:30:31.334 PST:LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.007 s; sync files=0, longest=0.000 s, average=0.000 s I wonder if this is all expected. The database is getting ZERO activity. There are no connections open at this time. The checkpoints are happening at every 30 seconds and new WAL files are being created, AFAIK because the old ones are getting archived. Can't we be smart about not archiving new files if we did not generate any new WAL since the last archive ? Also, the log says checkpoints are occurring too frequently (0 seconds apart). But that looks wrong too. Checkpoints are really happening at 30 seconds apart and not 0 as the log message claims. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
On 02/08/2013 02:37 AM, Pavan Deolasee wrote: I wonder if this is all expected. The database is getting ZERO activity. There are no connections open at this time. The checkpoints are happening at every 30 seconds and new WAL files are being created, AFAIK because the old ones are getting archived. Can't we be smart about not archiving new files if we did not generate any new WAL since the last archive ? It is because you have it set up so that the longest time you can go without checkpoints is 30 seconds. So no matter what, if you have 1000 checkpoint segments you are still going to checkpoint ever 30 seconds. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
On Fri, Feb 8, 2013 at 7:37 PM, Pavan Deolasee pavan.deola...@gmail.com wrote: I wonder if this is all expected. The database is getting ZERO activity. There are no connections open at this time. The checkpoints are happening at every 30 seconds and new WAL files are being created, AFAIK because the old ones are getting archived. Can't we be smart about not archiving new files if we did not generate any new WAL since the last archive ? AFAIR this problem has been discussed several times before, but has not been fixed yet. BTW, the cause of the problem is that the following sequences happens. 1. archive_timeout switches WAL file because checkpoint WAL record has has been written since last switch 2. Checkpoint occurs and writes its WAL record because new WAL file was generated since last checkpoint, i.e., checkpoint thinks that there is at least some activity since last checkpoint. 3. Back to #1 after archive_timeout passed through. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
On Fri, Feb 8, 2013 at 2:37 AM, Pavan Deolasee pavan.deola...@gmail.com wrote: I was looking at the vacuum/visibility bug that Jeff Janes reported and brought up the server with the data directory he has shared. With his configuration, 3092 2013-02-08 02:30:31.327 PST:LOG: checkpoints are occurring too frequently (0 seconds apart) 3092 2013-02-08 02:30:31.327 PST:HINT: Consider increasing the configuration parameter checkpoint_segments. ... I wonder if this is all expected. The database is getting ZERO activity. There are no connections open at this time. The checkpoints are happening at every 30 seconds and new WAL files are being created, AFAIK because the old ones are getting archived. Can't we be smart about not archiving new files if we did not generate any new WAL since the last archive ? Yeah, this is a known issue. We skip two consecutive checkpoints (forced by checkpoint_timeout) with no intervening activity, and we skip two consecutive log switches (forced by archive_timeout) with no intervening activity, but when you put them both together neither one gets skipped because each one counts as intervening activity from the other ones perspective. I think that at one point this was considered desirable, as the arrival of log files in the archive, even if they had no real content, was viewed as proof of life. You can set up archive_command to send you an email if an archive fails, but the same thing that prevents the archive might prevent the email from arriving. Also, the log says checkpoints are occurring too frequently (0 seconds apart). But that looks wrong too. Checkpoints are really happening at 30 seconds apart and not 0 as the log message claims. I noticed that as well, and I think it might be a regression. I'll run a bisection on it if I get a chance. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Too frequent checkpoints ?
Also, the log says checkpoints are occurring too frequently (0 seconds apart). But that looks wrong too. Checkpoints are really happening at 30 seconds apart and not 0 as the log message claims. I noticed that as well, and I think it might be a regression. I'll run a bisection on it if I get a chance. Looking further into it, seems there are two checkpoints happening almost immediately. The first happens because of timeout and the second happens immediately because of checkpoint segments (set to 1 in this case) Given that no one has bothered to fix these issues, may be they are non issues after all. Thanks, Pavan Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers