Re: [GENERAL] pg_basebackup, requested WAL has already been removed
Its definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). On Fri, May 10, 2013 at 9:48 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've recently started to use pg_basebackup --xlog-method=stream to backup my multi-Tb database. Before I did the backup when there was not much activity in the DB and it went perfectly fine, but today, I've started the backup and it failed twice almost at the same time as the CREATE INDEX (and another time CLUSTER) commands were finished. Here: postgres@cappc118:/mnt/backup/wsdb_130510$ pg_basebackup --xlog-method=stream --progress --verbose --pg transaction log start point: 23AE/BD003E70 pg_basebackup: starting background WAL receiver pg_basebackup: unexpected termination of replication stream: FATAL: requested WAL segment 000123B100FE has already been removed 4819820/16816887078 kB (4%), 0/1 tablespace (/mnt/backup/wsdb_130510/base/1) And the logs from around that time contained: some_user:wsdb:2013-05-10 14:35:41 BST:10587LOG: duration: 40128.163 ms statement: CREATE INDEX usno_cle an_q3c_idx ON usno_clean (q3c_ang2ipix(ra,dec)); ::2013-05-10 14:35:43 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:43 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. ::2013-05-10 14:35:51 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:51 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. postgres:[unknown]:2013-05-10 14:35:55 BST:8177FATAL: requested WAL segment 000123B100FE has already been removed some_user:wsdb:2013-05-10 14:36:59 BST:10599LOG: duration: 78378.194 ms statement: CLUSTER usno_clean_q3c_idx ON usno_clean; One the previous occasion when it happened the CREATE INDEX() was being executed: some_user:wsdb:2013-05-10 09:17:20 BST:3300LOG: duration: 67.680 ms statement: SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings UNION ALL SELECT 'session authorization' UNION ALL SELECT 'all') ss WHERE substring(name,1,4)='rand' LIMIT 1000 ::2013-05-10 09:22:47 BST:25529LOG: checkpoints are occurring too frequently (18 seconds apart) ::2013-05-10 09:22:47 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segments. postgres:[unknown]:2013-05-10 09:22:49 BST:27659FATAL: requested WAL segment 000123990040 has already been removed some_user:wsdb:2013-05-10 09:22:57 BST:3236LOG: duration: 542955.262 ms statement: CREATE INDEX xmatch_temp_usnoid_idx ON xmatch_temp (usno_id); The .configuration PG 9.2.4, Debian 7.0, amd64 shared_buffers = 10GB work_mem = 1GB maintenance_work_mem = 1GB effective_io_concurrency = 5 synchronous_commit = off checkpoint_segments = 32 max_wal_senders = 2 effective_cache_size = 30GB autovacuum_max_workers = 3 wal_level=archive archive_mode = off Does it look like a bug or am I missing something ? Thanks, Sergey -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- ~ L. Friedmannetll...@gmail.com LlamaLand https://netllama.linux-sxs.org -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
On Fri, 10 May 2013, Lonni J Friedman wrote: Its definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). But I thought that wal_keep_segments is not needed for the streaming regime ( --xlog-method=stream) And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. On Fri, May 10, 2013 at 9:48 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've recently started to use pg_basebackup --xlog-method=stream to backup my multi-Tb database. Before I did the backup when there was not much activity in the DB and it went perfectly fine, but today, I've started the backup and it failed twice almost at the same time as the CREATE INDEX (and another time CLUSTER) commands were finished. Here: postgres@cappc118:/mnt/backup/wsdb_130510$ pg_basebackup --xlog-method=stream --progress --verbose --pg transaction log start point: 23AE/BD003E70 pg_basebackup: starting background WAL receiver pg_basebackup: unexpected termination of replication stream: FATAL: requested WAL segment 000123B100FE has already been removed 4819820/16816887078 kB (4%), 0/1 tablespace (/mnt/backup/wsdb_130510/base/1) And the logs from around that time contained: some_user:wsdb:2013-05-10 14:35:41 BST:10587LOG: duration: 40128.163 ms statement: CREATE INDEX usno_cle an_q3c_idx ON usno_clean (q3c_ang2ipix(ra,dec)); ::2013-05-10 14:35:43 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:43 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. ::2013-05-10 14:35:51 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:51 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. postgres:[unknown]:2013-05-10 14:35:55 BST:8177FATAL: requested WAL segment 000123B100FE has already been removed some_user:wsdb:2013-05-10 14:36:59 BST:10599LOG: duration: 78378.194 ms statement: CLUSTER usno_clean_q3c_idx ON usno_clean; One the previous occasion when it happened the CREATE INDEX() was being executed: some_user:wsdb:2013-05-10 09:17:20 BST:3300LOG: duration: 67.680 ms statement: SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings UNION ALL SELECT 'session authorization' UNION ALL SELECT 'all') ss WHERE substring(name,1,4)='rand' LIMIT 1000 ::2013-05-10 09:22:47 BST:25529LOG: checkpoints are occurring too frequently (18 seconds apart) ::2013-05-10 09:22:47 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segments. postgres:[unknown]:2013-05-10 09:22:49 BST:27659FATAL: requested WAL segment 000123990040 has already been removed some_user:wsdb:2013-05-10 09:22:57 BST:3236LOG: duration: 542955.262 ms statement: CREATE INDEX xmatch_temp_usnoid_idx ON xmatch_temp (usno_id); The .configuration PG 9.2.4, Debian 7.0, amd64 shared_buffers = 10GB work_mem = 1GB maintenance_work_mem = 1GB effective_io_concurrency = 5 synchronous_commit = off checkpoint_segments = 32 max_wal_senders = 2 effective_cache_size = 30GB autovacuum_max_workers = 3 wal_level=archive archive_mode = off Does it look like a bug or am I missing something ? Thanks, Sergey -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- ~ L. Friedmannetll...@gmail.com LlamaLand https://netllama.linux-sxs.org -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
That's a good point. Then i dunno, perhaps it is a bug, but I'd be surprised if this wasn't working, as its not really a corner case that could be missed in testing, as long as all the options were exercised. Hopefully someone else can weigh in. On Fri, May 10, 2013 at 10:00 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Fri, 10 May 2013, Lonni J Friedman wrote: Its definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). But I thought that wal_keep_segments is not needed for the streaming regime ( --xlog-method=stream) And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. On Fri, May 10, 2013 at 9:48 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've recently started to use pg_basebackup --xlog-method=stream to backup my multi-Tb database. Before I did the backup when there was not much activity in the DB and it went perfectly fine, but today, I've started the backup and it failed twice almost at the same time as the CREATE INDEX (and another time CLUSTER) commands were finished. Here: postgres@cappc118:/mnt/backup/wsdb_130510$ pg_basebackup --xlog-method=stream --progress --verbose --pg transaction log start point: 23AE/BD003E70 pg_basebackup: starting background WAL receiver pg_basebackup: unexpected termination of replication stream: FATAL: requested WAL segment 000123B100FE has already been removed 4819820/16816887078 kB (4%), 0/1 tablespace (/mnt/backup/wsdb_130510/base/1) And the logs from around that time contained: some_user:wsdb:2013-05-10 14:35:41 BST:10587LOG: duration: 40128.163 ms statement: CREATE INDEX usno_cle an_q3c_idx ON usno_clean (q3c_ang2ipix(ra,dec)); ::2013-05-10 14:35:43 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:43 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. ::2013-05-10 14:35:51 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart) ::2013-05-10 14:35:51 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segmen ts. postgres:[unknown]:2013-05-10 14:35:55 BST:8177FATAL: requested WAL segment 000123B100FE has already been removed some_user:wsdb:2013-05-10 14:36:59 BST:10599LOG: duration: 78378.194 ms statement: CLUSTER usno_clean_q3c_idx ON usno_clean; One the previous occasion when it happened the CREATE INDEX() was being executed: some_user:wsdb:2013-05-10 09:17:20 BST:3300LOG: duration: 67.680 ms statement: SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings UNION ALL SELECT 'session authorization' UNION ALL SELECT 'all') ss WHERE substring(name,1,4)='rand' LIMIT 1000 ::2013-05-10 09:22:47 BST:25529LOG: checkpoints are occurring too frequently (18 seconds apart) ::2013-05-10 09:22:47 BST:25529HINT: Consider increasing the configuration parameter checkpoint_segments. postgres:[unknown]:2013-05-10 09:22:49 BST:27659FATAL: requested WAL segment 000123990040 has already been removed some_user:wsdb:2013-05-10 09:22:57 BST:3236LOG: duration: 542955.262 ms statement: CREATE INDEX xmatch_temp_usnoid_idx ON xmatch_temp (usno_id); The .configuration PG 9.2.4, Debian 7.0, amd64 shared_buffers = 10GB work_mem = 1GB maintenance_work_mem = 1GB effective_io_concurrency = 5 synchronous_commit = off checkpoint_segments = 32 max_wal_senders = 2 effective_cache_size = 30GB autovacuum_max_workers = 3 wal_level=archive archive_mode = off Does it look like a bug or am I missing something ? Thanks, Sergey -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- ~ L. Friedmannetll...@gmail.com LlamaLand https://netllama.linux-sxs.org -- ~ L. Friedmannetll...@gmail.com LlamaLand https://netllama.linux-sxs.org -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
On Fri, May 10, 2013 at 7:00 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Fri, 10 May 2013, Lonni J Friedman wrote: Its definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). But I thought that wal_keep_segments is not needed for the streaming regime ( --xlog-method=stream) And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. It's not a bug in the software - this will happen if the background stream in pg_basebackup cannot keep up, and that's normal. It works the same way as a regular standby in that if it's unable to keep up it will eventually fall so far behind that it can't recover. It may definitely be something that needs to be cleared up in the documentation though. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general