The following bug has been logged on the website: Bug reference: 6725 Logged by: Maxim Boguk Email address: maxim.bo...@gmail.com PostgreSQL version: 9.0.7 Operating system: Debian Linux Description:
Hi, On one of my production databases hot-standby server start repeatable crash after restart. There is log: 2012-07-10 03:30:20 MSK 12034 postgres@postgres from [local] [vxid:2/0 txid:0] [CHECKPOINT] LOG: duration: 100.262 ms statement: checkpoint; 2012-07-10 03:30:24 MSK 12034 postgres@postgres from [local] [vxid:2/0 txid:0] [CHECKPOINT] LOG: duration: 100.207 ms statement: checkpoint; 2012-07-10 03:30:41 MSK 24618 @ from [vxid: txid:0] [] LOG: received fast shutdown request 2012-07-10 03:30:41 MSK 24618 @ from [vxid: txid:0] [] LOG: aborting any active transactions 2012-07-10 03:30:41 MSK 25312 @ from [vxid: txid:0] [] FATAL: terminating walreceiver process due to administrator command 2012-07-10 03:30:41 MSK 24646 @ from [vxid: txid:0] [] LOG: shutting down 2012-07-10 03:30:41 MSK 24646 @ from [vxid: txid:0] [] LOG: database system is shut down 2012-07-10 03:30:45 MSK 12202 @ from [vxid: txid:0] [] LOG: database system was shut down in recovery at 2012-07-10 03:30:41 MSK 2012-07-10 03:30:45 MSK 12202 @ from [vxid: txid:0] [] LOG: entering standby mode 2012-07-10 03:30:45 MSK 12202 @ from [vxid:1/0 txid:0] [] LOG: redo starts at 127C/51EF6798 2012-07-10 03:31:35 MSK 12202 @ from [vxid:1/0 txid:0] [] PANIC: GIN metapage disappeared 2012-07-10 03:31:35 MSK 12202 @ from [vxid:1/0 txid:0] [] CONTEXT: xlog redo Update metapage, node: 1663/62450/27663577 blkno: 2164 2012-07-10 03:31:35 MSK 12189 @ from [vxid: txid:0] [] LOG: startup process (PID 12202) was terminated by signal 6: Aborted 2012-07-10 03:31:35 MSK 12189 @ from [vxid: txid:0] [] LOG: terminating any other active server processes 2012-07-10 03:33:37 MSK 13071 @ from [vxid: txid:0] [] LOG: database system was interrupted while in recovery at log time 2012-07-10 02:17:21 MSK 2012-07-10 03:33:37 MSK 13071 @ from [vxid: txid:0] [] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2012-07-10 03:33:37 MSK 13071 @ from [vxid: txid:0] [] LOG: entering standby mode 2012-07-10 03:33:37 MSK 13071 @ from [vxid:1/0 txid:0] [] LOG: redo starts at 127C/51EF6798 2012-07-10 03:34:15 MSK 13071 @ from [vxid:1/0 txid:0] [] PANIC: GIN metapage disappeared 2012-07-10 03:34:15 MSK 13071 @ from [vxid:1/0 txid:0] [] CONTEXT: xlog redo Update metapage, node: 1663/62450/27663577 blkno: 2164 2012-07-10 03:34:15 MSK 13065 @ from [vxid: txid:0] [] LOG: startup process (PID 13071) was terminated by signal 6: Aborted 2012-07-10 03:34:15 MSK 13065 @ from [vxid: txid:0] [] LOG: terminating any other active server processes and so on. I almost sure there are no disk corruption or such errors. Now some additional information from the master server: 1)1663/62450/27663577 - corresponding to the pg_default tablespace and wsfilm database. However, no entries with relfilenode=27663577 were found in pg_class. In the same time: ls -la ./9.0/main/base/62450/27663577 -rw------- 1 postgres postgres 0 Июл 10 02:29 ./9.0/main/base/62450/27663577 In wsfilm db only one GIN index exists: wsfilm=# SELECT * from pg_indexes where indexdef like '%gin%'; -[ RECORD 1 ]--------------------------------------------------- schemaname | public tablename | movie indexname | movie_fti_idx tablespace | indexdef | CREATE INDEX movie_fti_idx ON movie USING gin (fti) But it has different relfilenode: wsfilm=# SELECT relfilenode from pg_class where relname='movie_fti_idx'; -[ RECORD 1 ]--------- relfilenode | 27677714 2)Just few minutes before that index was rebuild on the master database: 2012-07-10 02:29:50 MSK 1480 postgres@wsfilm from [local] [vxid:33/3324338 txid:0] [idle] LOG: statement: CREATE INDEX CONCURRENTLY pgcompactor_tmp1436 ON movie USING gin (fti); 2012-07-10 02:29:52 MSK 1480 postgres@wsfilm from [local] [vxid:33/3324343 txid:0] [idle] LOG: statement: BEGIN; DROP INDEX public.movie_fti_idx; ALTER INDEX public.pgcompactor_tmp1436 RENAME TO movie_fti_idx; END; My theory that the wal replay trying replay GIN index information on already dropped index. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs