[PERFORM] slave wal is ahead of master
Hello, we have 3 servers with postgresql 9.3.3. One is master and two slaves. We run synchronous_replication and fsync, synchronous_commit and full_page_writes are on. Suddenly master hang up with hardware failure, it is a strange bug in iLo which we investigate with HP. Before master was rebooted, i ran ps aux on slave postgres: wal receiver process streaming 12/F1031DF8 Last messages in slaves logs was 2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed transaction was at log time 2014-03-19 02:41:28.886869+00 and then there was silence, because master hang. Then master was rebooted and slave wrote in log 2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout, 2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8, 2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 GMT,,0,FATAL,XX000,could not connect to the primary server: could not connect to server: No route to host Is the server running on host 10.162.2.50 and accepting TCP/IP connections on port 5432? , Then master finally came back, slave wrote 2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL: the database system is starting up , 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR: requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8 , last message was repeated several times and then this happened 2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator command, 2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,” and it just repeats forever. Meanwhile on master 2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 GMT,,0,LOG,0,database system was not properly shut down; automatic recovery in progress, 2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo done at 12/F0FFFC38, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 05:02:29.273138+00, 2014-03-19 15:39:47.115 GMT,,,7115,,5329ba32.1bcb,6,,2014-03-19 15:39:30 GMT,,0,LOG,0,checkpoint starting: end-of-recovery immediate, 2014-03-19 15:40:16.466 GMT,replicator,,7986,10.162.2.52:44336,5329ba5e.1f32,1,idle,2014-03-19 15:40:14 GMT,2/0,0,ERROR,XX000,requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8,walreceiver So, all two slaves are disconnected from master, which somehow is past his slaves. I decided to promote one of the slaves, so we can have some snapshot of the data. relevant logs from this are 2014-03-19 16:50:43.118 GMT,,,,,5329cae3.115c,3,,2014-03-19 16:50:43 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8, 2014-03-19 16:50:50.028 GMT,dboperator,postgres,4452,[local],5329caea.1164,1,,2014-03-19 16:50:50 GMT,,0,FATAL,57P03,the database system is starting up, 2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,4,,2014-03-19 16:50:43 GMT,,0,LOG,0,invalid contrecord length 5736 at 12/F0FFFC80, 2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,5,,2014-03-19 16:50:43 GMT,,0,LOG,0,redo done at 12/F0FFFC38,”” It is interesting that redo done at 12/F0FFFC38 both on master and promoted slave. The main question is there is actual latest data, and how is it possible that master is behind his slave in synchronous replication. Thanks for the help. -- Sent via
Re: [PERFORM] slave wal is ahead of master
fsync ... taking more knowledge around this will shed some light to understand this problem slave ahead of master there was silence, because master hang. ... replication halted here, master holds the latest copy which is missing at both the slaves I decided to promote one of the slaves ... only 2 slaves are left, and one among them is going to be the master now, in master/slave nomenclature the data from master is considered as the valid one from this point onward master is behind his slave ... you mentioned that the original master comes up in the mean time one of the slave was already a master On Wed, Mar 19, 2014 at 10:40 PM, Evgeniy Shishkin itparan...@gmail.comwrote: Hello, we have 3 servers with postgresql 9.3.3. One is master and two slaves. We run synchronous_replication and fsync, synchronous_commit and full_page_writes are on. Suddenly master hang up with hardware failure, it is a strange bug in iLo which we investigate with HP. Before master was rebooted, i ran ps aux on slave postgres: wal receiver process streaming 12/F1031DF8 Last messages in slaves logs was 2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed transaction was at log time 2014-03-19 02:41:28.886869+00 and then there was silence, because master hang. Then master was rebooted and slave wrote in log 2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout, 2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8, 2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 GMT,,0,FATAL,XX000,could not connect to the primary server: could not connect to server: No route to host Is the server running on host 10.162.2.50 and accepting TCP/IP connections on port 5432? , Then master finally came back, slave wrote 2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL: the database system is starting up , 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR: requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8 , last message was repeated several times and then this happened 2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator command, 2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, and it just repeats forever. Meanwhile on master 2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 GMT,,0,LOG,0,database system was not properly shut down; automatic recovery in progress, 2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo done at 12/F0FFFC38, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 05:02:29.273138+00, 2014-03-19 15:39:47.115 GMT,,,7115,,5329ba32.1bcb,6,,2014-03-19 15:39:30 GMT,,0,LOG,0,checkpoint starting: end-of-recovery immediate, 2014-03-19 15:40:16.466 GMT,replicator,,7986,10.162.2.52:44336,5329ba5e.1f32,1,idle,2014-03-19 15:40:14 GMT,2/0,0,ERROR,XX000,requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8,walreceiver So, all two slaves are disconnected from master, which somehow is past his slaves. I decided to promote one of the slaves, so we can have some snapshot of the data. relevant logs from this are 2014-03-19 16:50:43.118 GMT,,,,,5329cae3.115c,3,,2014-03-19 16:50:43 GMT,,0,LOG,0,redo starts
Re: [PERFORM] slave wal is ahead of master
On 19 Mar 2014, at 21:11, Sethu Prasad sethuprasad...@gmail.com wrote: fsync ... taking more knowledge around this will shed some light to understand this problem slave ahead of master there was silence, because master hang. ... replication halted here, master holds the latest copy which is missing at both the slaves Master hang in very strange way. I could ssh to it, and see dmesg, but not any other command. Also tcp connection was alive to slaves. So we can’t say that slaves did not receive data. I decided to promote one of the slaves ... only 2 slaves are left, and one among them is going to be the master now, in master/slave nomenclature the data from master is considered as the valid one from this point onward There was no any failover procedure. Slaves was slaves. master is behind his slave ... you mentioned that the original master comes up in the mean time one of the slave was already a master No, after we rebooted master slaves did not reconnected to it. Later i stopped replication on one of the slaves to preserve data its state. So the main question is, under which circumstances slaves can not reconnect to master with error that master is behind. With fsync on, and synchronous* on. On Wed, Mar 19, 2014 at 10:40 PM, Evgeniy Shishkin itparan...@gmail.com wrote: Hello, we have 3 servers with postgresql 9.3.3. One is master and two slaves. We run synchronous_replication and fsync, synchronous_commit and full_page_writes are on. Suddenly master hang up with hardware failure, it is a strange bug in iLo which we investigate with HP. Before master was rebooted, i ran ps aux on slave postgres: wal receiver process streaming 12/F1031DF8 Last messages in slaves logs was 2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed transaction was at log time 2014-03-19 02:41:28.886869+00 and then there was silence, because master hang. Then master was rebooted and slave wrote in log 2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout, 2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8, 2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 GMT,,0,FATAL,XX000,could not connect to the primary server: could not connect to server: No route to host Is the server running on host 10.162.2.50 and accepting TCP/IP connections on port 5432? , Then master finally came back, slave wrote 2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL: the database system is starting up , 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR: requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8 , last message was repeated several times and then this happened 2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1, 2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator command, 2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8, 2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,” and it just repeats forever. Meanwhile on master 2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 GMT,,0,LOG,0,database system was not properly shut down; automatic recovery in progress, 2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo done at 12/F0FFFC38, 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 05:02:29.273138+00, 2014-03-19
Re: [PERFORM] Query taking long time
On Wed, Mar 19, 2014 at 10:09 PM, acanada acan...@cnio.es wrote: Hello, First of all I'd like to thank all of you for taking your time and help me with this. Thank you very much. I did migrate the database to the new server with 32 processors Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz and 60GB of RAM. Evegeny pointed that the disks I am using are not fast enough (For data: 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05); and for logging a SAS disk but with only 240GB available, database is 365GB...). I cannot change the locations of data and log since there's not enough space for the data in the SAS disk. Sadly this is a problem that I cannot solve any time soon... The migration had really improved the performance I paste the before and after (the migration) explain analyze, buffers(if aplicable due to server versions) BEFORE: explain analyze select * from (select * from entity2document2 where name='Acetaminophen' ) as a order by a.hepval; QUERY PLAN -- Sort (cost=18015.66..18027.15 rows=4595 width=139) (actual time=39755.942..39756.246 rows=2845 loops=1) Sort Key: entity2document2.hepval Sort Method: quicksort Memory: 578kB - Bitmap Heap Scan on entity2document2 (cost=116.92..17736.15 rows=4595 width=139) (actual time=45.682..39751.255 rows=2845 loops=1) Recheck Cond: ((name)::text = 'Acetaminophen'::text) - Bitmap Index Scan on entity2document2_name (cost=0.00..115.77 rows=4595 width=0) (actual time=45.124..45.124 rows=2845 loops=1) Index Cond: ((name)::text = 'Acetaminophen'::text) Total runtime: 39756.507 ms AFTER: explain (analyze,buffers) select * from (select * from entity2document2 where name='Acetaminophen' ) as a order by a.hepval; QUERY PLAN Sort (cost=18434.76..18446.51 rows=4701 width=131) (actual time=9196.634..9196.909 rows=2845 loops=1) Sort Key: entity2document2.hepval Sort Method: quicksort Memory: 604kB Buffers: shared hit=4 read=1725 - Bitmap Heap Scan on entity2document2 (cost=105.00..18148.03 rows=4701 width=131) (actual time=38.668..9190.318 rows=2845 loops=1) Recheck Cond: ((name)::text = 'Acetaminophen'::text) Buffers: shared hit=4 read=1725 - Bitmap Index Scan on entity2documentnew_name (cost=0.00..103.82 rows=4701 width=0) (actual time=30.905..30.905 rows=2845 loops=1) Index Cond: ((name)::text = 'Acetaminophen'::text) Buffers: shared hit=1 read=14 Total runtime: 9197.186 ms The improve is definitely good!!. This is the table that I'm using: \d+ entity2document2; Table public.entity2document2 Column | Type | Modifiers | Storage | Stats target | Description --++---+--+--+- id | integer| not null | plain | | document_id | integer| | plain | | name | character varying(255) | not null | extended | | qualifier| character varying(255) | not null | extended | | tagMethod| character varying(255) | | extended | | created | timestamp(0) without time zone | not null | plain | | updated | timestamp(0) without time zone | | plain | | curation | integer| | plain | | hepval | double precision | | plain | | cardval | double precision | | plain | | nephval | double precision | | plain | | phosval | double precision | | plain | | patternCount | double precision | | plain | | ruleScore| double precision | | plain | | hepTermNormScore | double precision | | plain | | hepTermVarScore | double precision | | plain | | svmConfidence| double precision | | plain | | Indexes: ent_pkey PRIMARY KEY, btree (id) ent_cardval