[PERFORM] slave wal is ahead of master

2014-03-19 Thread Evgeniy Shishkin
Hello, 

we have 3 servers with postgresql 9.3.3. One is master and two slaves.
We run synchronous_replication and fsync, synchronous_commit and 
full_page_writes are on.

Suddenly master hang up with hardware failure, it is a strange bug in iLo which 
we investigate with HP.

Before master was rebooted, i ran ps aux on slave
postgres: wal receiver process   streaming 12/F1031DF8

Last messages in slaves logs was
2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 
GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed 
transaction was at log time 2014-03-19 02:41:28.886869+00

and then there was silence, because master hang.

Then master was rebooted and slave wrote in log
2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 
GMT,,0,FATAL,XX000,terminating walreceiver due to timeout,
2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 
GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8,
2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 
GMT,,0,FATAL,XX000,could not connect to the primary server: could not connect 
to server: No route to host
Is the server running on host 10.162.2.50 and accepting
TCP/IP connections on port 5432?
,

Then master finally came back, slave wrote
2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 
GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL:  the 
database system is starting up
,
2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 
GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 
1,
2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 
GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR:  requested 
starting point 12/F100 is ahead of the WAL flush position of this server 
12/F0FFFCE8
,

last message was repeated several times
and then this happened

2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 
GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 
1,
2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 
GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 
GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator 
command,
2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 
GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 
GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 
GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,” 
and it just repeats forever.


Meanwhile on master
2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 
GMT,,0,LOG,0,database system was not properly shut down; automatic 
recovery in progress,
2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 
GMT,,0,LOG,0,redo starts at 12/DFFBB3E8,
2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 
GMT,,0,LOG,0,redo done at 12/F0FFFC38,
2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 
GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 
05:02:29.273138+00,
2014-03-19 15:39:47.115 GMT,,,7115,,5329ba32.1bcb,6,,2014-03-19 15:39:30 
GMT,,0,LOG,0,checkpoint starting: end-of-recovery immediate,
2014-03-19 15:40:16.466 
GMT,replicator,,7986,10.162.2.52:44336,5329ba5e.1f32,1,idle,2014-03-19 
15:40:14 GMT,2/0,0,ERROR,XX000,requested starting point 12/F100 is ahead 
of the WAL flush position of this server 12/F0FFFCE8,walreceiver

So, all two slaves are disconnected from master, which somehow is past his 
slaves.

I decided to promote one of the slaves, so we can have some snapshot of the 
data.
relevant logs from this are 
2014-03-19 16:50:43.118 GMT,,,,,5329cae3.115c,3,,2014-03-19 16:50:43 
GMT,,0,LOG,0,redo starts at 12/DFFBB3E8,
2014-03-19 16:50:50.028 
GMT,dboperator,postgres,4452,[local],5329caea.1164,1,,2014-03-19 
16:50:50 GMT,,0,FATAL,57P03,the database system is starting up,
2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,4,,2014-03-19 16:50:43 
GMT,,0,LOG,0,invalid contrecord length 5736 at 12/F0FFFC80,
2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,5,,2014-03-19 16:50:43 
GMT,,0,LOG,0,redo done at 12/F0FFFC38,””

It is interesting that redo done at 12/F0FFFC38 both on master and promoted 
slave.

The main question is there is actual latest data, and how is it possible that 
master is behind his slave in synchronous replication.

Thanks for the help.

-- 
Sent via 

Re: [PERFORM] slave wal is ahead of master

2014-03-19 Thread Sethu Prasad
fsync
... taking more knowledge around this will shed some light to understand
this problem slave ahead of master

there was silence, because master hang.
... replication halted here, master holds the latest copy which is missing
at both the slaves

I decided to promote one of the slaves
... only 2 slaves are left, and one among them is going to be the master
now, in master/slave nomenclature the data from master is considered as the
valid one from this point onward

master is behind his slave
... you mentioned that the original master comes up in the mean time one of
the slave was already a master


On Wed, Mar 19, 2014 at 10:40 PM, Evgeniy Shishkin itparan...@gmail.comwrote:

 Hello,

 we have 3 servers with postgresql 9.3.3. One is master and two slaves.
 We run synchronous_replication and fsync, synchronous_commit and
 full_page_writes are on.

 Suddenly master hang up with hardware failure, it is a strange bug in iLo
 which we investigate with HP.

 Before master was rebooted, i ran ps aux on slave
 postgres: wal receiver process   streaming 12/F1031DF8

 Last messages in slaves logs was
 2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28
 13:17:29 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last
 completed transaction was at log time 2014-03-19
 02:41:28.886869+00

 and then there was silence, because master hang.

 Then master was rebooted and slave wrote in log
 2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29
 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout,
 2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29
 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8,
 2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54
 GMT,,0,FATAL,XX000,could not connect to the primary server: could not
 connect to server: No route to host
 Is the server running on host 10.162.2.50 and accepting
 TCP/IP connections on port 5432?
 ,

 Then master finally came back, slave wrote
 2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09
 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL:  the
 database system is starting up
 ,
 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14
 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on
 timeline 1,
 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14
 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR:
  requested starting point 12/F100 is ahead of the WAL flush position of
 this server 12/F0FFFCE8
 ,

 last message was repeated several times
 and then this happened

 2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04
 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on
 timeline 1,
 2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04
 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator
 command,
 2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 and it just repeats forever.


 Meanwhile on master
 2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30
 GMT,,0,LOG,0,database system was not properly shut down; automatic
 recovery in progress,
 2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30
 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8,
 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30
 GMT,,0,LOG,0,redo done at 12/F0FFFC38,
 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30
 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19
 05:02:29.273138+00,
 2014-03-19 15:39:47.115 GMT,,,7115,,5329ba32.1bcb,6,,2014-03-19 15:39:30
 GMT,,0,LOG,0,checkpoint starting: end-of-recovery immediate,
 2014-03-19 15:40:16.466 
 GMT,replicator,,7986,10.162.2.52:44336,5329ba5e.1f32,1,idle,2014-03-19
 15:40:14 GMT,2/0,0,ERROR,XX000,requested starting point 12/F100 is
 ahead of the WAL flush position of this server
 12/F0FFFCE8,walreceiver

 So, all two slaves are disconnected from master, which somehow is past his
 slaves.

 I decided to promote one of the slaves, so we can have some snapshot of
 the data.
 relevant logs from this are
 2014-03-19 16:50:43.118 GMT,,,,,5329cae3.115c,3,,2014-03-19 16:50:43
 GMT,,0,LOG,0,redo starts 

Re: [PERFORM] slave wal is ahead of master

2014-03-19 Thread Evgeny Shishkin

On 19 Mar 2014, at 21:11, Sethu Prasad sethuprasad...@gmail.com wrote:

 fsync
 ... taking more knowledge around this will shed some light to understand this 
 problem slave ahead of master
 
 there was silence, because master hang.
 ... replication halted here, master holds the latest copy which is missing at 
 both the slaves
 

Master hang in very strange way. I could ssh to it, and see dmesg, but not any 
other command. Also tcp connection was alive to slaves.
So we can’t say that slaves did not receive data.


 I decided to promote one of the slaves
 ... only 2 slaves are left, and one among them is going to be the master now, 
 in master/slave nomenclature the data from master is considered as the valid 
 one from this point onward

There was no any failover procedure. Slaves was slaves. 


 
 master is behind his slave
 ... you mentioned that the original master comes up in the mean time one of 
 the slave was already a master
 

No, after we rebooted master slaves did not reconnected to it.
Later i stopped replication on one of the slaves to preserve data its state.


So the main question is, under which circumstances slaves can not reconnect to 
master with error that master is behind.
With fsync on, and synchronous* on.

 
 On Wed, Mar 19, 2014 at 10:40 PM, Evgeniy Shishkin itparan...@gmail.com 
 wrote:
 Hello,
 
 we have 3 servers with postgresql 9.3.3. One is master and two slaves.
 We run synchronous_replication and fsync, synchronous_commit and 
 full_page_writes are on.
 
 Suddenly master hang up with hardware failure, it is a strange bug in iLo 
 which we investigate with HP.
 
 Before master was rebooted, i ran ps aux on slave
 postgres: wal receiver process   streaming 12/F1031DF8
 
 Last messages in slaves logs was
 2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 
 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed 
 transaction was at log time 2014-03-19 02:41:28.886869+00
 
 and then there was silence, because master hang.
 
 Then master was rebooted and slave wrote in log
 2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 
 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout,
 2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 
 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8,
 2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 
 GMT,,0,FATAL,XX000,could not connect to the primary server: could not 
 connect to server: No route to host
 Is the server running on host 10.162.2.50 and accepting
 TCP/IP connections on port 5432?
 ,
 
 Then master finally came back, slave wrote
 2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 
 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL:  the 
 database system is starting up
 ,
 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 
 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on 
 timeline 1,
 2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 
 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR:  requested 
 starting point 12/F100 is ahead of the WAL flush position of this server 
 12/F0FFFCE8
 ,
 
 last message was repeated several times
 and then this happened
 
 2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 
 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on 
 timeline 1,
 2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 
 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator 
 command,
 2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,
 2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 
 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8,”
 and it just repeats forever.
 
 
 Meanwhile on master
 2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 
 GMT,,0,LOG,0,database system was not properly shut down; automatic 
 recovery in progress,
 2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 
 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8,
 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 
 GMT,,0,LOG,0,redo done at 12/F0FFFC38,
 2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 
 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 
 05:02:29.273138+00,
 2014-03-19 

Re: [PERFORM] Query taking long time

2014-03-19 Thread Venkata Balaji Nagothi
On Wed, Mar 19, 2014 at 10:09 PM, acanada acan...@cnio.es wrote:

Hello,

 First of all I'd like to thank all of you for taking your time and help me
 with this. Thank you very much.

 I did migrate the database to the new server with 32 processors Intel(R)
 Xeon(R) CPU E5-2670 0 @ 2.60GHz  and 60GB of RAM.
 Evegeny pointed that the disks I am using are not fast enough (For
 data: 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series
 chipset SATA RAID Controller (rev 05); and for logging a SAS disk but with
 only 240GB available, database is 365GB...). I cannot change the locations
 of data and log since there's not enough space for the data in the SAS
 disk.  Sadly this is a problem that I cannot solve any time soon...

 The migration had really improved the performance
 I paste the before and after (the migration) explain analyze, buffers(if
 aplicable due to server versions)

 BEFORE:
 explain analyze select * from (select * from entity2document2  where
 name='Acetaminophen' ) as a  order by a.hepval;
   QUERY
 PLAN

 --
  Sort  (cost=18015.66..18027.15 rows=4595 width=139) (actual
 time=39755.942..39756.246 rows=2845 loops=1)
Sort Key: entity2document2.hepval
Sort Method:  quicksort  Memory: 578kB
-  Bitmap Heap Scan on entity2document2  (cost=116.92..17736.15
 rows=4595 width=139) (actual time=45.682..39751.255 rows=2845 loops=1)
  Recheck Cond: ((name)::text = 'Acetaminophen'::text)
  -  Bitmap Index Scan on entity2document2_name
  (cost=0.00..115.77 rows=4595 width=0) (actual time=45.124..45.124
 rows=2845 loops=1)
Index Cond: ((name)::text = 'Acetaminophen'::text)
  Total runtime: 39756.507 ms

  AFTER:
  explain (analyze,buffers) select * from (select * from entity2document2
  where name='Acetaminophen' ) as a  order by a.hepval;
QUERY
 PLAN

 
  Sort  (cost=18434.76..18446.51 rows=4701 width=131) (actual
 time=9196.634..9196.909 rows=2845 loops=1)
Sort Key: entity2document2.hepval
Sort Method: quicksort  Memory: 604kB
Buffers: shared hit=4 read=1725
-  Bitmap Heap Scan on entity2document2  (cost=105.00..18148.03
 rows=4701 width=131) (actual time=38.668..9190.318 rows=2845 loops=1)
  Recheck Cond: ((name)::text = 'Acetaminophen'::text)
  Buffers: shared hit=4 read=1725
  -  Bitmap Index Scan on entity2documentnew_name
  (cost=0.00..103.82 rows=4701 width=0) (actual time=30.905..30.905
 rows=2845 loops=1)
Index Cond: ((name)::text = 'Acetaminophen'::text)
Buffers: shared hit=1 read=14
  Total runtime: 9197.186 ms

 The improve is definitely good!!.
 This is the table that I'm using:
 \d+ entity2document2;
 Table public.entity2document2
   Column  |  Type  | Modifiers | Storage
  | Stats target | Description

 --++---+--+--+-
  id   | integer| not null  | plain
  |  |
  document_id  | integer|   | plain
  |  |
  name | character varying(255) | not null  | extended
 |  |
  qualifier| character varying(255) | not null  | extended
 |  |
  tagMethod| character varying(255) |   | extended
 |  |
  created  | timestamp(0) without time zone | not null  | plain
  |  |
  updated  | timestamp(0) without time zone |   | plain
  |  |
  curation | integer|   | plain
  |  |
  hepval   | double precision   |   | plain
  |  |
  cardval  | double precision   |   | plain
  |  |
  nephval  | double precision   |   | plain
  |  |
  phosval  | double precision   |   | plain
  |  |
  patternCount | double precision   |   | plain
  |  |
  ruleScore| double precision   |   | plain
  |  |
  hepTermNormScore | double precision   |   | plain
  |  |
  hepTermVarScore  | double precision   |   | plain
  |  |
  svmConfidence| double precision   |   | plain
  |  |
 Indexes:
 ent_pkey PRIMARY KEY, btree (id)
 ent_cardval