I have done some more checking, and iostat shows that the pg_xlog drive is 
significantly busier than before.  When we first spun up the server it was 
barely busy (total %busy since we started the server is about 6%), but it now 
sits in the 80s almost constantly.  We have a set of partitioned tables which 
are continuously updated, and judging by their size they no longer fit in the 
shared memory that was allocated.  pg_xlog is on a SAS RAID 1.  The server is 
set up with streaming replication to an identical server.  The one thing I 
just checked is the RAID mode on the server:

db1# megacli64 -LDInfo -lAll -aAll
                                     

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 2.452 TB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives per span:2
Span Depth          : 6
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Access Policy       : Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None


Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 418.656 GB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Access Policy       : Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None

Is it possible that the WriteThrough cache policy is what is causing the high 
I/O (and maybe the pgstat wait timeouts as well)?  If so, would it be safe to 
change the cache policy to WriteBack?
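
To make the mismatch easier to spot on both arrays, here is a minimal sketch 
of a filter for the output above.  The function name cache_policy_mismatch is 
my own invention, and it assumes the "Virtual Drive:", "Default Cache Policy" 
and "Current Cache Policy" lines look exactly as pasted; other MegaCLI 
versions may format them differently.

```shell
#!/bin/sh
# Hypothetical helper: reads `megacli64 -LDInfo -lAll -aAll` output on stdin
# and prints each virtual drive whose current write policy differs from its
# default write policy (the first comma-separated item on each policy line).
cache_policy_mismatch() {
    awk -F': *' '
        /^Virtual Drive:/       { vd = $2; sub(/ .*/, "", vd) }
        /^Default Cache Policy/ { split($2, d, ","); def = d[1] }
        /^Current Cache Policy/ {
            split($2, c, ",")
            if (c[1] != def)
                print "VD " vd ": default=" def " current=" c[1]
        }
    '
}
```

It would be used as: megacli64 -LDInfo -lAll -aAll | cache_policy_mismatch.  
Since both drives list "No Write Cache if Bad BBU", I suspect the controller 
fell back to WriteThrough because of the battery state, but I have not 
confirmed that yet.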

Additionally, and somewhat unrelated, is there anything special which I need to 
do when restarting the primary server vis-à-vis the streaming replication 
server?  In other words, if I were to restart the main server, will the 
streaming replication server reconnect and pick up once the primary comes 
online?
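
For context, the check I would plan to run on the primary after the restart 
is roughly the following.  This is only a sketch: the function name 
standby_connected is made up, and it assumes PostgreSQL 9.1 or later (for the 
pg_stat_replication view) and that psql can connect locally as a superuser 
without extra options.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (and prints the matching rows) if at least
# one standby is currently streaming from this primary.
# Assumes pg_stat_replication is available (PostgreSQL 9.1+).
standby_connected() {
    rows=$(psql -At -c "SELECT client_addr, state FROM pg_stat_replication WHERE state = 'streaming'")
    # Empty output (no standby, or psql failed) yields a non-zero status.
    [ -n "$rows" ] && printf '%s\n' "$rows"
}
```

If that returns nothing after the primary has been back up for a while, I 
would take it to mean the standby did not reconnect on its own.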

