top and sar show 100% CPU usage on one core, with no sign of I/O wait. The
database is 1.5TB in size, so it does not fit in RAM. The master has 145GB
of RAM; the slaves differ: some have about 16GB, others have 145GB as well.

Nothing suspicious in the standby's PostgreSQL log.

In the master's PostgreSQL log:
WARNING,01000,"pgstat wait timeout",,,,,,,,,""
ERROR,57014,"canceling autovacuum task",,,,,"automatic vacuum of
table ""consprod._consprod_replication.sl_event""",,,,""
ERROR,57014,"canceling statement due to statement timeout",,,,,,"
"PARSE",2014-06-26 00:39:35 CDT,91/0,0,ERROR,25P02,"current transaction is
aborted, commands ignored until end of transaction block",,,,,,"select
1",,,""
"could not receive data from client: Connection reset by peer",,,,,,,,,""

The log files are big, though. If you can specify some pattern to look for
in the log, that would really help.
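
One way to narrow a log this size down before searching for specific
patterns is to count entries by severity and SQLSTATE. Below is a minimal
sketch, not a tested tool. It assumes the log is in PostgreSQL's csvlog
format with error_severity, sql_state_code and message as the 12th, 13th
and 14th columns, which matches the fragments above; the function name
summarize_csvlog is made up for illustration.

```python
import csv
from collections import Counter

# 0-based column positions, assuming the csvlog column order
# (..., transaction_id, error_severity, sql_state_code, message, ...).
SEVERITY, SQLSTATE, MESSAGE = 11, 12, 13


def summarize_csvlog(lines):
    """Count log entries by (severity, SQLSTATE, message).

    `lines` is any iterable of text lines, e.g. a file opened with
    newline="" so that quoted multi-line fields are parsed correctly.
    """
    counts = Counter()
    for row in csv.reader(lines):
        if len(row) <= MESSAGE:
            continue  # skip short or malformed rows
        counts[(row[SEVERITY], row[SQLSTATE], row[MESSAGE])] += 1
    return counts
```

Calling it on an open log file and printing counts.most_common(20) should
show which messages dominate, which may suggest what to grep for next.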


On Sun, Jun 29, 2014 at 3:31 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:

> On 06/29/2014 11:14 AM, Soni M wrote:
>
>> Everything worked fine until Thursday, when we had high load on the master.
>> Since then every streaming replica has lagged further behind the master,
>> even at night and on weekends, when the load on all servers is low. The
>> Slony slave, however, is fine.
>>
>
> What does 'top' on the standby say? Is the startup process using 100% of
> (one) CPU replaying records, or is it waiting for I/O? How large is the
> database, does it fit in RAM? Any clues in the system or PostgreSQL logs?
>
> - Heikki
>
>


-- 
Regards,

Soni Maula Harriz
