Hey folks,
I have a Postgres server running on Ubuntu 12 (Intel Xeon, 8 CPUs, 29 GB RAM)
with the following settings:
max_connections = 550
shared_buffers = 12GB
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 50MB
maintenance_work_mem = 1GB
fsync = on
wal_buffers = 16MB
commit_delay = 50  
commit_siblings = 7
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
autovacuum = on
autovacuum_vacuum_threshold = 1800
autovacuum_analyze_threshold = 900
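
For reference, here is a rough sketch of the worst-case memory these settings could allow. It assumes every one of the max_connections backends uses exactly one work_mem allocation at the same time, which is a simplification (a single query can use several), and it ignores maintenance_work_mem and OS overhead. The helper name worst_case_mb is just for illustration.

```python
# Rough worst-case memory estimate from the settings listed above.
# Assumption: each connection uses one work_mem allocation at a time;
# a single query can use several, so the real peak could be higher.

MB = 1
GB = 1024 * MB

settings = {
    "max_connections": 550,
    "shared_buffers": 12 * GB,
    "work_mem": 50 * MB,
}
ram = 29 * GB


def worst_case_mb(s):
    """Shared buffers plus one work_mem per allowed connection, in MB."""
    return s["shared_buffers"] + s["max_connections"] * s["work_mem"]


total = worst_case_mb(settings)
print(f"worst case: {total / GB:.1f} GB of {ram / GB:.0f} GB RAM")
```

Under these assumptions the total comes out above the 29 GB of physical RAM, so memory pressure may be worth ruling out.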

I am doing a lot of writes to the DB from 40 different threads. Each thread
checks whether a record exists: if not, it inserts the record; if it does,
it updates the record.
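
The per-thread logic is roughly the following sketch. To keep it runnable without a server it uses Python's built-in sqlite3 module as a stand-in for Postgres; the table name records, its columns, and the function save_record are hypothetical and only illustrate the check-then-insert-or-update pattern.

```python
import sqlite3

# Stand-in database: sqlite3 is used only so the sketch runs without a
# server; the real setup is Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")


def save_record(conn, rec_id, payload):
    """Check whether the record exists, then insert or update it."""
    with conn:  # commits on success, rolls back on an exception
        row = conn.execute(
            "SELECT 1 FROM records WHERE id = ?", (rec_id,)
        ).fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO records (id, payload) VALUES (?, ?)",
                (rec_id, payload),
            )
            return "inserted"
        conn.execute(
            "UPDATE records SET payload = ? WHERE id = ?", (payload, rec_id)
        )
        return "updated"


print(save_record(conn, 1, "first"))
print(save_record(conn, 1, "second"))
```

Note that with many concurrent writers, two threads can both find no row and both try to insert, so this pattern needs a unique key and a retry on conflict.
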
During these updates, disk I/O is almost always at 100%, and sometimes the DB
crashes with the following message:

2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was
terminated by signal 9: Killed
2013-08-19 03:18:00 UTC LOG:  terminating any other active server processes
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect
to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect
to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.

My DB is not very big: 169 GB.

Does anyone know how I can get rid of these DB crashes?


Thanks,
  Dzmitry


