We recently experienced a crash on our Postgres production server. Here's our server environment:
- Postgres 9.3
- in OpenVZ container
- total memory: 64GB

Here's the error snippet from the Postgres log:

    ERROR: could not read block 356121 in file "base/33134/33598.2": Bad address
    LOG: server process (PID 21119) was terminated by signal 7: Bus error
    WARNING: terminating connection because of crash of another server process
    DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally
    HINT: In a moment you should be able to reconnect to the database and repeat your command.
    LOG: all server processes terminated; reinitializing
    LOG: database system was interrupted; last known up at 2013-12-03 08:47:06 UTC
    LOG: database system was not properly shut down; automatic recovery in progress
    UTC FATAL: the database system is in recovery mode
    LOG: checkpoint complete: wrote 10499 buffers (0.7%); 0 transaction log file(s) added, 0 removed, 4 recycled; write=0.215 s, sync=11.405 s, total=11.631
    FATAL: the database system is in recovery mode
    LOG: database system is ready to accept connections

Can anyone suggest whether this is a critical error? Does it indicate any data corruption in Postgres?

Although we think it is unlikely to be related, this is what we did a few hours before the crash:

(1) Tried to improve query performance by tweaking these settings:
    a) shared_buffers: 8GB -> 16GB
    b) effective_cache_size: 16GB -> 32GB
    c) random_page_cost: 4 -> 2
    d) restarted postgres
(2) Since there was no obvious improvement in performance, changed the settings in (1) back to their previous values and restarted postgres.

Thanks in advance to anyone who has any insight.

regards,
shuwn yuan
