One of servers under my support 2 days ago produced the next error: ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
Server version 9.0.6. No db or server crashes or db recovery happen on that server since setup. Server is db backend for very large web project under quite a heavy load (10k+ request per second to db). No other strange errors found in the logs since server was put to production half year ago. Now what happened (from DB log): 2012-02-17 22:35:58 MSK 14333 [vxid:340/18229999 txid:2341883282] [DELETE] LOG: duration: 5.669 ms execute <unnamed>: delete from agency_statistics 2012-02-17 22:35:58 MSK 14333 [vxid:340/18230000 txid:0] [BIND] LOG: duration: 2.787 ms bind <unnamed>: insert into agency_statistics (employer_id, area_id, area_pa th, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer _id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join spec ialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id 2012-02-17 22:35:59 MSK 14333 [vxid:340/18230000 txid:2341883336] [INSERT] LOG: duration: 726.700 ms execute <unnamed>: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join specialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id 2012-02-17 22:35:59 MSK 14333 [vxid:340/18230070 txid:0] [BIND] LOG: duration: 30.195 ms bind S_35: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0) 2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes 2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] STATEMENT: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0) No concurrent writes was happened to that table outside of that connection. May be autovacuum were fired on that table between delete and first or second inserts (unfortunately it wasn't logged). I recreated table from scratch and keep the damaged table under another name (through alter table agency_statistics rename to agency_statistics_old). So I have files to dig into. What I see in file system: hh=# SELECT relfilenode from pg_class where relname='agency_statistics_old'; relfilenode ------------- 118881486 postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881486 -rw------- 1 postgres postgres 0 2012-02-20 12:04 /var/lib/postgresql/9.0/main/base/16404/118881486 So table file size zero bytes (seems autovacuum truncated that table to 0 bytes). But index show: hh=# SELECT relfilenode from pg_class where relname='agency_statistics_pkey'; relfilenode ------------- 118881489 postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881489 -rw------- 1 postgres postgres 4956160 2012-02-20 12:07 /var/lib/postgresql/9.0/main/base/16404/118881489 hh=# SELECT * from pgstatindex('agency_statistics_pkey'); -[ RECORD 1 ]------+-------- version | 2 tree_level | 2 index_size | 4947968 root_block_no | 295 internal_pages | 1 leaf_pages | 601 empty_pages | 0 deleted_pages | 1 avg_leaf_density | 0.45 leaf_fragmentation | 13.14 I out of ideas now. Any suggestions where and what I should look next? Kind Regards, Maksym -- Maxim Boguk Senior Postgresql DBA. Phone RU: +7 910 405 4718 Phone AU: +61 45 218 5678 Skype: maxim.boguk Jabber: maxim.bo...@gmail.com LinkedIn profile: http://nz.linkedin.com/in/maximboguk "If they can send one man to the moon... why can't they send them all?" МойКруг: http://mboguk.moikrug.ru/ "People problems are solved with people. If people cannot solve the problem, try technology. People will then wish they'd listened at the first stage."