On Tue, Mar 20, 2018 at 09:54:07AM +0100, Christoph Berg wrote:
> Otherwise, +1 from me.

I have been thinking about this patch lately, and on the contrary I am voting -1 for the concepts behind it. pg_dump is by nature a tool aimed at fetching data from the database, at putting it in the shape wanted by the user, and optionally at saving the data and making it durable (the last part only recently). It is not a corruption detection tool.

Even if it were a tool to detect corruption, it is doing it wrong in two ways:
1) It scans tables using only sequential scans, so it basically never checks any AMs other than heap.
2) It detects only one failure at a time and stops. Hence, in order to detect all corruptions, one needs to run pg_dump, repair or zero the pages, and then rinse and repeat until a successful run is achieved. This is a costly process, particularly on large relations, where a run of pg_dump can take minutes, and the more pages there are, the longer it takes to do the whole cleanup before being able to save as much data as possible.

Now, why are people using pg_dump > /dev/null? Mainly for lack of better tools, ones which would actually be able to detect all corrupted pages in one run, and not only heap pages. I honestly think that amcheck is something we should focus on more, as it has more potential on the matter, and that we are just complicating pg_dump to do something it is not designed for and would do badly anyway.
--
Michael