Simon Riggs wrote:
> > I think
> > the problem is that the existing proposal can't distinguish between
> > these two cases so the user has no idea how to respond to the report.
>
> If 99.5% of cases are real corruption then there is little need to
> distinguish between the cases, nor much value in doing so. The
> prevalence of the different error types is critical to understanding how
> to respond.
>
> If a man pulls a gun on you, your first thought isn't "some people
> remove guns from their jacket to polish them, so perhaps he intends to
> polish it now" because the prevalence of shootings is high, when faced
> by people with guns, and the risk of dying is also high. You make a
> judgement based upon the prevalence and the risk.
>
> That is all I am asking for us to do here, make a balanced call. These
> recent comments are a change in my own position, based upon evaluating
> the prevalence and the risk. I ask others to consider the same line of
> thought rather than a black/white assessment.
>
> All useful detection mechanisms have non-zero false positives because we
> would rather sometimes ring the bell for no reason than to let bad
> things through silently, as we do now.

OK, but what happens if someone gets the failure report, assumes their
hardware is faulty and replaces it, and then gets a failure report
again? I assume torn pages are 99% of the reported problem, which are
expected and are fixed, and bad hardware 1%, quite the opposite of your
numbers above.

What might be interesting is to report CRC mismatches if the database
was shut down cleanly previously; I think in those cases we shouldn't
have torn pages.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
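
To make the clean-shutdown idea concrete, here is a rough sketch of the reporting rule, not actual PostgreSQL code. The names `Verdict` and `classify_page` are hypothetical, and the sketch assumes the caller already knows whether the page's CRC matched and whether the previous shutdown was clean.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcomes for a page CRC check (illustration only)."""
    OK = "checksum matches"
    EXPECTED_TORN_PAGE = "mismatch after unclean shutdown, likely a torn page"
    SUSPECT_CORRUPTION = "mismatch after clean shutdown, report to the user"


def classify_page(crc_matches: bool, clean_shutdown: bool) -> Verdict:
    """Decide how to treat a page given its CRC result and shutdown state.

    Assumption: after a clean shutdown there should be no torn pages,
    so a mismatch in that state is worth reporting. After a crash, a
    mismatch is most likely a torn page that recovery will fix.
    """
    if crc_matches:
        return Verdict.OK
    if clean_shutdown:
        return Verdict.SUSPECT_CORRUPTION
    return Verdict.EXPECTED_TORN_PAGE


if __name__ == "__main__":
    # Walk through the three interesting combinations.
    for crc_ok, clean in [(True, True), (False, True), (False, False)]:
        verdict = classify_page(crc_ok, clean)
        print(f"crc_ok={crc_ok} clean_shutdown={clean}: {verdict.value}")
```

The point of the rule is that a report is only raised in the state where a torn page cannot explain the mismatch, so a user who sees it has a clearer idea of how to respond.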