* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Sure, it might be easy, but we don't have it. Personally I think
> > checksums just aren't even ready for prime time. If we had:
> > - ability to switch on/off at runtime (early patches for that have IIRC
> >   been posted)
> > - *builtin* tooling to check checksums for everything
> > - *builtin* tooling to compute checksums after changing setting
> > - configurable background sweeps for checksums
>
> Yeah, and there's a bunch of usability tooling that we don't have,
> centered around "what do you do after you get a checksum error?".
> AFAIK there's no way to check or clear such an error; but without
> such tools, I'm afraid that checksums are as much of a foot-gun
> as a benefit.

Uh, ignore_checksum_failure and zero_damaged_pages ...? Not that I'd
suggest flipping those on for your production database the first time
you see a checksum failure, but we aren't completely without a way to
address such cases. Or the to-be-implemented ability to disable
checksums for a cluster.

> I think developing all this stuff is a good long-term activity,
> but I'm hesitant about turning checksums loose on the average
> user before we have it.

What I dislike about this stance is that it just means we're going to
have more and more systems out there that won't have checksums enabled,
and there's not going to be an easy way to fix that.

> To draw a weak analogy, our checksums right now are more or less
> where our replication was in 9.1 --- we had it, but there were still
> an awful lot of rough edges. It was only just in this past release
> cycle that we started to change default settings towards the idea
> that they should support replication by default. I think the checksum
> tooling likewise needs years of maturation before we can say that it's
> realistically ready to be the default.

Well, checksums were introduced in 9.3, which would mean that this is
really only being proposed a year earlier than the replication timeline
case, if I'm following correctly. I do agree that checksums have not
seen quite as much love as the replication work, though I'm tempted to
argue that's because they aren't an "interesting" feature now that we've
got them, but even those uninteresting features really need someone to
champion them when they're important. Unfortunately, our situation with
checksums does make me feel a bit like they were added to satisfy a
check-box requirement and then not really developed very much further.

Following your weak analogy though, it's not like users are going to
have to dump/restore their entire cluster to change their systems to
take advantage of the new replication capabilities.

Thanks!

Stephen