* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Jan 25, 2017 at 6:30 PM, Stephen Frost <sfr...@snowman.net> wrote:
> > I hope to discuss it further after we have the ability to turn it off
> > easily.
> I think we should have the ability to flip it in BOTH directions easily.

Presumably you mean "before we enable it by default."  I'm
not sure that I can agree with that, but we haven't got it in either
direction yet, so it's not terribly interesting to discuss that
particular "what if."

> It sounds to me like you are misleading users about the positives and
> negatives of checksums, which then causes them to be shocked that they
> are not the default.

I don't try to claim that they are without downsides or performance
impacts, if that's the implication here.

> > [ more unsolicited bragging about an unspecified backup tool, presumably
> > still pgbackrest ]

It was explicitly to counter the claim that there aren't things out
there which are working to actively check the checksums.

> > I'd rather walk into an engagement where the user is saying "yeah, we
> > enabled checksums and it caught this corruption issue" than having to
> > break the bad news, which I've had to do over and over, that their
> > existing system hasn't got checksums enabled.  This isn't hypothetical,
> > it's what I run into regularly with entirely reasonable and skilled
> > engineers who have been deploying PG.
> Maybe you should just stop telling them and use the time thus freed up
> to work on improving the checksum feature.

I'm working to improve the usefulness of our checksum feature in a way
which will produce practical and much more immediate results than
anything I could do today in PG.  That said, I do plan to also support
working on checksums as I'm able to.  At the moment, that's supporting
Magnus' thread about enabling them by default.  I'd be a bit surprised
if he was trying to force a change on PG because he thinks it's going to
improve things for pgbackrest, but if so, I'm not going to complain when
it seems like an entirely sensible and good change which will benefit
PG's users too.

Even better would be if we had an independent tool to check checksums
endorsed by the PG community, but that won't happen for a release cycle.
I'd also be extremely happy if the other backup tools out there grew
the ability to check checksums in PG pages; frankly, I hope that adding
it to pgbackrest will push them to do so.

> I'm skeptical of this whole discussion because you seem to be filled
> with unalloyed confidence that checksums have little performance
> impact and will do wonderful things to prevent data loss, whereas I
> think they have significant performance impact and will only very
> slightly help to prevent data loss.  

I admit that they'll have a significant performance impact in some
environments, but I think the vast majority of installations won't see
anything different, while some of them may be saved by it.  As likely as
not, that includes a number of the actual corruption issues that have
been brought up on these lists in the past few days, simply because
reports were asked for.

> I admit that the idea of having
> pgbackrest verify checksums while backing up seems like it could
> greatly improve the chances of checksums being useful, but I'm not
> going to endorse changing PostgreSQL's default for pgbackrest's
> benefit.  

I'm glad to hear that you generally endorse the idea of having a backup
tool verify checksums.  I'd love it if all of them did and I'm not going
to apologize for, as far as I'm aware, being the first to even make an
effort in that direction.

> It's got to be to the benefit of PostgreSQL users broadly,
> not just the subset of those people who use one particular backup
> tool.  

Hopefully, other backup solutions will add similar capability, perhaps
someone will also write an independent tool, and eventually those will
get out in released versions.  Maybe PG will grow a tool to check
checksums too.  But I can't make other tool authors implement it, nor
can I make other committers work on it, and while I'm doing what I can,
as I'm sure you understand, we all have a lot of different hats.

> Also, the massive hit that will probably occur on
> high-concurrency OLTP workloads larger than shared_buffers is going to
> be hard to justify for any amount of backup security.  I think that
> problem's got to be solved or at least mitigated before we think about
> changing this.  I realize that not everyone would set the bar that
> high, but I see far too many customers with exactly that workload to
> dismiss it lightly.

I have a sneaking suspicion that the customers you get directly
involved with tend to be at a different level than the majority of PG
users out in the wild (I can't say that it's really any different for
me).  I don't think that's a bad thing, but I do think users at all
levels deserve consideration and not just those running close to the
limits of their gear.


