On Fri, Mar 8, 2013 at 5:46 PM, Josh Berkus <j...@agliodbs.com> wrote:
> After some examination of the systems involved, we concluded that the
> issue was the FreeBSD drivers for the new storage, which were unstable
> and had custom source patches.  However, without PostgreSQL checksums,
> we couldn't *prove* it wasn't PostgreSQL at fault.  It ended up taking
> weeks of testing, most of which was useless, to prove to them they had a
> driver problem so it could be fixed.  If Postgres had had checksums, we
> could have avoided wasting a couple weeks looking for non-existent
> PostgreSQL bugs.

How would Postgres checksums have proven that?

A checksum failure just means *something* has gone wrong. It could
still be Postgres that's done it. In fact, I would hazard a guess that
checksum failures will be the way most Postgres bugs are found at some
point.
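
To make that concrete, here is a minimal sketch in Python. It assumes
zlib.crc32 as a stand-in for whatever checksum the patch actually uses,
an 8192-byte page, and hypothetical helper names (write_page,
verify_page, flip_bit); none of this is taken from the patch itself. It
only illustrates that the same single-bit damage fails verification in
the same way no matter which layer caused it.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size, for illustration only


def write_page(payload: bytes):
    """Pad the payload to a full page and return (page, checksum)."""
    page = payload.ljust(PAGE_SIZE, b"\x00")
    return page, zlib.crc32(page)


def verify_page(page: bytes, stored_checksum: int) -> bool:
    """Return True if the page still matches the checksum stored for it."""
    return zlib.crc32(page) == stored_checksum


def flip_bit(page: bytes, offset: int) -> bytes:
    """Return a copy of the page with one bit flipped at the given byte."""
    damaged = bytearray(page)
    damaged[offset] ^= 0x01
    return bytes(damaged)


page, checksum = write_page(b"tuple data")

# Case 1 (hypothetical): the storage driver damages the page on disk.
damaged_by_storage = flip_bit(page, 100)

# Case 2 (hypothetical): a database bug scribbles on the buffer after
# the checksum was computed.
damaged_by_database = flip_bit(page, 100)

intact_ok = verify_page(page, checksum)
storage_ok = verify_page(damaged_by_storage, checksum)
database_ok = verify_page(damaged_by_database, checksum)

print(intact_ok, storage_ok, database_ok)
```

Both damaged pages fail verification identically, so the failure by
itself says nothing about which layer did the damage.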

> Also, I'm kinda embarrassed that, at this point, InnoDB has checksums and
> we don't.  :-(

As silly as it sounds, I think this is a valid argument. It's not just
InnoDB but Oracle, other databases, and even other storage software.

I think even if the patch doesn't get accepted this go-around, it'll be
in the next release. Either we'll think of solutions for some of the
performance bottlenecks, we'll iron out the transition so you can turn
it off and on freely, or we'll just realize that people are running
with the patch and life is ok even with these problems.

If I understand the performance issues right, the main problem is the
extra round trip to the WAL, which can require a sync. Is that
right? That seems like a deal breaker to me. I would think a 0-10%
I/O bandwidth or CPU bandwidth penalty would be acceptable, but an
extra rotational latency, even just on some transactions, would be a
real killer.


-- 
greg

