On 03/08/16 02:27, Robert Haas wrote:
Personally, I think that incremental surgery on our current heap
format to try to fix this is not going to get very far. If you look
at the history of this, 8.3 was a huge release for timely cleanup of
dead tuples. There was also significant progress in 8.4 as a result of
5da9da71c44f27ba48fdad08ef263bf70e43e689. As far as I can recall, we
then made no progress at all in 9.0 - 9.4. We made a very small
improvement in 9.5 with 94028691609f8e148bd4ce72c46163f018832a5b, but
that's pretty niche. In 9.6, we have "snapshot too old", which I'd
argue is potentially a large improvement, but it was big and invasive
and will no doubt pose code maintenance hazards in the years to come;
also, many people won't be able to use it or won't realize that they
should use it. I think it is likely that further incremental
improvements here will be quite hard to find, and the amount of effort
will be large relative to the amount of benefit. I think we need a
new storage format where the bloat is cleanly separated from the data
rather than intermingled with it; every other major RDBMS works that
way. Perhaps this is a case of "the grass is greener on the other
side of the fence", but I don't think so.

Yeah, I think this is a good summary of the state of play.

The only other new database development I know of that used a
non-overwriting design like ours was Jim Starkey's Falcon engine for
(ironically) MySQL 6.0. Not sure if anyone is still progressing that at
all now.

I do wonder if Uber could have successfully tamed dead tuple bloat with
aggressive per-table autovacuum settings (and if in fact they tried),
but as I think Robert said earlier, it is pretty easy to come up with an
update-heavy (or insert + delete) workload that makes for a pretty ugly
bloat component even with really aggressive autovacuuming.

Sent via pgsql-hackers mailing list (email@example.com)