On Thu, Jul 7, 2016 at 11:45 AM, Pete Stevenson <etep.nosnev...@gmail.com> wrote:
> Hi postgresql hackers -
>
> I would like to find some analysis (published work, blog posts) on the
> overheads affiliated with the guarantees provided by MVCC isolation. More
> specifically, assuming the current workload is CPU bound (as opposed to IO)
> what is the CPU overhead of generating the WAL, the overhead of version
> checking and version creation, and of garbage collecting old and unnecessary
> versions? For what it’s worth, I am working on a research project where it is
> envisioned that some of this work can be offloaded.

That's going to be hard to measure. First, what you didn't say is,
'with respect to what?'. You mention WAL, for example. WAL is more of
a crash safety mechanism than anything, and it's not really fair to
include it in an analysis of 'MVCC overhead', or at least not
completely. One thing that MVCC *does* objectively cause is bloat,
although you can still get bloat without MVCC if you (for example)
delete rows or rewrite rows such that they can't fit in their old
slot.

MVCC definitely incurs some runtime overhead to check visibility, but
the amount of overhead is highly dependent on the specific workload.
Postgres 'hint bits' reduce the cost to near zero for many workloads,
but in other workloads they are expensive to maintain and cause a lot
of extra traffic. One nice feature of not having to worry about
visibility is that you can read data directly out of the index. We
have a workaround for that (the 'all visible' bit, which lets
index-only scans skip the heap), but again the amount of benefit from
that strategy is going to be very situation specific.

Stepping back, the overhead of MVCC in Postgres (and probably other
systems too) has been continually reduced over the years -- the really
nasty parts have been relegated to background cleanup processing. That
processing is pretty sequential, and the 'I/O bottleneck' is finally
getting solved on cheap storage, pushing things back into the CPU
space.

In summary, I think the future of MVCC and transactional systems is
very bright, and that of the data management systems that discard
transactional safety in order to get some short-term performance gains
is, uh, not so bright. Transactions are essential in systems where
data integrity matters.

merlin

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers