Re: [HACKERS] MVCC overheads

2016-07-08 Thread Gavin Flower

Please see comment at the bottom of this post.

On 08/07/16 10:48, Pete Stevenson wrote:
Good info, thanks for the note. Agreed that it is difficult to pull 
things apart to isolate these features for offload — so actually 
running experiments with offload is not possible, as you point out 
(and for other reasons).


Maybe I could figure out the lines of code that add versions into a 
table and then those that collect old versions (they do get collected, 
right?). Anyway, thought being I could profile while running TPC-C or 
similar. I was hoping that someone might be able to jump on this with 
a response that they already did something similar. I know that 
Stonebraker has done some analysis along these lines, but I’m looking 
for an independent result that confirms (or not) his work.


Thank you,
Pete Stevenson


On Jul 7, 2016, at 3:43 PM, Simon Riggs wrote:


On 7 July 2016 at 20:50, Pete Stevenson wrote:


Hi Simon -

Thanks for the note. I think it's fair to say that I didn't
provide enough context, so let me try and elaborate on my question.

I agree, MVCC is a benefit. The research angle here is about
enabling MVCC with hardware offload. Since I didn't explicitly
mention it, the offload I refer to will respect all consistency
guarantees also.

It is the case that for the database to implement MVCC it must
provide consistent read to multiple different versions of data,
i.e. depending on the version used at transaction start. I'm not
an expert on postgresql internals, but this must have some cost.
I think the cost related to MVCC guarantees can roughly be
categorized as: creating new versions (linking them in), version
checking on read, garbage collecting old versions, and then there
is an additional cost that I am interested in (again not claiming
it is unnecessary in any sense) but there is a cost to generating
the log.

Thanks, by the way, for the warning about lab vs. reality. That's
why I'm asking this question here. I want to keep the
hypothetical tagged as such, but find defensible and realistic
metrics where those exist, i.e. in this instance, we do have a
database that can use MVCC. It should be possible to figure out
how much work goes into maintaining that property.


PostgreSQL uses a no overwrite storage mechanism, so any additional 
row versions are maintained in the same table alongside other rows. 
The MVCC actions are mostly mixed in with other aspects of the 
storage, so not isolated much for offload.


Oracle has a different mechanism that does isolate changed row 
versions into a separate data structure, so would be much more 
amenable to offload than PostgreSQL, in its current form.


Maybe look at SLRUs (clog etc) as a place to offload something?

--
Simon Riggs http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


In this list, the convention is to post replies at the end (with some 
rare exceptions), or interspersed when appropriate, and to omit parts no 
longer relevant.


The motivation for bottom-posting like this is that people get to see 
the context before the reply, AND emails don't keep getting longer & 
longer as people reply at the top and forget to trim the now-irrelevant 
material at the end.



Cheers,
Gavin




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Alvaro Herrera
Peter Geoghegan wrote:

> Has anyone ever done any kind of write-up of the "TED" design that was
> discussed during FOSDEM (I hope I recall the name it was given
> correctly)? Apparently that's something that's been discussed a few
> times among senior community members, and I think it has promise.

https://www.postgresql.org/message-id/55511d1f.7050...@iki.fi

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Merlin Moncure
On Thu, Jul 7, 2016 at 11:45 AM, Pete Stevenson wrote:
> Hi postgresql hackers -
>
> I would like to find some analysis (published work, blog posts) on the 
> overheads affiliated with the guarantees provided by MVCC isolation. More 
> specifically, assuming the current workload is CPU bound (as opposed to IO) 
> what is the CPU overhead of generating the WAL, the overhead of version 
> checking and version creation, and of garbage collecting old and unnecessary 
> versions? For what it’s worth, I am working on a research project where it is 
> envisioned that some of this work can be offloaded.

That's going to be hard to measure.  First, you haven't said: overhead
with respect to what?  You mention WAL, for example.  WAL is more of a
crash-safety mechanism than anything else, and it's not really fair to
include it in an analysis of 'MVCC overhead', or at least not
completely.  One thing that MVCC *does* objectively cause is bloat,
although you can still get bloat without MVCC if you (for example)
delete rows or rewrite rows such that they can't fit in their old
slot.

MVCC definitely incurs some runtime overhead to check visibility, but
the amount of overhead is highly dependent on the specific workload.
Postgres 'hint bits' reduce the cost to near zero for many workloads,
but in other workloads they are expensive to maintain and cause a lot
of extra write traffic.  One nice feature of not having to worry about
visibility would be the ability to read data directly out of the
index.  We have some workarounds to deal with that (the 'all visible'
bit), but again the amount of benefit from that strategy is very
situation-specific.
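
To make that per-tuple check concrete, here is a minimal, self-contained
C sketch of the hint-bit idea. The structs, flag names, and helper below
are invented for illustration and are not the real PostgreSQL definitions
(the real logic is HeapTupleSatisfiesMVCC(), in tqual.c at the time of
this thread, and handles many more cases: in-progress transactions,
subtransactions, combo CIDs, and so on). The point is simply that an
unhinted tuple forces a commit-log lookup, while a hinted one needs only
a couple of bit tests:

/* Minimal, self-contained illustration of hint bits -- NOT the real
 * PostgreSQL code.  The real logic is HeapTupleSatisfiesMVCC() (in
 * tqual.c at the time of this thread) and handles many more cases. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/* Invented hint-bit flags, loosely modeled on the real infomask bits. */
#define HINT_XMIN_COMMITTED 0x01
#define HINT_XMIN_ABORTED   0x02
#define HINT_XMAX_COMMITTED 0x04
#define HINT_XMAX_ABORTED   0x08

typedef struct {
    TransactionId xmin;   /* inserting transaction */
    TransactionId xmax;   /* deleting transaction, 0 if none */
    uint8_t       hints;  /* cached commit/abort status */
} Tuple;

typedef struct {
    TransactionId xmax;   /* xids >= xmax are not yet visible */
} Snapshot;

/* Stand-in for the commit-log (clog) lookup that hint bits let us skip. */
static bool xid_committed_slow(TransactionId xid) {
    printf("  (expensive commit-log lookup for xid %u)\n", (unsigned) xid);
    return true;                       /* pretend every xid committed */
}

static bool tuple_visible(Tuple *t, const Snapshot *snap) {
    /* The inserter must have committed... */
    if (!(t->hints & HINT_XMIN_COMMITTED)) {
        if (t->hints & HINT_XMIN_ABORTED)
            return false;
        if (!xid_committed_slow(t->xmin)) {
            t->hints |= HINT_XMIN_ABORTED;      /* cache the answer */
            return false;
        }
        t->hints |= HINT_XMIN_COMMITTED;        /* cache the answer */
    }
    /* ...and be old enough for our snapshot. */
    if (t->xmin >= snap->xmax)
        return false;

    if (t->xmax == 0)
        return true;                            /* never deleted */
    if (!(t->hints & HINT_XMAX_COMMITTED)) {
        if (t->hints & HINT_XMAX_ABORTED)
            return true;                        /* deleter rolled back */
        if (!xid_committed_slow(t->xmax)) {
            t->hints |= HINT_XMAX_ABORTED;
            return true;
        }
        t->hints |= HINT_XMAX_COMMITTED;
    }
    /* Deleted at or after our snapshot horizon => still visible to us. */
    return t->xmax >= snap->xmax;
}

int main(void) {
    Snapshot snap = { .xmax = 100 };
    Tuple    t    = { .xmin = 42, .xmax = 0, .hints = 0 };
    printf("first check:  %svisible\n", tuple_visible(&t, &snap) ? "" : "not ");
    printf("second check: %svisible\n", tuple_visible(&t, &snap) ? "" : "not ");
    return 0;
}

The second call never touches the commit log: the cached hint answers it.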

Stepping back, the overhead of MVCC in postgres (and probably other
systems too) has been continually reduced over the years -- the really
nasty parts have been relegated to background cleanup processing.
That processing is fairly sequential, and with the I/O bottleneck
finally getting solved by cheap fast storage, the cost is being pushed
back into CPU space.

In summary, I think the future of MVCC and transactional systems is
very bright, and that of data management systems that discard
transactional safety in order to get short-term performance gains
is, uh, not so bright.  Transactions are essential in systems where
data integrity matters.

merlin




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Peter Geoghegan
On Fri, Jul 8, 2016 at 11:44 AM, Tom Lane  wrote:
>> Sure, but we could *also* do it separately, splitting VACUUM's tasks of
>> tuple freezing, page compaction, and index entry removal each into
>> separate tasks.
>
> Uh ... wouldn't that tend to make things worse?  The knocks on VACUUM are
> too much I/O and too much latency for cleanup, and I can't see how
> splitting it does anything good on either score.

Has anyone ever done any kind of write-up of the "TED" design that was
discussed during FOSDEM (I hope I recall the name it was given
correctly)? Apparently that's something that's been discussed a few
times among senior community members, and I think it has promise.

-- 
Peter Geoghegan




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Kevin Grittner
On Thu, Jul 7, 2016 at 11:45 AM, Pete Stevenson wrote:

> I would like to find some analysis (published work, blog posts)
> on the overheads affiliated with the guarantees provided by MVCC
> isolation.

There are three levels of isolation implemented[1]; the incremental
cost of SERIALIZABLE isolation over REPEATABLE READ for several
standard benchmarking loads is available in section 8 of a paper
presented at a VLDB conference[2].

Hopefully that helps some.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1]  PostgreSQL current online documentation.  Transaction Isolation.
 https://www.postgresql.org/docs/current/static/transaction-iso.html

[2]  Dan R. K. Ports and Kevin Grittner. 2012.
 Serializable Snapshot Isolation in PostgreSQL.
 Proceedings of the VLDB Endowment, Vol. 5, No. 12.
 The 38th International Conference on Very Large Data Bases,
 August 27th - 31st 2012, Istanbul, Turkey.
 http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> VACUUM in itself is an offloading optimization; the whole point of it
>> is to do maintenance in a background process not foreground queries.

> Well, if VACUUM worked so great, we wouldn't get so many trouble reports
> with it.  There's substantial improvement we could make in that area.

Sure, and we've been chipping away at that problem over time, including
some significant improvement in 9.6.  My point is just that it's a good
idea to understand VACUUM as being some pre-existing work that's related
to this offloading idea.

> Sure, but we could *also* do it separately, splitting VACUUM's tasks of
> tuple freezing, page compaction, and index entry removal each into
> separate tasks.

Uh ... wouldn't that tend to make things worse?  The knocks on VACUUM are
too much I/O and too much latency for cleanup, and I can't see how
splitting it does anything good on either score.

regards, tom lane




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Pete Stevenson wrote:
> >> Maybe I could figure out the lines of code that add versions into a
> >> table and then those that collect old versions (they do get collected,
> >> right?). Anyway, thought being I could profile while running TPC-C or
> >> similar. I was hoping that someone might be able to jump on this with
> >> a response that they already did something similar.
> 
> > Old tuple versions are "collected" (removed) by either vacuum (see
> > vacuumlazy.c) or heap_page_prune.  The latter is one thing that could
> > perhaps somehow be offloaded, as it's quite independent from the other
> > stuff.  You can prune removable tuples at no additional cost from an
> > unlocked dirty page, which is a useful optimization because then
> > client-connected backends don't need to prune them later.
> 
> VACUUM in itself is an offloading optimization; the whole point of it
> is to do maintenance in a background process not foreground queries.

Well, if VACUUM worked so great, we wouldn't get so many trouble reports
with it.  There's substantial improvement we could make in that area.

> AFAIR, heap_page_prune is just a small subset of VACUUM work that
> we decided we could afford to do in foreground.

Sure, but we could *also* do it separately, splitting VACUUM's tasks of
tuple freezing, page compaction, and index entry removal each into
separate tasks.

Currently, vacuuming a 4TB table can take weeks; meanwhile dead tuples
accumulate in already-scanned pages, leading to further bloat, to Xid
wraparound danger later on, and to emergency vacuuming that leaves
applications blocked on DDL.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Tom Lane
Alvaro Herrera  writes:
> Pete Stevenson wrote:
>> Maybe I could figure out the lines of code that add versions into a
>> table and then those that collect old versions (they do get collected,
>> right?). Anyway, thought being I could profile while running TPC-C or
>> similar. I was hoping that someone might be able to jump on this with
>> a response that they already did something similar.

> Old tuple versions are "collected" (removed) by either vacuum (see
> vacuumlazy.c) or heap_page_prune.  The latter is one thing that could
> perhaps somehow be offloaded, as it's quite independent from the other
> stuff.  You can prune removable tuples at no additional cost from an
> unlocked dirty page, which is a useful optimization because then
> client-connected backends don't need to prune them later.

VACUUM in itself is an offloading optimization; the whole point of it
is to do maintenance in a background process not foreground queries.
AFAIR, heap_page_prune is just a small subset of VACUUM work that
we decided we could afford to do in foreground.

regards, tom lane




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Alvaro Herrera
Pete Stevenson wrote:

> Maybe I could figure out the lines of code that add versions into a
> table and then those that collect old versions (they do get collected,
> right?). Anyway, thought being I could profile while running TPC-C or
> similar. I was hoping that someone might be able to jump on this with
> a response that they already did something similar.

Old tuple versions are "collected" (removed) by either vacuum (see
vacuumlazy.c) or heap_page_prune.  The latter is one thing that could
perhaps somehow be offloaded, as it's quite independent from the other
stuff.  You can prune removable tuples at no additional cost from an
unlocked dirty page, which is a useful optimization because then
client-connected backends don't need to prune them later.
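
As a rough mental model of what pruning buys, here is a hedged sketch only,
not the real heap_page_prune() from pruneheap.c (which also deals with HOT
chains, redirect line pointers, buffer locking, and WAL): walk the tuples on
a page and reclaim any version whose deleting transaction is older than the
oldest snapshot anyone could still be using. The structure and function
names below are invented for illustration, and commit status is assumed
rather than checked:

/* Illustrative sketch only -- not the real heap_page_prune() from
 * pruneheap.c.  Assumes every xid seen here has committed. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

typedef struct {
    TransactionId xmin;
    TransactionId xmax;   /* 0 = never deleted */
    bool          dead;   /* slot reclaimed by pruning */
} TupleSlot;

typedef struct {
    TupleSlot slots[8];
    int       ntuples;
} Page;

/* A superseded version can be pruned once no snapshot can still see it,
 * i.e. its deleter is older than the oldest transaction still running. */
static bool prunable(const TupleSlot *t, TransactionId oldest_xmin) {
    return t->xmax != 0 && t->xmax < oldest_xmin;
}

static int page_prune(Page *page, TransactionId oldest_xmin) {
    int reclaimed = 0;
    for (int i = 0; i < page->ntuples; i++) {
        TupleSlot *t = &page->slots[i];
        if (!t->dead && prunable(t, oldest_xmin)) {
            t->dead = true;           /* space becomes reusable on this page */
            reclaimed++;
        }
    }
    return reclaimed;
}

int main(void) {
    /* Three versions of one row: two superseded, one live. */
    Page page = {
        .slots   = { {10, 20, false}, {20, 30, false}, {30, 0, false} },
        .ntuples = 3
    };
    int n = page_prune(&page, 25);    /* oldest running xid is 25 */
    printf("pruned %d of %d tuple versions\n", n, page.ntuples);
    return 0;
}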

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] MVCC overheads

2016-07-08 Thread Pete Stevenson
Good info, thanks for the note. Agreed that it is difficult to pull things 
apart to isolate these features for offload — so actually running experiments 
with offload is not possible, as you point out (and for other reasons).

Maybe I could figure out the lines of code that add versions into a table and 
then those that collect old versions (they do get collected, right?). Anyway, 
thought being I could profile while running TPC-C or similar. I was hoping that 
someone might be able to jump on this with a response that they already did 
something similar. I know that Stonebraker has done some analysis along these 
lines, but I’m looking for an independent result that confirms (or not) his 
work.

Thank you,
Pete Stevenson


> On Jul 7, 2016, at 3:43 PM, Simon Riggs  wrote:
> 
> On 7 July 2016 at 20:50, Pete Stevenson wrote:
> Hi Simon -
> 
> Thanks for the note. I think it's fair to say that I didn't provide enough 
> context, so let me try and elaborate on my question.
> 
> I agree, MVCC is a benefit. The research angle here is about enabling MVCC 
> with hardware offload. Since I didn't explicitly mention it, the offload I 
> refer to will respect all consistency guarantees also.
> 
> It is the case that for the database to implement MVCC it must provide 
> consistent read to multiple different versions of data, i.e. depending on the 
> version used at transaction start. I'm not an expert on postgresql internals, 
> but this must have some cost. I think the cost related to MVCC guarantees can 
> roughly be categorized as: creating new versions (linking them in), version 
> checking on read, garbage collecting old versions, and then there is an 
> additional cost that I am interested in (again not claiming it is unnecessary 
> in any sense) but there is a cost to generating the log.
> 
> Thanks, by the way, for the warning about lab vs. reality. That's why I'm 
> asking this question here. I want to keep the hypothetical tagged as such, 
> but find defensible and realistic metrics where those exist, i.e. in this 
> instance, we do have a database that can use MVCC. It should be possible to 
> figure out how much work goes into maintaining that property.
> 
> PostgreSQL uses a no overwrite storage mechanism, so any additional row 
> versions are maintained in the same table alongside other rows. The MVCC 
> actions are mostly mixed in with other aspects of the storage, so not 
> isolated much for offload.
> 
> Oracle has a different mechanism that does isolate changed row versions into 
> a separate data structure, so would be much more amenable to offload than 
> PostgreSQL, in its current form.
> 
> Maybe look at SLRUs (clog etc) as a place to offload something?
> 
> -- 
> Simon Riggs    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MVCC overheads

2016-07-08 Thread Craig Ringer
On 8 July 2016 at 03:50, Pete Stevenson  wrote:

> Hi Simon -
>
> Thanks for the note. I think it's fair to say that I didn't provide enough
> context, so let me try and elaborate on my question.
>

Please reply in-line in posts to make it easier to follow conversations
with multiple people.


> It is the case that for the database to implement MVCC it must provide
> consistent read to multiple different versions of data, i.e. depending on
> the version used at transaction start.
>

Not necessarily transaction start; for REPEATABLE READ isolation, statement
start is sufficient, or even weaker than that.


> I'm not an expert on postgresql internals, but this must have some cost.
>

Sure it does. Disk space, efficiency of use of RAM for disk cache, CPU cost
of scanning over not-visible tuples, etc.


> I think the cost related to MVCC guarantees can roughly be categorized as:
> creating new versions (linking them in)
>

The way PostgreSQL does that (read the manual) is pretty lightweight. You
will have already found the old tuple so setting its xmax is cheap. Writing
the new tuple costs much the same as an insert.
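
In sketch form (made-up structures, not the real heap_update() from
heapam.c, which also handles HOT updates, TOAST, locking, and WAL), an MVCC
update is just: stamp the old version's xmax with the updating transaction's
xid, and append a new version whose xmin is that same xid:

/* Toy illustration of "update = stamp old version's xmax + insert new
 * version"; the real path is heap_update() in heapam.c. */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

typedef struct {
    TransactionId xmin;      /* creating transaction */
    TransactionId xmax;      /* updating/deleting transaction, 0 if live */
    char          data[16];
} Version;

static Version table[16];
static int     nversions;

static int insert_version(TransactionId xid, const char *data) {
    Version *v = &table[nversions];
    v->xmin = xid;
    v->xmax = 0;
    snprintf(v->data, sizeof v->data, "%s", data);
    return nversions++;
}

/* The old tuple is already at hand, so stamping its xmax is one store;
 * the new version costs roughly the same as a fresh insert. */
static int update_version(int old, TransactionId xid, const char *data) {
    table[old].xmax = xid;             /* old version ends at xid */
    return insert_version(xid, data);  /* new version begins at xid */
}

int main(void) {
    int v1 = insert_version(100, "alice");
    int v2 = update_version(v1, 105, "alicia");
    printf("v1: xmin=%u xmax=%u data=%s\n",
           (unsigned) table[v1].xmin, (unsigned) table[v1].xmax, table[v1].data);
    printf("v2: xmin=%u xmax=%u data=%s\n",
           (unsigned) table[v2].xmin, (unsigned) table[v2].xmax, table[v2].data);
    return 0;
}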


> version checking on read
>

Yep -- in particular for index scans, because PostgreSQL doesn't maintain
visibility information in indexes. Read up on PostgreSQL's MVCC
implementation, index scans, index-only scans, the visibility map, etc.
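
The index-only-scan workaround works roughly like the hedged sketch below;
the names are hypothetical, while in PostgreSQL the corresponding logic is
the visibility-map test done by the index-only scan node (see IndexOnlyNext()
and VM_ALL_VISIBLE()). If the heap page's all-visible bit is set, the index
entry can be returned without touching the heap at all; otherwise a heap
fetch and a per-tuple visibility check are unavoidable:

/* Hedged sketch of the index-only-scan decision; hypothetical names, not
 * the real executor code. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool all_visible;   /* visibility-map bit for this heap page */
} HeapPage;

/* Stand-in for the expensive part an index-only scan tries to avoid. */
static bool fetch_heap_tuple_and_check_visibility(int page, int offset) {
    printf("  heap fetch for (%d,%d) plus per-tuple visibility check\n",
           page, offset);
    return true;
}

static bool index_only_fetch(const HeapPage *pages, int page, int offset) {
    if (pages[page].all_visible) {
        /* Every tuple on the page is visible to every snapshot, so the
         * index entry alone is enough -- no heap access at all. */
        return true;
    }
    return fetch_heap_tuple_and_check_visibility(page, offset);
}

int main(void) {
    HeapPage pages[2] = { { .all_visible = true }, { .all_visible = false } };
    index_only_fetch(pages, 0, 1);    /* cheap: skips the heap */
    index_only_fetch(pages, 1, 3);    /* pays for the heap visit */
    return 0;
}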


> garbage collecting old versions
>

As implemented in PostgreSQL by VACUUM.


> and then there is an additional cost that I am interested in (again not
> claiming it is unnecessary in any sense) but there is a cost to generating
> the log.
>

The write-ahead log is orthogonal to MVCC. You can have MVCC without WAL
(or other write durability). You can have write durability without MVCC.
The two are almost entirely unrelated.


> Thanks, by the way, for the warning about lab vs. reality. That's why I'm
> asking this question here. I want to keep the hypothetical tagged as such,
> but find defensible and realistic metrics where those exist, i.e. in this
> instance, we do have a database that can use MVCC. It should be possible to
> figure out how much work goes into maintaining that property.
>

MVCC logic is woven deeply throughout PostgreSQL. I'm not sure how you'd
even begin to offload it in any meaningful way, nor if it'd be useful to do
so. Presumably you're thinking of some way to tell the storage layer "show
me the table as if it has only rows visible to [this xact]" so Pg doesn't
have to do any checking at all. But it's not always that simple. See:

- Logical decoding (time travel)
- VACUUM
- EvalPlanQual, re-checks of updated rows
- ...


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] MVCC overheads

2016-07-07 Thread Simon Riggs
On 7 July 2016 at 20:50, Pete Stevenson  wrote:

> Hi Simon -
>
> Thanks for the note. I think it's fair to say that I didn't provide enough
> context, so let me try and elaborate on my question.
>
> I agree, MVCC is a benefit. The research angle here is about enabling MVCC
> with hardware offload. Since I didn't explicitly mention it, the offload I
> refer to will respect all consistency guarantees also.
>
> It is the case that for the database to implement MVCC it must provide
> consistent read to multiple different versions of data, i.e. depending on
> the version used at transaction start. I'm not an expert on postgresql
> internals, but this must have some cost. I think the cost related to MVCC
> guarantees can roughly be categorized as: creating new versions (linking
> them in), version checking on read, garbage collecting old versions, and
> then there is an additional cost that I am interested in (again not
> claiming it is unnecessary in any sense) but there is a cost to generating
> the log.
>
> Thanks, by the way, for the warning about lab vs. reality. That's why I'm
> asking this question here. I want to keep the hypothetical tagged as such,
> but find defensible and realistic metrics where those exist, i.e. in this
> instance, we do have a database that can use MVCC. It should be possible to
> figure out how much work goes into maintaining that property.
>

PostgreSQL uses a no overwrite storage mechanism, so any additional row
versions are maintained in the same table alongside other rows. The MVCC
actions are mostly mixed in with other aspects of the storage, so not
isolated much for offload.

Oracle has a different mechanism that does isolate changed row versions
into a separate data structure, so would be much more amenable to offload
than PostgreSQL, in its current form.

Maybe look at SLRUs (clog etc) as a place to offload something?
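
For anyone who hasn't looked at that code: clog stores two status bits per
transaction id, located by splitting the xid into a page number and an
offset within the page. Below is a hedged, self-contained C sketch of just
that addressing scheme; the real code is TransactionIdGetStatus() in clog.c
on top of the slru.c buffer machinery, and the constants and names here are
simplified (one in-memory page stands in for the whole buffer pool):

/* Simplified picture of a clog-style lookup: 2 status bits per xid, packed
 * into fixed-size pages.  Real code: TransactionIdGetStatus() in clog.c. */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

#define CLOG_BITS_PER_XACT   2
#define CLOG_XACTS_PER_BYTE  4
#define CLOG_PAGE_SIZE       8192
#define CLOG_XACTS_PER_PAGE  (CLOG_PAGE_SIZE * CLOG_XACTS_PER_BYTE)

#define XACT_IN_PROGRESS 0x00
#define XACT_COMMITTED   0x01
#define XACT_ABORTED     0x02

/* One in-memory page standing in for the SLRU buffer pool. */
static uint8_t clog_page[CLOG_PAGE_SIZE];

static void clog_set_status(TransactionId xid, uint8_t status) {
    uint32_t byteno = (xid % CLOG_XACTS_PER_PAGE) / CLOG_XACTS_PER_BYTE;
    uint32_t shift  = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;
    clog_page[byteno] &= (uint8_t) ~(0x03 << shift);
    clog_page[byteno] |= (uint8_t) (status << shift);
}

static uint8_t clog_get_status(TransactionId xid) {
    uint32_t byteno = (xid % CLOG_XACTS_PER_PAGE) / CLOG_XACTS_PER_BYTE;
    uint32_t shift  = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;
    return (clog_page[byteno] >> shift) & 0x03;
}

int main(void) {
    clog_set_status(1234, XACT_COMMITTED);
    clog_set_status(1235, XACT_ABORTED);
    printf("xid 1234: %s\n",
           clog_get_status(1234) == XACT_COMMITTED ? "committed" : "other");
    printf("xid 1235: %s\n",
           clog_get_status(1235) == XACT_ABORTED ? "aborted" : "other");
    return 0;
}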

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] MVCC overheads

2016-07-07 Thread Pete Stevenson
Hi Simon -

Thanks for the note. I think it's fair to say that I didn't provide enough
context, so let me try and elaborate on my question.

I agree, MVCC is a benefit. The research angle here is about enabling MVCC
with hardware offload. Since I didn't explicitly mention it, the offload I
refer to will respect all consistency guarantees also.

It is the case that for the database to implement MVCC it must provide
consistent read to multiple different versions of data, i.e. depending on
the version used at transaction start. I'm not an expert on postgresql
internals, but this must have some cost. I think the cost related to MVCC
guarantees can roughly be categorized as: creating new versions (linking
them in), version checking on read, garbage collecting old versions, and
then there is an additional cost that I am interested in (again not
claiming it is unnecessary in any sense) but there is a cost to generating
the log.

Thanks, by the way, for the warning about lab vs. reality. That's why I'm
asking this question here. I want to keep the hypothetical tagged as such,
but find defensible and realistic metrics where those exist, i.e. in this
instance, we do have a database that can use MVCC. It should be possible to
figure out how much work goes into maintaining that property.

Thank you,
Pete



On Thu, Jul 7, 2016 at 11:10 AM, Simon Riggs  wrote:

> On 7 July 2016 at 17:45, Pete Stevenson  wrote:
>
>> Hi postgresql hackers -
>>
>> I would like to find some analysis (published work, blog posts) on the
>> overheads affiliated with the guarantees provided by MVCC isolation. More
>> specifically, assuming the current workload is CPU bound (as opposed to IO)
>> what is the CPU overhead of generating the WAL, the overhead of version
>> checking and version creation, and of garbage collecting old and
>> unnecessary versions? For what it’s worth, I am working on a research
>> project where it is envisioned that some of this work can be offloaded.
>>
>
> MVCC is a benefit, not an overhead. To understand that you should compare
> MVCC with a system that performs S2PL.
>
> If you're thinking that somehow consistency isn't important, I'd hope that
> you also consider some way to evaluate the costs associated with
> inconsistent and incorrect results in applications, or other architectural
> restrictions imposed to make that possible. It's easy to make assumptions
> in the lab that don't work in the real world.
>
> --
> Simon Riggs    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] MVCC overheads

2016-07-07 Thread Simon Riggs
On 7 July 2016 at 17:45, Pete Stevenson  wrote:

> Hi postgresql hackers -
>
> I would like to find some analysis (published work, blog posts) on the
> overheads affiliated with the guarantees provided by MVCC isolation. More
> specifically, assuming the current workload is CPU bound (as opposed to IO)
> what is the CPU overhead of generating the WAL, the overhead of version
> checking and version creation, and of garbage collecting old and
> unnecessary versions? For what it’s worth, I am working on a research
> project where it is envisioned that some of this work can be offloaded.
>

MVCC is a benefit, not an overhead. To understand that you should compare
MVCC with a system that performs S2PL.

If you're thinking that somehow consistency isn't important, I'd hope that
you also consider some way to evaluate the costs associated with
inconsistent and incorrect results in applications, or other architectural
restrictions imposed to make that possible. It's easy to make assumptions
in the lab that don't work in the real world.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


[HACKERS] MVCC overheads

2016-07-07 Thread Pete Stevenson
Hi postgresql hackers -

I would like to find some analysis (published work, blog posts) on the 
overheads affiliated with the guarantees provided by MVCC isolation. More 
specifically, assuming the current workload is CPU bound (as opposed to IO) 
what is the CPU overhead of generating the WAL, the overhead of version 
checking and version creation, and of garbage collecting old and unnecessary 
versions? For what it’s worth, I am working on a research project where it is 
envisioned that some of this work can be offloaded.

Thank you,
Pete Stevenson


