Re: [HACKERS] MVCC overheads

2016-07-08 Thread Pete Stevenson
Good info, thanks for the note. Agreed that it is difficult to pull things
apart to isolate these features for offload, so actually running experiments
with offload is not possible, as you point out (and for other reasons).

Maybe I could figure out the lines of code that add versions into a table and
then those that collect old versions (they do get collected, right?). Anyway,
the thought is that I could profile while running TPC-C or similar. I was
hoping that someone might be able to jump in with a response that they have
already done something similar. I know that Stonebraker has done some analysis
along these lines, but I'm looking for an independent result that confirms (or
not) his work.

Thank you,
Pete Stevenson


> On Jul 7, 2016, at 3:43 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> 
> On 7 July 2016 at 20:50, Pete Stevenson <etep.nosnev...@gmail.com> wrote:
>> Hi Simon -
>>
>> Thanks for the note. I think it's fair to say that I didn't provide enough
>> context, so let me try and elaborate on my question.
>>
>> I agree, MVCC is a benefit. The research angle here is about enabling MVCC
>> with hardware offload. Since I didn't explicitly mention it, the offload I
>> refer to will respect all consistency guarantees also.
>>
>> It is the case that for the database to implement MVCC it must provide
>> consistent read to multiple different versions of data, i.e. depending on the
>> version used at transaction start. I'm not an expert on postgresql internals,
>> but this must have some cost. I think the cost related to MVCC guarantees can
>> roughly be categorized as: creating new versions (linking them in), version
>> checking on read, garbage collecting old versions, and then there is an
>> additional cost that I am interested in (again not claiming it is unnecessary
>> in any sense) but there is a cost to generating the log.
>>
>> Thanks, by the way, for the warning about lab vs. reality. That's why I'm
>> asking this question here. I want to keep the hypothetical tagged as such,
>> but find defensible and realistic metrics where those exist, i.e. in this
>> instance, we do have a database that can use MVCC. It should be possible to
>> figure out how much work goes into maintaining that property.
> 
> PostgreSQL uses a no overwrite storage mechanism, so any additional row 
> versions are maintained in the same table alongside other rows. The MVCC 
> actions are mostly mixed in with other aspects of the storage, so not 
> isolated much for offload.
> 
> Oracle has a different mechanism that does isolate changed row versions into 
> a separate data structure, so would be much more amenable to offload than 
> PostgreSQL, in its current form.
> 
> Maybe look at SLRUs (clog etc) as a place to offload something?
> 
> -- 
> Simon Riggs    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MVCC overheads

2016-07-07 Thread Pete Stevenson
Hi Simon -

Thanks for the note. I think it's fair to say that I didn't provide enough
context, so let me try and elaborate on my question.

I agree, MVCC is a benefit. The research angle here is about enabling MVCC
with hardware offload. Since I didn't explicitly mention it, the offload I
refer to will respect all consistency guarantees also.

It is the case that for the database to implement MVCC it must provide
consistent read to multiple different versions of data, i.e. depending on
the version used at transaction start. I'm not an expert on postgresql
internals, but this must have some cost. I think the cost related to MVCC
guarantees can roughly be categorized as: creating new versions (linking
them in), version checking on read, garbage collecting old versions, and
then there is an additional cost that I am interested in (again not
claiming it is unnecessary in any sense) but there is a cost to generating
the log.
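
To make the four cost categories above concrete, here is a toy sketch in
Python. It is not how PostgreSQL implements any of this. The names ToyMVCC,
Version, write, read and vacuum are hypothetical, the xmin/xmax field names
are only loosely borrowed from PostgreSQL's tuple header, and the model
assumes that every write commits immediately in its own transaction and that
a snapshot is simply the highest transaction id a reader is allowed to see.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Version:
    value: str
    xmin: int                   # id of the transaction that created this version
    xmax: Optional[int] = None  # id of the transaction that superseded it, if any


@dataclass
class ToyMVCC:
    """Hypothetical in-memory model; every write is its own committed transaction."""
    rows: Dict[str, List[Version]] = field(default_factory=dict)
    log: List[Tuple[int, str, str]] = field(default_factory=list)
    next_xid: int = 1

    def write(self, key: str, value: str) -> int:
        xid = self.next_xid
        self.next_xid += 1
        chain = self.rows.setdefault(key, [])
        if chain:
            chain[-1].xmax = xid                 # cost 1: link in a new version,
        chain.append(Version(value, xmin=xid))   #         old one stays in place
        self.log.append((xid, key, value))       # cost 4: generate a log record
        return xid

    def read(self, key: str, snapshot: int) -> Optional[str]:
        # cost 2: version check on every read. A version is visible if it was
        # created at or before the snapshot and not yet superseded at that point.
        for v in self.rows.get(key, []):
            if v.xmin <= snapshot and (v.xmax is None or v.xmax > snapshot):
                return v.value
        return None

    def vacuum(self, oldest_snapshot: int) -> int:
        # cost 3: garbage-collect versions that no snapshot at or after
        # oldest_snapshot can still see. Returns the number of versions removed.
        removed = 0
        for key, chain in list(self.rows.items()):
            keep = [v for v in chain if v.xmax is None or v.xmax > oldest_snapshot]
            removed += len(chain) - len(keep)
            self.rows[key] = keep
        return removed
```

In a model like this, the quantities I am asking about would correspond to the
time spent in write (version creation plus log append), in the loop inside
read, and in vacuum, measured separately under a CPU-bound workload.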

Thanks, by the way, for the warning about lab vs. reality. That's why I'm
asking this question here. I want to keep the hypothetical tagged as such,
but find defensible and realistic metrics where those exist, i.e. in this
instance, we do have a database that can use MVCC. It should be possible to
figure out how much work goes into maintaining that property.

Thank you,
Pete



On Thu, Jul 7, 2016 at 11:10 AM, Simon Riggs <si...@2ndquadrant.com> wrote:

> On 7 July 2016 at 17:45, Pete Stevenson <etep.nosnev...@gmail.com> wrote:
>
>> Hi postgresql hackers -
>>
>> I would like to find some analysis (published work, blog posts) on the
>> overheads affiliated with the guarantees provided by MVCC isolation. More
>> specifically, assuming the current workload is CPU bound (as opposed to IO)
>> what is the CPU overhead of generating the WAL, the overhead of version
>> checking and version creation, and of garbage collecting old and
>> unnecessary versions? For what it’s worth, I am working on a research
>> project where it is envisioned that some of this work can be offloaded.
>>
>
> MVCC is a benefit, not an overhead. To understand that you should compare
> MVCC with a system that performs S2PL.
>
> If you're thinking that somehow consistency isn't important, I'd hope that
> you also consider some way to evaluate the costs associated with
> inconsistent and incorrect results in applications, or other architectural
> restrictions imposed to make that possible. It's easy to make assumptions
> in the lab that don't work in the real world.
>
> --
> Simon Riggs    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


[HACKERS] MVCC overheads

2016-07-07 Thread Pete Stevenson
Hi postgresql hackers -

I would like to find some analysis (published work, blog posts) on the 
overheads affiliated with the guarantees provided by MVCC isolation. More 
specifically, assuming the current workload is CPU bound (as opposed to IO) 
what is the CPU overhead of generating the WAL, the overhead of version 
checking and version creation, and of garbage collecting old and unnecessary 
versions? For what it’s worth, I am working on a research project where it is 
envisioned that some of this work can be offloaded.

Thank you,
Pete Stevenson



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers