> -----Original Messages-----
> From: "Kevin Grittner" <kgri...@gmail.com>
> Sent Time: 2017-03-12 04:24:29 (Sunday)
> To: "Mengxing Liu" <liu-m...@mails.tsinghua.edu.cn>
> Cc: "email@example.com" <firstname.lastname@example.org>
> Subject: Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in
> serializable transactions
> On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu
> <liu-m...@mails.tsinghua.edu.cn> wrote:
> > My name is Mengxing Liu. I am interested in the project "Eliminate
> > O(N^2) scaling from rw-conflict tracking in serializable
> > transactions". After discussing with Kevin off-list, I think it's
> > time to post the discussion here. I'm afraid there are two things
> > I need your help with. Thank you very much.
> Welcome to the -hackers list! This is a key part of how development
> happens in the community. Don't be shy about posting questions and
> ideas, but also expect fairly blunt discussion to occur at times;
> don't let that put you off.
Thanks, Kevin. I will keep that in mind.
> >>> So the task is clear. We can use a tree-like or hash-like data
> >>> structure to speed up this function.
> >> Right -- especially with a large number of connections holding a
> >> large number of conflicts. In one paper with high concurrency, they
> >> found over 50% of the CPU time for PostgreSQL was going to these
> >> functions (including functions called by them). This seems to me to
> >> be due to the O(N^2) (or possibly worse) performance from the number
> >> of connections.
> > Anyone knows the title of this paper? I want to reproduce its
> > workloads.
> I seem to remember there being a couple other papers or talks, but
> this is probably the most informative:
Thanks, it's exactly what I need.
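To check my understanding of the direction, here is a rough sketch of the idea: replace the linear scan of a per-transaction conflict list with a hash table keyed on the other transaction's id, so membership checks stop being O(N) per lookup. All names here (ConflictSet, conflict_exists, and so on) are made up for illustration; they are not the actual RWConflict structures in PostgreSQL.

```c
/* Illustrative sketch only -- hypothetical names, not PostgreSQL's
 * actual rw-conflict tracking structures. */
#include <stdbool.h>
#include <stdlib.h>

#define NBUCKETS 256            /* power of two, so we can mask instead of mod */

typedef struct ConflictNode
{
    int                  other_xact;   /* id of the conflicting transaction */
    struct ConflictNode *next;         /* chain within one bucket */
} ConflictNode;

typedef struct
{
    ConflictNode *buckets[NBUCKETS];
} ConflictSet;

static unsigned
hash_xact(int xact)
{
    /* Knuth-style multiplicative hash, masked to a bucket index. */
    return ((unsigned) xact * 2654435761u) & (NBUCKETS - 1);
}

/* Expected O(1) lookup, instead of walking one flat list of all conflicts. */
static bool
conflict_exists(ConflictSet *set, int other_xact)
{
    ConflictNode *n;

    for (n = set->buckets[hash_xact(other_xact)]; n != NULL; n = n->next)
        if (n->other_xact == other_xact)
            return true;
    return false;
}

static void
conflict_add(ConflictSet *set, int other_xact)
{
    unsigned      h = hash_xact(other_xact);
    ConflictNode *n;

    if (conflict_exists(set, other_xact))
        return;                 /* this conflict is already tracked */

    n = malloc(sizeof(ConflictNode));
    n->other_xact = other_xact;
    n->next = set->buckets[h];
    set->buckets[h] = n;
}
```

A balanced tree keyed the same way would give O(log N) lookups instead; which one wins likely depends on the conflict-list sizes we actually see under load, which is another thing the benchmarking should tell us.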
> >> Remember, I think most of the work here is going to be in
> >> benchmarking. We not only need to show improvements in simple test
> >> cases using readily available tools like pgbench, but think about
> >> what types of cases might be worst for the approach taken and show
> >> that it still does well -- or at least not horribly. It can be OK
> >> to have some slight regression in an unusual case if the common
> >> cases improve a lot, but any large regression needs to be addressed
> >> before the patch can be accepted. There are some community members
> >> who are truly diabolical in their ability to devise "worst case"
> >> tests, and we don't want to be blind-sided by a bad result from one
> >> of them late in the process.
> > Are there any documents or links introducing how to test and
> > benchmark PostgreSQL? I may need some time to learn about it.
> There is pgbench:
> A read-only load and a read/write mix should both be tested,
> probably with a few different scales and client counts to force the
> bottleneck to be in different places. The worst problems have been
> seen with 32 or more cores on 4 or more sockets with a large number
> of active connections. I don't know whether you have access to a
> machine capable of putting this kind of stress on it (perhaps at
> your university?), but if not, the community has access to various
> resources we should be able to schedule time on.
There is a NUMA machine (120 cores, 8 sockets) in my lab. I think it's enough
to put this kind of stress on it.
Anyway, I will ask for help if necessary.
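For a first pass on that machine, I plan to run something like the following. The scale and client counts are just starting guesses to move the bottleneck around, as you suggested; the database name and loop bounds are mine, not from any established test plan.

```shell
# Initialize a pgbench database at a chosen scale factor.
createdb bench
pgbench -i -s 100 bench

# Read-only load (select-only script) at serializable isolation,
# sweeping client counts to shift where the bottleneck lands.
for c in 16 32 64 128; do
    PGOPTIONS='-c default_transaction_isolation=serializable' \
        pgbench -S -c $c -j $c -T 300 bench
done

# Read/write mix (default TPC-B-like script) under the same settings.
for c in 16 32 64 128; do
    PGOPTIONS='-c default_transaction_isolation=serializable' \
        pgbench -c $c -j $c -T 300 bench
done
```

Repeating the sweep at a few different scale factors (say 10, 100, 1000) should show whether the rw-conflict functions dominate only when the working set fits in memory, or more broadly.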
> It may pay for you to search through the archives of the last year
> or two to look for other benchmarks and see what people have
> previously done.
I will try that.
Sent via pgsql-hackers mailing list (email@example.com)