Re: [HACKERS] asynchronous and vectorized execution

Konstantin Knizhnik Wed, 11 May 2016 07:18:29 -0700


On 11.05.2016 17:00, Robert Haas wrote:

On Tue, May 10, 2016 at 3:42 PM, Konstantin Knizhnik
<[email protected]> wrote:

Doesn't this actually mean that we need a normal job scheduler which is
given a queue of jobs and, having some pool of threads, is able to
organize efficient execution of queries? The optimizer can build a pipeline
(graph) of tasks which correspond to execution plan nodes, i.e. SeqScan,
Sort, ... Each task is split into several jobs which can be concurrently
scheduled by a task dispatcher.  So you will not have a blocked worker
waiting for something, and all system resources will be utilized. Such an
approach with a dispatcher makes it possible to implement quotas,
priorities, ... The dispatcher can also take care of NUMA and cache
optimizations, which is especially critical on modern architectures.
One more reference:
http://db.in.tum.de/~leis/papers/morsels.pdf

I read this as a proposal to redesign the entire optimizer and
executor to use some new kind of plan.  That's not a project I'm
willing to entertain; it is hard to imagine we could do it in a
reasonable period of time without introducing bugs and performance
regressions.  I think there is a great deal of performance benefit
that we can get by changing things incrementally.

Yes, I agree with you that a complete rewrite of the optimizer is a huge
project with an unpredictable influence on the performance of some queries.
Changing things incrementally is a good approach, but only if we are
moving in the right direction.
I am still not sure that the introduction of async operations is a step in
the right direction. Async operations tend to complicate code significantly
(since you have to maintain state yourself). It will be bad if the
implementation of each node has to deal with async state itself, in its
own manner.

My suggestion is to try to provide some generic mechanism for managing
state transitions and to have a scheduler which controls this process. It
should not be the responsibility of the node implementation to organize
asynchronous/parallel execution. Instead, a node should just produce a
set of jobs whose execution is controlled by the scheduler. The first
implementation of the scheduler can be quite simple. But later it can become
more clever: try to bind data to processors and do many other
optimizations.
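
To make the separation concrete, here is a minimal sketch in Python. It is
not PostgreSQL code, and all names in it (SeqScan, Scheduler, jobs, run_scan)
are hypothetical, chosen only for illustration. The node only describes
its work as a set of independent jobs; a deliberately simple scheduler
decides how they are executed on a pool of threads.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class SeqScan:
    """Hypothetical plan node: it only splits its work into jobs."""

    def __init__(self, rows, chunk_size, predicate):
        self.rows = rows
        self.chunk_size = chunk_size
        self.predicate = predicate

    def _filter(self, chunk):
        # The work of a single job: filter one chunk of rows.
        return [row for row in chunk if self.predicate(row)]

    def jobs(self):
        # Yield one zero-argument callable per chunk; the node knows
        # nothing about threads, queues, or execution order.
        for start in range(0, len(self.rows), self.chunk_size):
            chunk = self.rows[start:start + self.chunk_size]
            yield partial(self._filter, chunk)


class Scheduler:
    """Hypothetical scheduler: the simplest possible first version."""

    def __init__(self, workers):
        self.workers = workers

    def run(self, node):
        # Submit every job to the pool and collect results in
        # submission order, so the output order stays deterministic.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            futures = [pool.submit(job) for job in node.jobs()]
            return [future.result() for future in futures]


def run_scan(rows, chunk_size=4, workers=2):
    # Example query: keep the even values.
    node = SeqScan(rows, chunk_size, lambda row: row % 2 == 0)
    parts = Scheduler(workers).run(node)
    return [row for part in parts for row in part]
```

Because the node exposes only jobs(), a smarter scheduler (priorities,
quotas, binding data to processors) could replace this one without
touching the node implementation.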




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


