Re: [pgsql-students] [HACKERS] [GSoC] Push-based query executor discussion

2017-04-06 Thread Simon Riggs
On 22 March 2017 at 14:58, Oleg Bartunov  wrote:

> Should we reject this interesting project, which is based on several years of
> research work by an academic group at the institute? Maybe it would be better
> to help him reformulate the scope of the project and let him work? I don't know
> exactly whether the results of a GSoC project have to be committed, but as a
> research project it would certainly be useful for the community.

+1

Arseny, thank you for your contributions.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-06 Thread Kevin Grittner
Sorry, I didn't notice that this was going to a public list.  That URL
is only available to people who signed up as mentors for PostgreSQL
GSoC participation this year.  Does the link to the draft work for you?

--
Kevin Grittner




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-06 Thread Tom Lane
Kevin Grittner  writes:
> Note that the final proposal is here:
> https://summerofcode.withgoogle.com/serve/5874530240167936/

I'm just getting a blank page at that URL?

regards, tom lane




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-06 Thread Kevin Grittner
On Thu, Apr 6, 2017 at 8:11 AM, Alexander Korotkov wrote:

>> https://docs.google.com/document/d/1dvBETE6IJA9AcXd11XJNPsF_VPcDhSjy7rlsxj262l8/edit?usp=sharing

> I'd love to see a comment from Andres Freund who is leading executor
> performance improvements.

Note that the final proposal is here:

https://summerofcode.withgoogle.com/serve/5874530240167936/

Also, I just entered a comment about an important question that I
think needs to be answered right up front.

-- 
Kevin Grittner




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-06 Thread Alexander Korotkov
On Sun, Apr 2, 2017 at 12:13 AM, Arseny Sher  wrote:

> Time is short: the student application deadline is 3rd April. I decided
> to reformulate the project scope myself. Here is the proposal:
>
> https://docs.google.com/document/d/1dvBETE6IJA9AcXd11XJNPsF_VPcDhSjy7rlsxj262l8/edit?usp=sharing
>
> The main idea is that the project now has a formalized goal:
> "partial support of all TPC-H queries".
>
> I am also CC'ing the people who were mentioned in the "Potential Mentors"
> section on the GSoC wiki page.
>

I'd love to see a comment from Andres Freund who is leading executor
performance improvements.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-03 Thread Arseny Sher
Time is short: the student application deadline is 3rd April. I decided
to reformulate the project scope myself. Here is the proposal:

https://docs.google.com/document/d/1dvBETE6IJA9AcXd11XJNPsF_VPcDhSjy7rlsxj262l8/edit?usp=sharing

The main idea is that the project now has a formalized goal:
"partial support of all TPC-H queries".

I am also CC'ing the people who were mentioned in the "Potential Mentors"
section on the GSoC wiki page.


--
Arseny Sher




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-23 Thread sher-ars

Oleg Bartunov writes:

> I don't know exactly if the results of GSoC project should be
> committed,

Technically, they are not required:
https://developers.google.com/open-source/gsoc/faq

> Are mentoring organizations required to use the code produced by
> students?
>
> No. While we hope that all the code that comes out of this program will
> find a happy home, we don't require organizations to use the students'
> code.



--
Arseny Sher





Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-22 Thread Oleg Bartunov
On Wed, Mar 22, 2017 at 8:04 PM, Arseny Sher  wrote:

> > While I admire your fearlessness, I think the chances of you being
> > able to bring a project of this type to a successful conclusion are
> > remote.  Here is what I said about this topic previously:
> >
> > http://postgr.es/m/CA+Tgmoa=kzHJ+TwxyQ+vKu21nk3prkRjSdbhjubN7qvc8UKuG
> g...@mail.gmail.com
>
> Well, as I said, I don't claim that I will support full functionality:
> >> instead, we should decide which part of this work (if any) is
> >> going to be done in the course of GSoC. Probably, all TPC-H queries
> >> with and without index support is a good initial target, but this
> >> needs to be discussed.
>
> I think that successful completion of this project should be a clear
> and justified answer to the question "Is this idea good enough to
> work on merging it into master?", not the production-ready patches
> themselves. Nevertheless, the project's success criterion must of course
> be reasonably formalized -- e.g. implement nodes X with features Y, etc.
>

How many GSoC slots and possible students do we have?

Should we reject this interesting project, which is based on several years of
research work by an academic group at the institute? Maybe it would be better
to help him reformulate the scope of the project and let him work? I don't know
exactly whether the results of a GSoC project have to be committed, but as a
research project it would certainly be useful for the community.




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-22 Thread Arseny Sher
> While I admire your fearlessness, I think the chances of you being
> able to bring a project of this type to a successful conclusion are
> remote.  Here is what I said about this topic previously:
>
> http://postgr.es/m/CA+Tgmoa=kzhj+twxyq+vku21nk3prkrjsdbhjubn7qvc8uk...@mail.gmail.com

Well, as I said, I don't claim that I will support full functionality:
>> instead, we should decide which part of this work (if any) is
>> going to be done in the course of GSoC. Probably, all TPC-H queries
>> with and without index support is a good initial target, but this
>> needs to be discussed.

I think that successful completion of this project should be a clear
and justified answer to the question "Is this idea good enough to
work on merging it into master?", not the production-ready patches
themselves. Nevertheless, the project's success criterion must of course
be reasonably formalized -- e.g. implement nodes X with features Y, etc.




Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-08 Thread Robert Haas
On Mon, Mar 6, 2017 at 11:20 AM, Arseny Sher  wrote:
> I would like to work on a push-based executor [1] during GSoC, so I'm
> writing to introduce myself and start the discussion of the project. I
> think I should mention beforehand that the subject is my master's
> thesis topic, and I have already started working on it. This letter is
> (obviously) not a finished proposal but rather a starting point for
> discussing the concept. Below you can find a short review of the idea, a
> description of the benefits for the community, details, related work and
> some info about me.

While I admire your fearlessness, I think the chances of you being
able to bring a project of this type to a successful conclusion are
remote.  Here is what I said about this topic previously:

http://postgr.es/m/CA+Tgmoa=kzhj+twxyq+vku21nk3prkrjsdbhjubn7qvc8uk...@mail.gmail.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] [GSoC] Push-based query executor discussion

2017-03-06 Thread Arseny Sher
Hello,

I would like to work on a push-based executor [1] during GSoC, so I'm
writing to introduce myself and start the discussion of the project. I
think I should mention beforehand that the subject is my master's
thesis topic, and I have already started working on it. This letter is
(obviously) not a finished proposal but rather a starting point for
discussing the concept. Below you can find a short review of the idea, a
description of the benefits for the community, details, related work and
some info about me.


*Brief review*
The idea is described on the wiki page [1] and in the letter [2]. I
propose to replace the current ExecProcNode interface between execution
nodes with a function called, say, pushTuple that pushes a ready tuple
to the current node's parent.


*Benefits for the community*
Why would we want this? In general, because the Postgres executor is slow
for CPU-bound queries and this approach should accelerate it. [4] and
[5] argue that this model results in better code and data locality,
and that JIT compilation makes the difference even more drastic.

Besides, while working on this, I will try to investigate the Postgres
executor's performance in both models extensively in order to study the
effects of the model change. For instance, it is commonly accepted that
the current Volcano-style model makes poor use of modern CPUs' pipelining
abilities and causes a large percentage of branch mispredictions. I am
going to see whether, where and when this is true in Postgres;
the profiling results should be useful for any further executor
optimizations.


*Project details*
Technically, I am planning to implement this as follows. The code that is
common to all nodes and needs to be changed is in execMain.c and
execProcnode.c; standard_ExecutorRun in execMain.c should now start
execution of all leaf nodes in the proper order instead of pulling tuples
one-by-one from the top-level node. By 'proper' order I mean that
inner nodes will be run first and outer nodes second, so that when the
first tuple from the outer side of some node arrives at it, the node
has already received all its tuples from the inner side.
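
To illustrate the 'proper' order, here is a minimal sketch that collects
the leaves of a plan tree inner side first, outer side second. ToyNode and
collect_leaves are hypothetical names invented for this example; they are
simplified stand-ins and not the real PlanState or any existing executor
function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified plan node; NOT PostgreSQL's PlanState. */
typedef struct ToyNode
{
    const char     *name;
    struct ToyNode *outer;      /* outer ("left") child, or NULL */
    struct ToyNode *inner;      /* inner ("right") child, or NULL */
} ToyNode;

/*
 * Append the leaves below 'node' to 'out' in the order they would be
 * started: all leaves of the inner subtree first, then all leaves of the
 * outer subtree. 'n' is the number of entries already in 'out' and 'cap'
 * is its capacity. Returns the new number of entries.
 */
size_t
collect_leaves(ToyNode *node, ToyNode **out, size_t n, size_t cap)
{
    if (node == NULL)
        return n;

    if (node->outer == NULL && node->inner == NULL)
    {
        /* A leaf (e.g. a scan): it will be started in this position. */
        assert(n < cap);
        out[n++] = node;
        return n;
    }

    /* Inner side first, so that it is complete before outer tuples arrive. */
    n = collect_leaves(node->inner, out, n, cap);
    n = collect_leaves(node->outer, out, n, cap);
    return n;
}
```

For a HashJoin whose inner child is a Hash over "SeqScan b" and whose
outer child is "SeqScan a", this yields "SeqScan b" first and "SeqScan a"
second, i.e. the hash table is fully built before probing starts.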

How do we 'start' execution of a leaf? Recall that instead of
ExecProcNode we now have a pushTuple function with the following signature:

bool pushTuple(TupleTableSlot *slot, PlanState *node, PlanState *pusher)

'slot' is the tuple we push. 'node' is the receiver of the tuple; 'pusher'
is the sender of the tuple, and its parent is 'node'. We need 'pusher' only
to distinguish inner pushes from outer ones. The function returns true if
'node' is still accepting tuples after the push and false if not;
e.g. a Limit node can return false once the required number of tuples has
been passed. We also add the convention that when a node has nothing more
to push, it calls pushTuple with slot=NULL to let its parent know
that it is done. So, to start execution of a leaf,
standard_ExecutorRun basically needs to call pushTuple(NULL, leaf,
NULL) once. Leaf nodes are a special case because pusher=NULL; another
obvious special case is the top-level node: it calls pushTuple(slot, NULL,
node), and such a call pushes the slot to the destination
((*dest->receiveSlot) (slot, dest) in the current code).
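
The conventions above can be sketched in a toy model. Everything here is an
assumption made for illustration: ToySlot, ToyState, ToyDest, toy_dest,
pushTupleToRange and pushTupleToEvenFilter are invented names, and the types
are drastically simplified compared to TupleTableSlot, PlanState and
DestReceiver. Only the shape of pushTuple (slot, node, pusher, bool result)
follows the signature given above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToySlot { int value; } ToySlot;      /* toy "tuple" */

typedef struct ToyState ToyState;
typedef bool (*ToyPushFn) (ToySlot *slot, ToyState *node, ToyState *pusher);

struct ToyState                 /* toy stand-in for PlanState */
{
    ToyState   *parent;         /* NULL for the top-level node */
    ToyPushFn   push;           /* node-specific implementation */
    int         first, last;    /* used only by the toy leaf */
};

typedef struct ToyDest { int count; int sum; bool done; } ToyDest;
ToyDest toy_dest;               /* toy stand-in for the destination */

/* Generic entry point: dispatch to the receiver, or to the destination. */
bool
pushTuple(ToySlot *slot, ToyState *node, ToyState *pusher)
{
    if (node == NULL)
    {
        /* The top-level node pushed: hand the tuple to the destination. */
        if (slot == NULL)
            toy_dest.done = true;
        else
        {
            toy_dest.count++;
            toy_dest.sum += slot->value;
        }
        return true;
    }
    return node->push(slot, node, pusher);
}

/* Toy leaf: started once with slot=NULL and pusher=NULL. */
bool
pushTupleToRange(ToySlot *slot, ToyState *node, ToyState *pusher)
{
    (void) slot;
    (void) pusher;
    assert(slot == NULL && pusher == NULL);

    for (int v = node->first; v <= node->last; v++)
    {
        ToySlot out = { v };

        if (!pushTuple(&out, node->parent, node))
            return false;       /* parent stopped accepting tuples */
    }
    pushTuple(NULL, node->parent, node);    /* nothing more to push */
    return false;
}

/* Toy inner node: forwards even values, drops odd ones. */
bool
pushTupleToEvenFilter(ToySlot *slot, ToyState *node, ToyState *pusher)
{
    (void) pusher;              /* only one child, so no inner/outer check */
    if (slot == NULL)
    {
        pushTuple(NULL, node->parent, node);    /* pass "done" upwards */
        return false;
    }
    if (slot->value % 2 != 0)
        return true;            /* dropped, but still accepting */
    return pushTuple(slot, node->parent, node);
}
```

With a filter node on top of a leaf producing 1..6, a single call
pushTuple(NULL, &leaf, NULL) drives the whole plan: the destination
receives 2, 4 and 6 and is then told that the stream is done.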

Like ExecProcNode, pushTuple will call the proper implementation, e.g.
pushTupleToLimit. Each such implementation will contain code similar
to its pull-model analogue (e.g. ExecLimit), but, very roughly, where we have

return slot;

now, in push model we will have

bool parent_accepts_tuples = pushTuple(slot, node->parent, node);

and then we will continue execution if parent_accepts_tuples is true
or exit if not.
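
As a rough illustration of that pattern for Limit, here is a self-contained
toy version. DemoSlot, DemoSink, DemoLimit, demoPushToSink and
demoPushTupleToLimit are hypothetical names; demoPushToSink stands in for
pushTuple(slot, node->parent, node), and OFFSET handling and everything else
ExecLimit really does is left out. It is a sketch of the control flow only,
not the prototype's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoSlot { int value; } DemoSlot;    /* toy "tuple" */

/* Toy parent of the Limit node: just counts what it receives. */
typedef struct DemoSink { int received; bool done; } DemoSink;

typedef struct DemoLimit
{
    DemoSink   *parent;
    int         count;          /* the LIMIT value */
    int         position;       /* tuples passed to the parent so far */
    bool        finished;       /* already told the parent we are done? */
} DemoLimit;

/* Stand-in for pushTuple(slot, node->parent, node). */
bool
demoPushToSink(DemoSlot *slot, DemoSink *sink)
{
    if (slot == NULL)
        sink->done = true;
    else
        sink->received++;
    return true;                /* this sink always accepts more tuples */
}

/* Tell the parent exactly once that no more tuples will come. */
static void
demoFinish(DemoLimit *node)
{
    if (!node->finished)
    {
        node->finished = true;
        demoPushToSink(NULL, node->parent);
    }
}

/*
 * Push-model Limit: where the pull model would "return slot", we push the
 * slot to the parent and then decide whether we still accept tuples.
 */
bool
demoPushTupleToLimit(DemoSlot *slot, DemoLimit *node)
{
    bool        parent_accepts_tuples;

    /* Child is exhausted, or the limit was already reached. */
    if (slot == NULL || node->position >= node->count)
    {
        demoFinish(node);
        return false;
    }

    parent_accepts_tuples = demoPushToSink(slot, node->parent);
    node->position++;

    if (!parent_accepts_tuples)
        return false;           /* propagate "stop" down to our child */

    if (node->position >= node->count)
    {
        demoFinish(node);
        return false;           /* required number of tuples passed */
    }
    return true;
}
```

A child that pushes ten tuples into a DemoLimit with count = 3 gets false
back on the third push and can stop scanning at that point.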

Complex nodes require more complicated modifications to preserve
correct behaviour and be efficient. The latter leads to some
architectural issues: for example, an efficient SeqScan should call
pushTuple from a function that does something similar to what
heapgettup_pagemode currently does; otherwise, we would need to
save and restore its state (lines, page, etc.) for each tuple. On the
other hand, it is not nice to call pushTuple from there because the access
layer (heapam.c) currently knows nothing about PlanStates. Such issues will
need to be addressed and discussed with the community.

Currently, I have a prototype (pretty much WIP) which implements this
model for SeqScan, Limit, Hash and Hashjoin nodes.

Since TPC-H benchmarks are the de facto standard for evaluating such things,
I am planning to use them for testing. BTW, I've written a couple
of scripts to automate this job [16], although it seems that everyone
who tests TPC-H ends up writing their own version.

Now, it is clear that rewriting all nodes with full support in such a
manner is a huge amount of work. Besides, we still don't know the
quantitative benefit of this model. Because of that, I do not propose any
timeline right now; instead, we should decide which part of this work (if
any) is going to be done in the course of GSoC. Probably, all TPC-H queries
with and without index support is a good initial target, but this
needs to be discussed. Anyway, I don't think that the result will be a
patch