On Mon, Oct 17, 2016 at 8:36 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> This project of mine is about parallelizing tuplesort.c, which isn't
>> really what you want for parallel query -- you shouldn't try to scope
>> the problem as "make the sort more scalable using parallelism" there.
>> Rather, you want to scope it at "make the execution of the entire
>> query more scalable using parallelism", which is really quite a
>> different thing, which necessarily involves the executor having direct
>> knowledge of partition boundaries.
>
> Okay, but what is the proof or why do you think second is going to
> better than first?  One thing which strikes as a major difference
> between your approach and Gather Merge is that in your approach leader
> has to wait till all the workers have done with their work on sorting
> whereas with Gather Merge as soon as first one is done, leader starts
> with merging.  I could be wrong here, but if I understood it
> correctly, then there is a argument that Gather Merge kind of approach
> can win in cases where some of the workers can produce sorted outputs
> ahead of others and I am not sure if we can dismiss such cases.

Gather Merge can't emit a tuple unless it has buffered at least one
tuple from every producer; otherwise, the next tuple it receives from
one of those producers might precede whichever tuple it chooses to
emit.  However, it doesn't need to wait until all of the workers are
completely done.  The leader only needs to be at least slightly ahead
of the slowest worker.  I'm not sure how that compares to Peter's
approach.

What I'm worried about is that we're implementing two separate systems
to do the same thing, and that the parallel sort approach is actually
a lot less general.  I think it's possible to imagine a Parallel Sort
implementation which does things Gather Merge can't.  If all of the
workers collaborate to sort all of the data rather than each worker
sorting its own data, then you've got something which Gather Merge
can't match.  But this is not that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
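
To illustrate the buffering constraint described in the message above,
here is a minimal sketch of a k-way merge over already-sorted producer
streams.  It is not PostgreSQL's Gather Merge code: the function name
gather_merge, the _DONE sentinel, and the use of in-memory Python
iterators to stand in for workers are all assumptions made for
illustration.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

_DONE = object()  # hypothetical sentinel marking an exhausted producer


def gather_merge(producers: List[Iterable[int]]) -> Iterator[int]:
    """Merge several already-sorted streams into one sorted stream."""
    iters = [iter(p) for p in producers]
    heap: List[Tuple[int, int]] = []

    # Nothing can be emitted until one tuple from every producer is
    # buffered; an unseen producer might still hold a smaller tuple.
    for idx, it in enumerate(iters):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, idx))
    heapq.heapify(heap)

    while heap:
        value, idx = heapq.heappop(heap)
        yield value
        # Only the producer whose tuple was just emitted needs to
        # supply another one before the next tuple can be chosen.
        nxt = next(iters[idx], _DONE)
        if nxt is not _DONE:
            heapq.heappush(heap, (nxt, idx))
```

In this sketch the merge blocks only on the one producer it just
consumed from, so it can start emitting before any producer has
finished, as long as each has delivered its next tuple.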