Re: parallel distinct union and aggregate support patch

2021-11-08 Thread Daniel Gustafsson
> On 29 Mar 2021, at 15:36, David Steele wrote: > A rebase is also required so marked Waiting for Author. Many months on and this patch still needs a rebase to apply, and the thread has stalled. I'm marking this Returned with Feedback. Please feel free to open a new entry if you return to

Re: Re: parallel distinct union and aggregate support patch

2021-07-21 Thread bu...@sohu.com
On Tue, 30 Mar 2021 at 22:33, bu...@sohu.com wrote: > I have written a plan with similar functions. It is known that the following > two situations do not work well. I read through this thread an

Re: Re: parallel distinct union and aggregate support patch

2021-07-05 Thread David Rowley
On Tue, 30 Mar 2021 at 22:33, bu...@sohu.com wrote: > I have written a plan with similar functions. It is known that the following > two situations do not work well. I read through this thread and also wondered about a Parallel Partition type operator. It also seems to me that if it could be

Re: Re: parallel distinct union and aggregate support patch

2021-03-30 Thread bu...@sohu.com
> This patch has not gotten any review in the last two CFs and is unlikely > to be committed for PG14 so I have moved it to the 2021-07 CF. A rebase > is also required so marked Waiting for Author. > > I can see this is a work in progress, but you may want to consider the > several suggestions

Re: parallel distinct union and aggregate support patch

2021-03-29 Thread David Steele
On 1/25/21 9:14 AM, bu...@sohu.com wrote: Now, I have rewritten batch hashagg and sort, added some comments, and combined the two patches, based on master 2ad78a87f018260d4474eee63187e1cc73c9b976. They support rescan and change GUC enable_batch_hashagg/enable_batch_sort to

Re: Re: parallel distinct union and aggregate support patch

2020-11-30 Thread bu...@sohu.com
> 1. > +#define BATCH_SORT_MAX_BATCHES 512 > > Did you decide this number based on some experiment or is there some > analysis behind selecting this number? When there are too few batches, if a certain process works too slowly, it will cause unbalanced load. When there are too many batches, FD

Re: parallel distinct union and aggregate support patch

2020-11-28 Thread Dilip Kumar
On Fri, Nov 27, 2020 at 9:25 PM Heikki Linnakangas wrote: > > I also had a quick look at the patch and the comments made so far. Summary: > > 1. The performance results are promising. > > 2. The code needs comments. > > Regarding the design: > > Thomas Munro mentioned the idea of a "Parallel

Re: parallel distinct union and aggregate support patch

2020-11-28 Thread Robert Haas
On Fri, Nov 27, 2020 at 10:55 AM Heikki Linnakangas wrote: > I think a non-buffering Repartition node would be simpler, and thus > better. In these patches, you have a BatchSort node, and batchstore, but > a simple Parallel Repartition node could do both. For example, to > implement distinct: > >

Re: parallel distinct union and aggregate support patch

2020-11-27 Thread Heikki Linnakangas
I also had a quick look at the patch and the comments made so far. Summary: 1. The performance results are promising. 2. The code needs comments. Regarding the design: Thomas Munro mentioned the idea of a "Parallel Repartition" node that would redistribute tuples like this. As I understand

Re: Re: parallel distinct union and aggregate support patch

2020-11-17 Thread Dilip Kumar
On Sun, Nov 8, 2020 at 11:54 AM Dilip Kumar wrote: > > On Tue, Nov 3, 2020 at 6:06 PM Dilip Kumar wrote: > > > > On Thu, Oct 29, 2020 at 12:53 PM bu...@sohu.com wrote: > > > > > > > 1) It's better to always include the whole patch series - including the > > > > parts that have not changed.

Re: Re: parallel distinct union and aggregate support patch

2020-11-07 Thread Dilip Kumar
On Tue, Nov 3, 2020 at 6:06 PM Dilip Kumar wrote: > > On Thu, Oct 29, 2020 at 12:53 PM bu...@sohu.com wrote: > > > > > 1) It's better to always include the whole patch series - including the > > > parts that have not changed. Otherwise people have to scavenge the > > > thread and search for all

Re: Re: parallel distinct union and aggregate support patch

2020-11-03 Thread Dilip Kumar
On Thu, Oct 29, 2020 at 12:53 PM bu...@sohu.com wrote: > > > 1) It's better to always include the whole patch series - including the > > parts that have not changed. Otherwise people have to scavenge the > > thread and search for all the pieces, which may be a source of issues. > > Also, it

Re: Re: parallel distinct union and aggregate support patch

2020-10-29 Thread bu...@sohu.com
> 1) It's better to always include the whole patch series - including the > parts that have not changed. Otherwise people have to scavenge the > thread and search for all the pieces, which may be a source of issues. > Also, it confuses the patch tester [1] which tries to apply patches from > a

Re: parallel distinct union and aggregate support patch

2020-10-28 Thread Tomas Vondra
Hi, On Wed, Oct 28, 2020 at 05:37:40PM +0800, bu...@sohu.com wrote: Hi, here is a patch for parallel distinct, union, aggregate and grouping sets support using batch hash agg. Please review. How to use: set enable_batch_hashagg = on. How it works: like batch sort, but does not sort each batch, just save

Re: parallel distinct union and aggregate support patch

2020-10-28 Thread bu...@sohu.com
Hi, here is a patch for parallel distinct, union, aggregate and grouping sets support using batch hash agg. Please review. How to use: set enable_batch_hashagg = on. How it works: like batch sort, but it does not sort each batch; it just saves the hash value in each row. Unfinished work: rescan is not supported yet.

Re: Re: parallel distinct union and aggregate support patch

2020-10-27 Thread bu...@sohu.com
> On Tue, Oct 27, 2020 at 3:27 PM Dilip Kumar wrote: > > > > On Fri, Oct 23, 2020 at 11:58 AM bu...@sohu.com wrote: > > > > > > > Interesting idea. So IIUC, whenever a worker is scanning the tuple it > > > > will directly put it into the respective batch(shared tuple store), > > > > based on

Re: parallel distinct union and aggregate support patch

2020-10-27 Thread Dilip Kumar
On Tue, Oct 27, 2020 at 5:43 PM Robert Haas wrote: > > On Thu, Oct 22, 2020 at 5:08 AM Dilip Kumar wrote: > > Interesting idea. So IIUC, whenever a worker is scanning the tuple it > > will directly put it into the respective batch(shared tuple store), > > based on the hash on grouping column

Re: Re: parallel distinct union and aggregate support patch

2020-10-27 Thread Dilip Kumar
On Tue, Oct 27, 2020 at 3:27 PM Dilip Kumar wrote: > > On Fri, Oct 23, 2020 at 11:58 AM bu...@sohu.com wrote: > > > > > Interesting idea. So IIUC, whenever a worker is scanning the tuple it > > > will directly put it into the respective batch(shared tuple store), > > > based on the hash on

Re: parallel distinct union and aggregate support patch

2020-10-27 Thread Robert Haas
On Thu, Oct 22, 2020 at 5:08 AM Dilip Kumar wrote: > Interesting idea. So IIUC, whenever a worker is scanning the tuple it > will directly put it into the respective batch(shared tuple store), > based on the hash on grouping column and once all the workers are > done preparing the batch then

Re: Re: parallel distinct union and aggregate support patch

2020-10-27 Thread Dilip Kumar
On Fri, Oct 23, 2020 at 11:58 AM bu...@sohu.com wrote: > > > Interesting idea. So IIUC, whenever a worker is scanning the tuple it > > will directly put it into the respective batch(shared tuple store), > > based on the hash on grouping column and once all the workers are > > done preparing the

Re: Re: parallel distinct union and aggregate support patch

2020-10-23 Thread bu...@sohu.com
> Interesting idea. So IIUC, whenever a worker is scanning the tuple it > will directly put it into the respective batch(shared tuple store), > based on the hash on grouping column and once all the workers are > done preparing the batch then each worker will pick those batches one > by one,

Re: Re: parallel distinct union and aggregate support patch

2020-10-22 Thread bu...@sohu.com
plan could break", mean some on path using this path? No, BatchSortPath is only for some special paths (Unique, GroupAgg ...). bu...@sohu.com From: Thomas Munro Date: 2020-10-21 12:27 To: bu...@sohu.com CC: pgsql-hackers Subject: Re: parallel distinct union and aggregate support patch On Tue

Re: parallel distinct union and aggregate support patch

2020-10-22 Thread Dilip Kumar
On Mon, Oct 19, 2020 at 8:19 PM bu...@sohu.com wrote: > > Hi hackers, > I wrote a patch to support parallel distinct, union and aggregate using > batch sort. > steps: > 1. generate a hash value for the group clause values, and use that hash value (mod) > to choose the batch to save to > 2. at the end of the outer plan, wait for all

Re: parallel distinct union and aggregate support patch

2020-10-20 Thread Thomas Munro
On Tue, Oct 20, 2020 at 3:49 AM bu...@sohu.com wrote: > I wrote a patch to support parallel distinct, union and aggregate using > batch sort. > steps: > 1. generate a hash value for the group clause values, and use that hash value (mod) > to choose the batch to save to > 2. at the end of the outer plan, wait for all other workers
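
For readers skimming the index, the batching idea quoted above (hash the grouping values, use the hash modulo a batch count to pick a batch, then handle each batch on its own once every worker has finished filling them) can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not the patch's code: the names NUM_BATCHES, batch_for, fill_batches and distinct_per_batch are hypothetical, crc32 merely stands in for PostgreSQL's own hash functions, the "workers" are simulated sequentially, and taking the modulus over the number of batches is an assumption the quoted text does not spell out.

```python
from collections import defaultdict
from zlib import crc32

NUM_BATCHES = 4  # hypothetical batch count, chosen only for the demo


def batch_for(key, num_batches=NUM_BATCHES):
    """Pick a batch from a stable hash of the grouping key, modulo the batch count."""
    return crc32(repr(key).encode("utf-8")) % num_batches


def fill_batches(worker_inputs, num_batches=NUM_BATCHES):
    """Phase 1: each (simulated) worker routes every row it scans into a shared batch."""
    batches = defaultdict(list)
    for rows in worker_inputs:  # one list of rows per simulated worker
        for row in rows:
            batches[batch_for(row, num_batches)].append(row)
    return batches


def distinct_per_batch(batches):
    """Phase 2: once all batches are filled, de-duplicate each batch independently."""
    result = []
    for batch_no in sorted(batches):
        # Equal keys always hash to the same batch, so no cross-batch check is needed.
        result.extend(sorted(set(batches[batch_no])))
    return result


# Three simulated workers, each scanning a different slice of the input.
worker_inputs = [[1, 2, 2, 3], [3, 4, 1, 5], [5, 5, 6]]
batches = fill_batches(worker_inputs)
distinct_rows = distinct_per_batch(batches)
```

Because duplicates of a value can only ever land in one batch, the per-batch results can simply be concatenated; in a real parallel plan each batch would be picked up by whichever worker is free.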