Re: Parallel grouping sets

2020-07-12 Thread Daniel Gustafsson
> On 25 Mar 2020, at 15:35, Pengzhou Tang wrote: > Thanks a lot. The patch has a memory leak in lookup_hash_entries: it uses > list_concat there, > which causes a 64-byte leak for every tuple. I have fixed that. > > Also, resolved conflicts and rebased the code. While there hasn't been a revie

Re: Parallel grouping sets

2020-03-23 Thread Tomas Vondra
On Fri, Mar 20, 2020 at 07:57:02PM +0800, Pengzhou Tang wrote: Hi Tomas, I rebased the code and resolved the comments you attached; some unresolved comments are explained in 0002-fixes.patch, please take a look. I also made hash spill work for parallel grouping sets; the plan looks like

Re: Parallel grouping sets

2020-03-19 Thread Pengzhou Tang
and I also need to make hash spill work in the final stage of parallel grouping sets, will do that tomorrow. The conflicts are mainly located in the handling of hash spill for grouping sets; the 0004-reorganise- patch also makes the stage that refills the hash table easier and can avoid the

Re: Parallel grouping sets

2020-02-24 Thread Richard Guo
To summarize the current state of parallel grouping sets, we now have two available implementations for it. 1) Each worker performs an aggregation step, producing a partial result for each group of which that process is aware. Then the partial results are gathered to the leader, which then

Re: Parallel grouping sets

2020-02-09 Thread Pengzhou Tang
Thanks for reviewing those patches. Ha, I believe you meant to say a "normal aggregate", because what's > performed above gather is no longer "grouping sets", right? > > The group key idea is clever in that it helps "discriminate" tuples by > their grouping set id. I haven't completely thought this
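
The "group key" idea mentioned in this message can be illustrated with a small sketch. This is only a toy model in Python, not code from any patch in the thread: it assumes that each worker computes partial states for every grouping set and tags them with a grouping set id, so that the step above Gather can be an ordinary aggregate keyed by (grouping set id, c1, c2). The names `worker_grouping_sets`, `leader_normal_aggregate`, and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical query shape: avg(c3) ... GROUP BY GROUPING SETS ((c1, c2), (c1))
KEYS = ("c1", "c2")
GROUPING_SETS = [("c1", "c2"), ("c1",)]


def worker_grouping_sets(rows):
    """Worker side (assumed): partial avg state (sum, count) per grouping set.

    Each state is keyed by (gset_id, c1, c2); columns that are not part of
    the grouping set are replaced by None, standing in for SQL NULL.
    """
    states = defaultdict(lambda: [0, 0])
    for row in rows:
        for gset_id, gset in enumerate(GROUPING_SETS):
            key = (gset_id,) + tuple(row[k] if k in gset else None for k in KEYS)
            states[key][0] += row["c3"]
            states[key][1] += 1
    return [(key, tuple(state)) for key, state in states.items()]


def leader_normal_aggregate(gathered):
    """Leader side: a plain aggregate; gset_id in the key keeps sets apart."""
    combined = defaultdict(lambda: [0, 0])
    for key, (total, count) in gathered:
        combined[key][0] += total
        combined[key][1] += count
    return {key: total / count for key, (total, count) in combined.items()}


# Two simulated workers, each scanning a different slice of the table.
worker_1 = [{"c1": 1, "c2": 1, "c3": 10}, {"c1": 1, "c2": 2, "c3": 50}]
worker_2 = [{"c1": 1, "c2": 1, "c3": 30}, {"c1": 2, "c2": 1, "c3": 40}]

gathered = worker_grouping_sets(worker_1) + worker_grouping_sets(worker_2)
averages_by_gset_id = leader_normal_aggregate(gathered)
```

Because the grouping set id is part of the key, a partial state for grouping set (c1) is never merged with one for (c1, c2), even when the remaining key columns happen to compare equal.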

Re: Parallel grouping sets

2020-02-03 Thread Jesse Zhang
On Mon, Feb 3, 2020 at 12:07 AM Richard Guo wrote: > > Hi Jesse, > > Thanks for reviewing these two patches. I enjoyed it! > > On Sat, Jan 25, 2020 at 6:52 AM Jesse Zhang wrote: >> >> >> I glanced over both patches. Just the opposite, I have a hunch that v3 >> is always better than v5. Here's my

Re: Parallel grouping sets

2020-02-03 Thread Richard Guo
Hi Amit, Thanks for reviewing these two patches. On Sat, Jan 25, 2020 at 6:31 PM Amit Kapila wrote: > > This is what I also understood after reading this thread. So, my > question is why not just review v3 and commit something on those lines > even though it would take a bit more time. It is

Re: Parallel grouping sets

2020-02-03 Thread Richard Guo
Hi Jesse, Thanks for reviewing these two patches. On Sat, Jan 25, 2020 at 6:52 AM Jesse Zhang wrote: > > I glanced over both patches. Just the opposite, I have a hunch that v3 > is always better than v5. Here's my 6-minute understanding of both. > > v5 (the one with a simple partial aggregate)

Re: Parallel grouping sets

2020-01-25 Thread Amit Kapila
emented according to different methods, which causes confusion. > > > > > > > Both the idea seems to be different. Is the second approach [1] > > inferior for any case as compared to the first approach? Can we keep > > both approaches for parallel grouping sets, if

Re: Parallel grouping sets

2020-01-24 Thread Jesse Zhang
the idea seems to be different. Is the second approach [1] > inferior for any case as compared to the first approach? Can we keep > both approaches for parallel grouping sets, if so how? If not, then > won't the code by the first approach be useless once we commit second >

Re: Parallel grouping sets

2020-01-23 Thread Amit Kapila
pared to the first approach? Can we keep both approaches for parallel grouping sets, if so how? If not, then won't the code by the first approach be useless once we commit the second approach? [1] - https://www.postgresql.org/message-id/CAN_9JTwtTTnxhbr5AHuqVcriz3HxvPpx1JWE--DCSdJYuHrLtA

Re: Parallel grouping sets

2020-01-19 Thread Richard Guo
I realized that there are two patches in this thread that are implemented according to different methods, which causes confusion. So I decided to update this thread with only one patch, i.e. the patch for 'Implementation 1' as described in the first email and then move the other patch to a separate

Re: Parallel grouping sets

2020-01-07 Thread Richard Guo
On Sun, Dec 1, 2019 at 10:03 AM Michael Paquier wrote: > On Thu, Nov 28, 2019 at 07:07:22PM +0800, Pengzhou Tang wrote: > > Richard pointed out that he gets incorrect results with the patch I > > attached, there are bugs somewhere, > > I fixed them now and attached the newest version, please refer

Re: Parallel grouping sets

2019-11-30 Thread Michael Paquier
On Thu, Nov 28, 2019 at 07:07:22PM +0800, Pengzhou Tang wrote: > Richard pointed out that he gets incorrect results with the patch I > attached, there are bugs somewhere, > I fixed them now and attached the newest version, please refer to [1] for > the fix. Mr Robot is reporting that the latest pat

Re: Parallel grouping sets

2019-11-28 Thread Pengzhou Tang
e/parallel_groupingsets_3>_3 > > On Wed, Jul 31, 2019 at 4:07 PM Richard Guo wrote: > >> On Tue, Jul 30, 2019 at 11:05 PM Tomas Vondra < >> tomas.von...@2ndquadrant.com> wrote: >> >>> On Tue, Jul 30, 2019 at 03:50:32PM +0800, Richard Guo wrote: >>>

Re: Parallel grouping sets

2019-09-30 Thread Pengzhou Tang
Richard Guo wrote: >> >On Wed, Jun 12, 2019 at 10:58 AM Richard Guo wrote: >> > >> >> Hi all, >> >> >> >> Paul and I have been hacking recently to implement parallel grouping >> >> sets, and here we have two implementations. >

Re: Parallel grouping sets

2019-07-31 Thread Richard Guo
On Tue, Jul 30, 2019 at 11:05 PM Tomas Vondra wrote: > On Tue, Jul 30, 2019 at 03:50:32PM +0800, Richard Guo wrote: > >On Wed, Jun 12, 2019 at 10:58 AM Richard Guo wrote: > > > >> Hi all, > >> > >> Paul and I have been hacking recently to implement par

Re: Parallel grouping sets

2019-07-30 Thread Tomas Vondra
On Tue, Jul 30, 2019 at 03:50:32PM +0800, Richard Guo wrote: On Wed, Jun 12, 2019 at 10:58 AM Richard Guo wrote: Hi all, Paul and I have been hacking recently to implement parallel grouping sets, and here we have two implementations. Implementation 1 Attached is the patch

Re: Parallel grouping sets

2019-07-30 Thread Richard Guo
On Wed, Jun 12, 2019 at 10:58 AM Richard Guo wrote: > Hi all, > > Paul and I have been hacking recently to implement parallel grouping > sets, and here we have two implementations. > > Implementation 1 > > > Attached is the patch and also there is a

Re: Parallel grouping sets

2019-06-13 Thread Tomas Vondra
On Fri, Jun 14, 2019 at 12:02:52PM +1200, David Rowley wrote: On Fri, 14 Jun 2019 at 11:45, Tomas Vondra wrote: On Wed, Jun 12, 2019 at 10:58:44AM +0800, Richard Guo wrote: ># explain (costs off, verbose) select c1, c2, avg(c3) from t2 group by >grouping sets((c1,c2), (c1)); >

Re: Parallel grouping sets

2019-06-13 Thread David Rowley
On Fri, 14 Jun 2019 at 11:45, Tomas Vondra wrote: > > On Wed, Jun 12, 2019 at 10:58:44AM +0800, Richard Guo wrote: > ># explain (costs off, verbose) select c1, c2, avg(c3) from t2 group by > >grouping sets((c1,c2), (c1)); > > QUERY PLAN > >

Re: Parallel grouping sets

2019-06-13 Thread Tomas Vondra
On Wed, Jun 12, 2019 at 10:58:44AM +0800, Richard Guo wrote: Hi all, Paul and I have been hacking recently to implement parallel grouping sets, and here we have two implementations. Implementation 1 Attached is the patch and also there is a github branch [1] for this work

Re: Parallel grouping sets

2019-06-13 Thread Richard Guo
sult for each group. > > > > We are implementing parallel grouping sets in the same way. The only > > difference is that in the final stage, the leader performs a grouping > > sets aggregation, rather than a normal aggregation. > > Hi Richard, > > I think it was you and I
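
The scheme quoted in this message (workers produce partial results, the leader then performs a grouping sets aggregation in the final stage) can be sketched as follows. This is a toy model in Python under stated assumptions, not the patch itself: it assumes workers group by the union of all grouping columns and keep a (sum, count) state for avg(c3). The function names and sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical query shape: avg(c3) ... GROUP BY GROUPING SETS ((c1, c2), (c1))
KEYS = ("c1", "c2")
GROUPING_SETS = [("c1", "c2"), ("c1",)]


def partial_aggregate(rows):
    """Worker side (assumed): one (sum, count) state per distinct (c1, c2)."""
    states = defaultdict(lambda: [0, 0])
    for row in rows:
        state = states[tuple(row[k] for k in KEYS)]
        state[0] += row["c3"]
        state[1] += 1
    return {key: tuple(state) for key, state in states.items()}


def final_grouping_sets(partials):
    """Leader side: combine partial states once per grouping set, then finalize."""
    results = {}
    for gset in GROUPING_SETS:
        combined = defaultdict(lambda: [0, 0])
        for partial in partials:
            for key, (total, count) in partial.items():
                # Columns outside the grouping set become None (SQL NULL).
                projected = tuple(
                    value if name in gset else None
                    for name, value in zip(KEYS, key)
                )
                combined[projected][0] += total
                combined[projected][1] += count
        for projected, (total, count) in combined.items():
            results[(gset, projected)] = total / count
    return results


# Two simulated workers, each scanning a different slice of the table.
worker_1 = [{"c1": 1, "c2": 1, "c3": 10}, {"c1": 1, "c2": 2, "c3": 50}]
worker_2 = [{"c1": 1, "c2": 1, "c3": 30}, {"c1": 2, "c2": 1, "c3": 40}]

partials = [partial_aggregate(worker_1), partial_aggregate(worker_2)]
averages = final_grouping_sets(partials)
```

In this model the workers never see the grouping sets at all; every grouping set is evaluated only in the leader, over the gathered partial states.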

Re: Parallel grouping sets

2019-06-12 Thread David Rowley
which > that process is aware. Second, the partial results are transferred to > the leader via the Gather node. Finally, the leader merges the partial > results and produces the final result for each group. > > We are implementing parallel grouping sets in the same way. The only > d

Parallel grouping sets

2019-06-11 Thread Richard Guo
Hi all, Paul and I have been hacking recently to implement parallel grouping sets, and here we have two implementations. Implementation 1 Attached is the patch and also there is a github branch [1] for this work. Parallel aggregation has already been supported in PostgreSQL