On 2015-05-16 00:06:12 +0200, Andres Freund wrote:
Andrew (and I) have been working on this since. Here's the updated and
rebased patch.
It misses a decent commit message and another beautification
readthrough. I've spent the last hour going through the thing again and
all I hit was a
On 2015-05-14 09:16:10 +0100, Andrew Gierth wrote:
Andres A rough sketch of what I'm thinking of is:
I'm not sure I'd do it quite like that.
It was meant as a sketch, so there's lots of things it's probably
missing ;)
Rather, have a wrapper function get_outer_tuple that calls
ExecProcNode
On Thu, May 14, 2015 at 08:59:45AM +0200, Andres Freund wrote:
On 2015-05-14 02:51:42 -0400, Noah Misch wrote:
Covering hash aggregation might entail a large preparatory refactoring
of nodeHash.c, but beyond development cost I can't malign that.
You mean execGrouping.c? Afaics nodeHash.c
On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
I still believe that the general approach of chaining vs. a union or CTE
is correct due to the efficiency arguments upthread. My problem is
that, unless I very much misunderstand something, the current
implementation can end up
On Thu, May 14, 2015 at 08:38:07AM +0200, Andres Freund wrote:
On 2015-05-14 02:32:04 -0400, Noah Misch wrote:
On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
Andrew, is that a structure you could live with, or not?
Others, what do you think?
Andrew and I discussed
On 2015-05-14 02:51:42 -0400, Noah Misch wrote:
Covering hash aggregation might entail a large preparatory refactoring
of nodeHash.c, but beyond development cost I can't malign that.
You mean execGrouping.c? Afaics nodeHash.c isn't involved, and it
doesn't look very interesting to make it so?
On 2015-05-14 02:32:04 -0400, Noah Misch wrote:
On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
Andrew, is that a structure you could live with, or not?
Others, what do you think?
Andrew and I discussed that very structure upthread:
Andres == Andres Freund and...@anarazel.de writes:
Andres My problem is that, unless I very much misunderstand something,
Andres the current implementation can end up requiring roughly #sets *
Andres #input of additional space for the sidechannel tuplestore in
Andres some bad cases. That
On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
Another controversial item was the introduction of GroupedVar. The need
for this can be avoided by explicitly setting to NULL the relevant
columns of the representative group tuple when evaluating result rows,
but (a) I don't think that's
On 2015-05-13 22:51:15 +0200, Andres Freund wrote:
I'm pretty sure by now that I dislike the introduction of GroupedVar,
and not just tentatively. While I can see why you found its
introduction to be nicer than fiddling with the result tuple, for me the
disadvantages seem to outweigh the
Andres == Andres Freund and...@anarazel.de writes:
Andres Andrew, are you going to be working on any of these?
As discussed on IRC, current status is:
* The increased complexity of grouping_planner. It'd imo be good if some
of that could be refactored into a separate function.
On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
What I dislike so far:
* Minor formatting things. Just going to fix and push the ones I
dislike.
* The Hopcroft-Karp stuff not being separate
* The increased complexity of grouping_planner. It'd imo be good if some
of that could be
On 2015-05-12 20:40:49 +0200, Andres Freund wrote:
On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
What I dislike so far:
* Minor formatting things. Just going to fix and push the ones I
dislike.
* The Hopcroft-Karp stuff not being separate
* The increased complexity of
On 2015-04-30 05:35:26 +0100, Andrew Gierth wrote:
Andres == Andres Freund and...@anarazel.de writes:
Andres This is not a real review. I'm just scanning through the
Andres patch, without reading the thread, to understand if I see
Andres something worthy of controversy. While scanning I
I think the problem is just that for each variable, in each grouping
set - a very large number in a large cube - we're recursing through the
whole ChainAggregate tree, as each Var just points to a var one level
lower.
For small values of very large, that is. Had a little thinko there. Its still
On Thu, Apr 30, 2015 at 05:35:26AM +0100, Andrew Gierth wrote:
Andres == Andres Freund and...@anarazel.de writes:
+ * TODO: AGG_HASHED doesn't support multiple grouping sets yet.
Andres Are you intending to resolve this before an eventual commit?
Original plan was to tackle
Hi,
This is not a real review. I'm just scanning through the patch, without
reading the thread, to understand if I see something worthy of
controversy. While scanning I might have a couple observations or
questions.
On 2015-03-13 15:46:15 +, Andrew Gierth wrote:
+ * A list of
Andres == Andres Freund and...@anarazel.de writes:
Andres This is not a real review. I'm just scanning through the
Andres patch, without reading the thread, to understand if I see
Andres something worthy of controversy. While scanning I might have
Andres a couple observations or questions.
On Fri, Jan 02, 2015 at 03:55:23PM -0600, Jim Nasby wrote:
On 12/31/14, 3:05 PM, Noah Misch wrote:
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
Noah == Noah Mischn...@leadboat.com writes:
Noah Suppose one node orchestrated all sorting and aggregation.
Well, that has
On 12/31/14, 3:05 PM, Noah Misch wrote:
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
Noah == Noah Mischn...@leadboat.com writes:
Noah Suppose one node orchestrated all sorting and aggregation.
Well, that has the downside of making it into an opaque blob, without
On Tue, Dec 23, 2014 at 02:29:58AM -0500, Noah Misch wrote:
On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote:
I still find the ChainAggregate approach too ugly at a system structural
level to accept, regardless of Noah's argument about number of I/O cycles
consumed. We'll be paying
ChainAggregate is
a bit like a node having two parents, a Sort and a GroupAggregate.
However,
the graph edge between ChainAggregate and its GroupAggregate is a
tuplestore
instead of the usual, synchronous ExecProcNode().
Well, I dont buy the two parents theory. The Sort nodes are
Noah == Noah Misch n...@leadboat.com writes:
Noah Suppose one node orchestrated all sorting and aggregation.
Well, that has the downside of making it into an opaque blob, without
actually gaining much.
Noah Call it a MultiGroupAggregate for now. It wouldn't harness
Noah Sort nodes, because
On Wed, Dec 31, 2014 at 02:45:29PM +0530, Atri Sharma wrote:
Suppose one node orchestrated all sorting and aggregation. Call it a
MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it
performs aggregation between tuplesort_puttupleslot() calls. Instead, it
would
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
Noah == Noah Misch n...@leadboat.com writes:
Noah Suppose one node orchestrated all sorting and aggregation.
Well, that has the downside of making it into an opaque blob, without
actually gaining much.
The opaque-blob
On Mon, Dec 22, 2014 at 6:57 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
In the case of cube(a,b,c,d), our code currently gives:
b,d,a,c: (b,d,a,c),(b,d)
a,b,d:(a,b,d),(a,b)
d,a,c:(d,a,c),(d,a),(d)
c,d: (c,d),(c)
b,c,d:(b,c,d),(b,c),(b)
a,c,b:
Noah Misch n...@leadboat.com writes:
On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote:
Tom == Tom Lane t...@sss.pgh.pa.us writes:
Tom That seems pretty grotty from a performance+memory consumption
Tom standpoint. At peak memory usage, each one of the Sort nodes
Tom will contain
Tom == Tom Lane t...@sss.pgh.pa.us writes:
[Noah]
I caution against using window function performance as the
template for GROUPING SETS performance goals. The benefit of
GROUPING SETS compared to its UNION ALL functional equivalent is
15% syntactic pleasantness, 85% performance
On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
Tom The other reason that's a bad comparison is that I've not seen
Tom many queries that use more than a couple of window frames,
Tom whereas we have to expect that the number of grouping sets in
Tom typical
On Tuesday, December 23, 2014, Robert Haas robertmh...@gmail.com wrote:
On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth
and...@tao11.riddles.org.uk javascript:; wrote:
Tom The other reason that's a bad comparison is that I've not seen
Tom many queries that use more than a couple of window
Robert == Robert Haas robertmh...@gmail.com writes:
I would be interested in seeing more good examples of the size and
type of grouping sets used in typical queries.
Robert From what I have seen, there is interest in being able to do
Robert things like GROUP BY CUBE(a, b, c, d) and have
On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote:
I still find the ChainAggregate approach too ugly at a system structural
level to accept, regardless of Noah's argument about number of I/O cycles
consumed. We'll be paying for that in complexity and bugs into the
indefinite future,
On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote:
Tom == Tom Lane t...@sss.pgh.pa.us writes:
I'd already explained in more detail way back when we posted the
patch. But to reiterate: the ChainAggregate nodes pass through
their input data unchanged, but on group boundaries
On Thu, Dec 11, 2014 at 3:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
I don't really have any comments on the algorithms yet, having spent too
much time trying to figure out underdocumented data structures to get to
the algorithms. However, noting the addition of list_intersection_int()
made me
Michael == Michael Paquier michael.paqu...@gmail.com writes:
Michael Based on those comments, I am marking this patch as
Michael Returned with Feedback on the CF app for 2014-10. Andrew,
Michael feel free to move this entry to CF 2014-12 if you are
Michael planning to continue working on it
On Mon, Dec 15, 2014 at 12:28 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
Michael == Michael Paquier michael.paqu...@gmail.com writes:
Michael Based on those comments, I am marking this patch as
Michael Returned with Feedback on the CF app for 2014-10. Andrew,
Michael feel free to
Tom == Tom Lane t...@sss.pgh.pa.us writes:
With the high-priority questions out of the way, time to tackle the
rest:
Tom My single biggest complaint is about the introduction of struct
Tom GroupedVar. If we stick with that, we're going to have to teach
Tom an extremely large number of
Andrew Gierth and...@tao11.riddles.org.uk writes:
Tom == Tom Lane t...@sss.pgh.pa.us writes:
Tom That seems pretty messy, especially given your further comments
Tom that these plan nodes are interconnected and know about each
Tom other (though you failed to say exactly how).
I'd already
Tom == Tom Lane t...@sss.pgh.pa.us writes:
I'd already explained in more detail way back when we posted the
patch. But to reiterate: the ChainAggregate nodes pass through
their input data unchanged, but on group boundaries they write
aggregated result rows to a tuplestore shared by the
Andrew Gierth and...@tao11.riddles.org.uk writes:
Tom == Tom Lane t...@sss.pgh.pa.us writes:
Tom I've not spent any real effort looking at gsp2.patch yet, but it
Tom seems even worse off comment-wise: if there's any explanation in
Tom there at all of what a chained aggregate is, I didn't
Tom == Tom Lane t...@sss.pgh.pa.us writes:
What that code does is produce plans that look like this:
GroupAggregate
- Sort
- ChainAggregate
- Sort
- ChainAggregate
in much the same way that WindowAgg nodes are generated.
Tom That seems pretty messy, especially
Andrew Gierth and...@tao11.riddles.org.uk writes:
And here is that recut patch set.
I started looking over this patch, but eventually decided that it needs
more work to be committable than I'm prepared to put in right now.
My single biggest complaint is about the introduction of struct
On Sat, Sep 27, 2014 at 06:37:38AM +0100, Andrew Gierth wrote:
Andrew == Andrew Gierth and...@tao11.riddles.org.uk writes:
Andrew I was holding off on posting a recut patch with the latest
Andrew EXPLAIN formatting changes (which are basically cosmetic)
Andrew until it became clear
There's been a lot of discussion and I haven't followed it in detail.
Andrew, there were some open questions, but have you gotten enough
feedback so that you know what to do next? I'm trying to get this
commitfest to an end, and this is still in Needs Review state...
- Heikki
--
Sent via
Heikki == Heikki Linnakangas hlinnakan...@vmware.com writes:
Heikki There's been a lot of discussion and I haven't followed it in
Heikki detail. Andrew, there were some open questions, but have you
Heikki gotten enough feedback so that you know what to do next?
I was holding off on posting a
On Fri, Sep 19, 2014 at 4:45 AM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
GroupAggregate (cost=1122.39..1197.48 rows=9 width=8)
Group Key: two, four
Group Key: two
Group Key: ()
Grouping Sets: [
[two, four],
[two],
[]
+1 looks good to me.
Marti == Marti Raudsepp ma...@juffo.org writes:
(yaml format)
Grouping Sets:
- - two
- four
- - two
-
Marti Now this is weird.
You're telling me. Also, feeding it to an online yaml-to-json
converter gives the result as [[two,four],[two],null] which is
not quite the same
On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote:
Marti But is anyone actually using YAML output format, or was it
Marti implemented simply because we can?
Until someone decides to dike it out, I think we are obligated to make
it produce something resembling correct output.
I vote for
Andrew == Andrew Gierth and...@tao11.riddles.org.uk writes:
Andrew You're telling me. Also, feeding it to an online yaml-to-json
Andrew converter gives the result as [[two,four],[two],null]
Andrew which is not quite the same as the json version. An
Andrew alternative would be:
Oh, another
On 19/09/14 17:52, Andres Freund wrote:
On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote:
Marti But is anyone actually using YAML output format, or was it
Marti implemented simply because we can?
Until someone decides to dike it out, I think we are obligated to make
it produce something
On 09/19/2014 08:52 AM, Andres Freund wrote:
Until someone decides to dike it out, I think we are obligated to make
it produce something resembling correct output.
I vote for ripping it out. There really isn't any justification for it
and it broke more than once.
(a) I personally use it all
Josh == Josh Berkus j...@agliodbs.com writes:
Josh (b) If we're going to discuss ripping out YAML format, please
Josh let's do that as a *separate* patch and discussion,
+infinity
Grouping Sets:
- [two,four]
- [two]
- []
Would that be better? (It's not consistent with
Marti == Marti Raudsepp ma...@juffo.org writes:
Marti Since you were asking for feedback on the EXPLAIN output on
Marti IRC, I'd weigh in and say that having the groups on separate
Marti lines would be significantly more readable.
I revisited the explain output a bit and have come up with
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
gsp1.patch - phase 1 code patch (full syntax, limited functionality)
gsp2.patch - phase 2 code patch (adds full functionality using the
new chained aggregate mechanism)
I gave
On 09/17/2014 03:02 PM, Marti Raudsepp wrote:
So instead of:
GroupAggregate
Output: four, ten, hundred, count(*)
Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
onek.ten), (onek.four), ()
Perhaps print:
Grouping Sets: (onek.four, onek.ten, onek.hundred)
Tomas == Tomas Vondra t...@fuzzy.cz writes:
Tomas If we can get rid of the excessive ChainAggregate, that's
Tomas certainly enough for now.
I found an algorithm that should provably give the minimal number of sorts
(I was afraid that problem would turn out to be NP-hard, but not so - it's
On 6.9.2014 23:34, Andrew Gierth wrote:
Tomas == Tomas Vondra t...@fuzzy.cz writes:
Tomas I have significant doubts about the whole design,
Tomas though. Especially the decision not to use HashAggregate,
There is no decision not to use HashAggregate. There is simply no
support for
Tomas == Tomas Vondra t...@fuzzy.cz writes:
It's not one sort per grouping set, it's the minimal number of
sorts needed to express the result as a union of ROLLUP
clauses. The planner code will (I believe) always find the
smallest number of sorts needed.
Tomas You're probably right.
On 7.9.2014 15:11, Andrew Gierth wrote:
Tomas == Tomas Vondra t...@fuzzy.cz writes:
It's not one sort per grouping set, it's the minimal number of
sorts needed to express the result as a union of ROLLUP
clauses. The planner code will (I believe) always find the
smallest number of
Tomas == Tomas Vondra t...@fuzzy.cz writes:
As for computing it all twice, there's currently no attempt to
optimize multiple identical grouping sets into multiple
projections of a single grouping set result. CUBE(a,b,c,a) has
twice as many grouping sets as CUBE(a,b,c) does, even though
On 7.9.2014 18:52, Andrew Gierth wrote:
Tomas == Tomas Vondra t...@fuzzy.cz writes:
Tomas Maybe preventing this completely (i.e. raising an ERROR with
Tomas duplicate columns in CUBE/ROLLUP/... clauses) would be
Tomas appropriate. Does the standard says anything about this?
The spec
On 31.8.2014 22:52, Andrew Gierth wrote:
Recut patches:
gsp1.patch - phase 1 code patch (full syntax, limited functionality)
gsp2.patch - phase 2 code patch (adds full functionality using the
new chained aggregate mechanism)
gsp-doc.patch - docs
Tomas == Tomas Vondra t...@fuzzy.cz writes:
Tomas I have significant doubts about the whole design,
Tomas though. Especially the decision not to use HashAggregate,
There is no decision not to use HashAggregate. There is simply no
support for HashAggregate yet.
Having it be able to work with
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
Erik == Erik Rijkers e...@xs4all.nl writes:
They apply cleanly for me at 2bde297 whether with git apply or
patch, except for the contrib one (which you don't need unless you
want to run the contrib regression tests without applying the
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers e...@xs4all.nl wrote:
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
Erik == Erik Rijkers e...@xs4all.nl writes:
They apply cleanly for me at 2bde297 whether with git apply or
patch, except for the contrib one (which you don't need
On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers e...@xs4all.nl wrote:
I have found that the unrecognized node type error is caused by:
It's a warning, not an error, right?
shared_preload_libraries = pg_stat_statements
in postgresql.conf (as
On Sunday, August 31, 2014, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers e...@xs4all.nl
javascript:; wrote:
I have found that the unrecognized node type error is caused by:
It's a warning, not
On Mon, August 25, 2014 07:21, Andrew Gierth wrote:
Here is the new version of our grouping sets patch. This version
supersedes the previous post.
The patches did not apply anymore so I applied at 73eba19aebe0. There they
applied OK, and make make check was OK.
drop table if exists
Erik == Erik Rijkers e...@xs4all.nl writes:
Erik The patches did not apply anymore so I applied at 73eba19aebe0.
Erik There they applied OK, and make make check was OK.
I'll look and rebase if need be.
-- WARNING: unrecognized node type: 347
Can't reproduce this - are you sure it's not a
Andrew == Andrew Gierth and...@tao11.riddles.org.uk writes:
Erik == Erik Rijkers e...@xs4all.nl writes:
Erik The patches did not apply anymore so I applied at 73eba19aebe0.
Erik There they applied OK, and make make check was OK.
Andrew I'll look and rebase if need be.
They apply cleanly
On Tue, August 26, 2014 11:13, Andrew Gierth wrote:
Andrew == Andrew Gierth and...@tao11.riddles.org.uk writes:
Erik == Erik Rijkers e...@xs4all.nl writes:
Erik The patches did not apply anymore so I applied at 73eba19aebe0.
Erik There they applied OK, and make make check was OK.
Erik == Erik Rijkers e...@xs4all.nl writes:
They apply cleanly for me at 2bde297 whether with git apply or
patch, except for the contrib one (which you don't need unless you
want to run the contrib regression tests without applying the
gsp-u patch).
Erik Ah, I had not realised that.
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
Erik == Erik Rijkers e...@xs4all.nl writes:
They apply cleanly for me at 2bde297 whether with git apply or
patch, except for the contrib one (which you don't need unless you
want to run the contrib regression tests without applying the
Hi
I checked this patch, and it working very well
I found only two issue - I am not sure if it is issue
with data from https://wiki.postgresql.org/wiki/Grouping_Sets
postgres=# select name, place, sum(count), grouping(name), grouping(place)
from cars group by rollup(name, place);
name |
Pavel == Pavel Stehule pavel.steh...@gmail.com writes:
Pavel Hi
Pavel I checked this patch, and it working very well
Pavel I found only two issue - I am not sure if it is issue
Pavel It duplicate rows
Pavel postgres=# explain select name, place, sum(count), grouping(name),
Pavel
2014-08-26 2:45 GMT+02:00 Andrew Gierth and...@tao11.riddles.org.uk:
Pavel == Pavel Stehule pavel.steh...@gmail.com writes:
Pavel Hi
Pavel I checked this patch, and it working very well
Pavel I found only two issue - I am not sure if it is issue
Pavel It duplicate rows
Pavel
76 matches
Mail list logo