Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Masahiko Sawada Mon, 02 Jul 2018 05:02:53 -0700

On Mon, Jul 2, 2018 at 5:25 PM, Daniel Gustafsson <dan...@yesql.se> wrote:
>> On 26 Jun 2018, at 17:11, Alexander Kuzmenkov <a.kuzmen...@postgrespro.ru> 
>> wrote:
>
>> I took a look at the patch. It applies and compiles, the tests pass.
>
> Thanks for reviewing, and apologies for the slow response.
>
>> Some thoughts about the code:
>>
>> * Postgres lists cache their lengths, so you don't need uniqueLen.
>
> Good point, fixed.
>
>> * Use an array of WindowClauseSortNode's instead of a list. It's more 
>> suitable here because you are going to sort (qsort_list creates a temporary 
>> array).
>
> I was thinking about that, but opted for code simplicity with a List approach.
> The required size of the array isn’t known ahead of time, so it must either
> potentially overallocate to the upper bound of root->parse->windowClause or 
> use
> heuristics and risk reallocating when growing, neither of which is terribly
> appealing.  Do you have any suggestions or preferences?
>
>> * Reversing the sorted list looks more confusing to me than just sorting it 
>> in the proper order right away. A qsort() comparator returning negative 
>> means "left goes before right", but here it is effectively the other way 
>> around.
>
> Changed.
>
>> * There is a function named make_pathkeys_for_window that makes a list of 
>> canonical pathkeys given a window clause. Using this function to give you 
>> the pathkeys, and then comparing them, would be more future-proof in case we 
>> ever start using hashing for windowing. Moreover, the canonical pathkeys can 
>> be compared with pointer comparison which is much faster than equal(). 
>> Still, I'm not sure whether it's going to be convenient to use in this case, 
>> or even whether it is a right thing to do. What do you think?
>
> That’s an interesting thought, one that didn’t occur to me while hacking.  I’m
> not sure whether is would be wise/clean to overload with this functionality
> though.
>
> Attached updated version also adds a testcase that was clearly missing from 
> the
> previous version and an updated window.out.
>
> cheers ./daniel
>


Thank you for updating the patch! There are two review comments.

The current select_active_windows() function compares the all fields
of WindowClause for the sorting but with this patch we compare only
tleSortGroupRef, sortop and the number of uniqueOrder. I think this
leads a degradation as follows.

=# explain select *, sum(b) over w1, sum(a) over w2, sum(b) over w3
from w window w1 as (partition by a order by a nulls first), w2 as
(partition by a order by a), w3 as (partition by a order by a nulls
first);

* Current HEAD
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 WindowAgg  (cost=369.16..414.36 rows=2260 width=32)
   ->  Sort  (cost=369.16..374.81 rows=2260 width=24)
         Sort Key: a
         ->  WindowAgg  (cost=158.51..243.26 rows=2260 width=24)
               ->  WindowAgg  (cost=158.51..203.71 rows=2260 width=16)
                     ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                           Sort Key: a NULLS FIRST
                           ->  Seq Scan on w  (cost=0.00..32.60
rows=2260 width=8)
(8 rows)

* With patch
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 WindowAgg 3  (cost=500.72..545.92 rows=2260 width=32)
   ->  Sort  (cost=500.72..506.37 rows=2260 width=24)
         Sort Key: a NULLS FIRST
         ->  WindowAgg 2  (cost=329.61..374.81 rows=2260 width=24)
               ->  Sort  (cost=329.61..335.26 rows=2260 width=16)
                     Sort Key: a
                     ->  WindowAgg 1  (cost=158.51..203.71 rows=2260 width=16)
                           ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                                 Sort Key: a NULLS FIRST
                                 ->  Seq Scan on w  (cost=0.00..32.60
rows=2260 width=8)
(10 rows)

---
+                        * Generating the uniqueOrder can be offloaded
to the comparison
+                        * function to optimize for the case where we
only have a single
+                        * window.  For now, optimize for readibility.

s/readibility/readability/

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

Reply via email to