Hi,

> * I think we should update truncate_useless_pathkeys() to account for
> the ordering requested by the query's set operation;

Nice catch.

> I'm thinking that maybe it'd be better to move the work of sorting the
> subquery's paths to the outer query level, specifically within the
> build_setop_child_paths() function, just before we stick SubqueryScanPath
> on top of the subquery's paths.  I think this is better because:
>
> 1. This minimizes the impact on subquery planning and reduces the
> footprint within the grouping_planner() function as much as possible.
>
> 2. This can help avoid the aforementioned add_path() issue because the
> two involved paths will be structured as:
>
>     cheapest_path -> subqueryscan
> and
>     cheapest_path -> sort -> subqueryscan
>
> If the two paths cost fuzzily the same and add_path() decides to keep
> the second one due to it having better pathkeys and pfree the first one,
> it would not be a problem.

This is a smart idea, it works because you create a two different
subqueryscan for the cheapest_input_path.

FWIW, I found we didn't create_sort_path during building a merge join
path, instead it just cost the sort and add it to the cost of mergejoin
path only and note this path needs a presorted data. At last during the
create_mergejoin_*plan*, it create the sort_plan really. As for the
mergeappend case, could we use the similar strategy? with this way, we
might simpliy the code to use MergeAppend node since the caller just
need to say I want to try MergeAppend with the given pathkeys without
really creating the sort by themselves. 

(Have a quick glance of initial_cost_mergejoin and
create_mergejoin_plan, looks incremental sort doesn't work with mergejoin?)

>
> To assist the discussion I've attached a diff file that includes all the
> changes above.

+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+       int                     n_common_pathkeys;
+
+       if (root->setop_pathkeys == NIL)
+               return 0;                               /* no special setop 
ordering requested */
+
+       if (pathkeys == NIL)
+               return 0;                               /* unordered path */
+
+       (void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+                                                                          
&n_common_pathkeys);
+
+       return n_common_pathkeys;
+}

The two if-clauses looks unnecessary, it should be handled by
pathkeys_count_contained_in already. The same issue exists in
pathkeys_useful_for_ordering as well. Attached patch fix it in master.

-- 
Best Regards
Andy Fan

>From f562dd8b80bd0b7c2d66ee9633434a1e5be82e04 Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" <yizhi....@alibaba-inc.com>
Date: Wed, 7 Feb 2024 07:00:33 +0800
Subject: [PATCH v1 1/1] Remove the unnecessary checks for pathkey routines

both pathkeys_count_contained_in and foreach can handle the NIL input,
so no need the extra checks.
---
 src/backend/optimizer/path/pathkeys.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 82ff31273b..98334e5c52 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -2126,12 +2126,6 @@ pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
 {
 	int			n_common_pathkeys;
 
-	if (root->query_pathkeys == NIL)
-		return 0;				/* no special ordering requested */
-
-	if (pathkeys == NIL)
-		return 0;				/* unordered path */
-
 	(void) pathkeys_count_contained_in(root->query_pathkeys, pathkeys,
 									   &n_common_pathkeys);
 
@@ -2167,10 +2161,6 @@ pathkeys_useful_for_grouping(PlannerInfo *root, List *pathkeys)
 	if (root->group_pathkeys == NIL)
 		return 0;
 
-	/* unordered path */
-	if (pathkeys == NIL)
-		return 0;
-
 	/* walk the pathkeys and search for matching group key */
 	foreach(key, pathkeys)
 	{
-- 
2.34.1

Reply via email to