Re: make add_paths_to_append_rel aware of startup cost

2024-02-15 Thread David Rowley
On Fri, 16 Feb 2024 at 01:09, David Rowley  wrote:
>
> On Thu, 15 Feb 2024 at 21:42, Andy Fan  wrote:
> > I found the both plans have the same cost, I can't get the accurate
> > cause of this after some hours research, but it is pretty similar with
> > 7516056c584e3, so I uses a similar strategy to stable it. is it
> > acceptable?
>
> It's pretty hard to say.  I can only guess why this test would be
> flapping like this. I see it's happened before on mylodon, so probably
> not a cosmic ray.  It's not like add_path() chooses a random path when
> the costs are the same, so I wondered if something similar is going on
> here that was going on that led to f03a9ca4. In particular, see [1].

While it's not conclusive proof, the following demonstrates that
relpages dropping by just 1 page causes the join order to change.

regression=# explain
regression-# select t1.unique1 from tenk1 t1
regression-# inner join tenk2 t2 on t1.tenthous = t2.tenthous
regression-# union all
regression-# (values(1)) limit 1;
  QUERY PLAN
--
 Limit  (cost=0.00..150.08 rows=1 width=4)
   ->  Append  (cost=0.00..1500965.01 rows=10001 width=4)
 ->  Nested Loop  (cost=0.00..1500915.00 rows=1 width=4)
   Join Filter: (t1.tenthous = t2.tenthous)
   ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=1 width=8)
   ->  Materialize  (cost=0.00..495.00 rows=1 width=4)
 ->  Seq Scan on tenk2 t2  (cost=0.00..445.00
rows=1 width=4)
 ->  Result  (cost=0.00..0.01 rows=1 width=4)

regression=# update pg_class set relpages=relpages - 1 where relname = 'tenk2';
UPDATE 1
regression=# explain
regression-# select t1.unique1 from tenk1 t1
regression-# inner join tenk2 t2 on t1.tenthous = t2.tenthous
regression-# union all
regression-# (values(1)) limit 1;
  QUERY PLAN
--
 Limit  (cost=0.00..150.52 rows=1 width=4)
   ->  Append  (cost=0.00..1505315.30 rows=10001 width=4)
 ->  Nested Loop  (cost=0.00..1505265.29 rows=1 width=4)
   Join Filter: (t1.tenthous = t2.tenthous)
   ->  Seq Scan on tenk2 t2  (cost=0.00..445.29 rows=10029 width=4)
   ->  Materialize  (cost=0.00..495.00 rows=1 width=8)
 ->  Seq Scan on tenk1 t1  (cost=0.00..445.00
rows=1 width=8)
 ->  Result  (cost=0.00..0.01 rows=1 width=4)

I tried this with the proposed changes to the test and the plan did not change.

I've pushed the change now.

David




Re: make add_paths_to_append_rel aware of startup cost

2024-02-15 Thread Andy Fan


David Rowley  writes:

> I'd be more happy using this one as percentage-wise, the cost
> difference is much larger.

+1 for the percentage-wise.
>
> I checked that the t2.thounsand = 0 query still tests the cheap
> startup paths in add_paths_to_append_rel().

I get the same conclusion here. 
>
> So, if nobody has any better ideas, I'm just going to push the " and
> t2.thousand = 0" adjustment.

LGTM.

-- 
Best Regards
Andy Fan





Re: make add_paths_to_append_rel aware of startup cost

2024-02-15 Thread David Rowley
On Thu, 15 Feb 2024 at 21:42, Andy Fan  wrote:
> I found the both plans have the same cost, I can't get the accurate
> cause of this after some hours research, but it is pretty similar with
> 7516056c584e3, so I uses a similar strategy to stable it. is it
> acceptable?

It's pretty hard to say.  I can only guess why this test would be
flapping like this. I see it's happened before on mylodon, so probably
not a cosmic ray.  It's not like add_path() chooses a random path when
the costs are the same, so I wondered if something similar is going on
here that was going on that led to f03a9ca4. In particular, see [1].

On master, I set a breakpoint in try_nestloop_path() to break on
"outerrel->relid==1 && innerrel->relid==2". I see the total Nested
Loop cost comes out the same with the join order reversed.

Which is:

 ->  Nested Loop  (cost=0.00..1500915.00 rows=1 width=4)

Doing the same with your patch applied, I get:

->  Nested Loop  (cost=0.00..600925.00 rows=4000 width=4)

and forcing the join order to swap with the debugger, I see:

->  Nested Loop  (cost=0.00..600940.00 rows=4000 width=4)

So there's a difference now, but it's quite small. If it was a problem
like we had on [1], then since tenk1 and tenk2 have 345 pages (on my
machine), if relpages is down 1 or 2 pages, we'll likely get more of a
costing difference than 600925 vs 600940.

If I use:

explain
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
union all
(values(1)) limit 1;

I get:

->  Nested Loop  (cost=0.00..2415.03 rows=10 width=4)

and with the join order reversed, I get:

 ->  Nested Loop  (cost=0.00..2440.00 rows=10 width=4)

I'd be more happy using this one as percentage-wise, the cost
difference is much larger.  I don't quite have the will to go through
proving what the actual problem is here. I think [1] already proved
the relpages problem can (or could) happen.

I checked that the t2.thounsand = 0 query still tests the cheap
startup paths in add_paths_to_append_rel() and it does. If I flip
startup_subpaths_valid to false in the debugger, the plan flips to:

QUERY PLAN
---
 Limit  (cost=470.12..514.00 rows=1 width=4)
   ->  Append  (cost=470.12..952.79 rows=11 width=4)
 ->  Hash Join  (cost=470.12..952.73 rows=10 width=4)
   Hash Cond: (t1.tenthous = t2.tenthous)
   ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=1 width=8)
   ->  Hash  (cost=470.00..470.00 rows=10 width=4)
 ->  Seq Scan on tenk2 t2  (cost=0.00..470.00
rows=10 width=4)
   Filter: (thousand = 0)
 ->  Result  (cost=0.00..0.01 rows=1 width=4)

So, if nobody has any better ideas, I'm just going to push the " and
t2.thousand = 0" adjustment.

David

[1] https://www.postgresql.org/message-id/4174.1563239552%40sss.pgh.pa.us




Re: make add_paths_to_append_rel aware of startup cost

2024-02-15 Thread Andy Fan

Andy Fan  writes:

> Thomas Munro  writes:
>
>> On Thu, Oct 5, 2023 at 9:07 PM David Rowley  wrote:
>>> Thanks. Pushed.
>>
>> FYI somehow this plan from a8a968a8212e flipped in this run:
>>
>> === dumping 
>> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
>> ===
>> diff -U3 
>> /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
>> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
>> --- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
>> 2024-01-15 00:31:13.947555940 +
>> +++ 
>> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
>> 2024-02-14 00:06:17.075584839 +
>> @@ -1447,9 +1447,9 @@
>> ->  Append
>>   ->  Nested Loop
>> Join Filter: (t1.tenthous = t2.tenthous)
>> -   ->  Seq Scan on tenk1 t1
>> +   ->  Seq Scan on tenk2 t2
>> ->  Materialize
>> - ->  Seq Scan on tenk2 t2
>> + ->  Seq Scan on tenk1 t1
>>   ->  Result
>>  (8 rows)
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon=2024-02-14%2000%3A01%3A03
>
> Thanks for this information! I will take a look at this.

I found the both plans have the same cost, I can't get the accurate
cause of this after some hours research, but it is pretty similar with
7516056c584e3, so I uses a similar strategy to stable it. is it
acceptable? 

-- 
Best Regards
Andy Fan

>From 01c2b5ae76a4493f33b894afdd4a8d3ce31e1a3e Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" 
Date: Thu, 15 Feb 2024 16:29:30 +0800
Subject: [PATCH v1 1/1] Attempt to stabilize new unionall + limit regression
 test

This test was recently added in a8a968a8212e, and some times It appears
to be unstable in regards to the join order presumably due to the
relations at either side of the join being equal in side.  Here we add a
qual to make one of them smaller so the planner is more likely to choose
to material the smaller of the two.

Reported-by: Thomas Munro
Author: Andy Fan

Discussion: https://postgr.es/m/CA%2BhUKGLqC-NobKYfjxNM3Gexv9OJ-Fhvy9bugUcXsZjTqH7W%3DQ%40mail.gmail.com#88e6420b59a863403b9b67a0128fdacc
---
 src/test/regress/expected/union.out | 11 ++-
 src/test/regress/sql/union.sql  |  4 ++--
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 73e320bad4..0e527b65b3 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1438,8 +1438,8 @@ where (x = 0) or (q1 >= q2 and q1 <= q2);
 -- Ensure we get a Nested Loop join between tenk1 and tenk2
 explain (costs off)
 select t1.unique1 from tenk1 t1
-inner join tenk2 t2 on t1.tenthous = t2.tenthous
-   union all
+inner join tenk2 t2 on t1.tenthous = t2.tenthous and t1.ten > 5
+union all
 (values(1)) limit 1;
QUERY PLAN   
 
@@ -1447,11 +1447,12 @@ inner join tenk2 t2 on t1.tenthous = t2.tenthous
->  Append
  ->  Nested Loop
Join Filter: (t1.tenthous = t2.tenthous)
-   ->  Seq Scan on tenk1 t1
+   ->  Seq Scan on tenk2 t2
->  Materialize
- ->  Seq Scan on tenk2 t2
+ ->  Seq Scan on tenk1 t1
+   Filter: (ten > 5)
  ->  Result
-(8 rows)
+(9 rows)
 
 -- Ensure there is no problem if cheapest_startup_path is NULL
 explain (costs off)
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 6c509ac80c..13ba9f21ad 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -548,8 +548,8 @@ where (x = 0) or (q1 >= q2 and q1 <= q2);
 -- Ensure we get a Nested Loop join between tenk1 and tenk2
 explain (costs off)
 select t1.unique1 from tenk1 t1
-inner join tenk2 t2 on t1.tenthous = t2.tenthous
-   union all
+inner join tenk2 t2 on t1.tenthous = t2.tenthous and t1.ten > 5
+union all
 (values(1)) limit 1;
 
 -- Ensure there is no problem if cheapest_startup_path is NULL
-- 
2.34.1



Re: make add_paths_to_append_rel aware of startup cost

2024-02-14 Thread Andy Fan


Thomas Munro  writes:

> On Thu, Oct 5, 2023 at 9:07 PM David Rowley  wrote:
>> Thanks. Pushed.
>
> FYI somehow this plan from a8a968a8212e flipped in this run:
>
> === dumping 
> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
> ===
> diff -U3 
> /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
> --- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
> 2024-01-15 00:31:13.947555940 +
> +++ 
> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
> 2024-02-14 00:06:17.075584839 +
> @@ -1447,9 +1447,9 @@
> ->  Append
>   ->  Nested Loop
> Join Filter: (t1.tenthous = t2.tenthous)
> -   ->  Seq Scan on tenk1 t1
> +   ->  Seq Scan on tenk2 t2
> ->  Materialize
> - ->  Seq Scan on tenk2 t2
> + ->  Seq Scan on tenk1 t1
>   ->  Result
>  (8 rows)
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon=2024-02-14%2000%3A01%3A03

Thanks for this information! I will take a look at this.

-- 
Best Regards
Andy Fan





Re: make add_paths_to_append_rel aware of startup cost

2024-02-13 Thread Thomas Munro
On Thu, Oct 5, 2023 at 9:07 PM David Rowley  wrote:
> Thanks. Pushed.

FYI somehow this plan from a8a968a8212e flipped in this run:

=== dumping 
/home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
===
diff -U3 
/home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
/home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
--- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
2024-01-15 00:31:13.947555940 +
+++ 
/home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
2024-02-14 00:06:17.075584839 +
@@ -1447,9 +1447,9 @@
->  Append
  ->  Nested Loop
Join Filter: (t1.tenthous = t2.tenthous)
-   ->  Seq Scan on tenk1 t1
+   ->  Seq Scan on tenk2 t2
->  Materialize
- ->  Seq Scan on tenk2 t2
+ ->  Seq Scan on tenk1 t1
  ->  Result
 (8 rows)

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon=2024-02-14%2000%3A01%3A03




Re: make add_paths_to_append_rel aware of startup cost

2023-10-05 Thread David Rowley
On Thu, 5 Oct 2023 at 14:11, Andy Fan  wrote:
> Patch LGTM.

Thanks. Pushed.

David




Re: make add_paths_to_append_rel aware of startup cost

2023-10-04 Thread Andy Fan
On Wed, Oct 4, 2023 at 8:41 AM David Rowley  wrote:

> On Sun, 1 Oct 2023 at 21:26, Andy Fan  wrote:
> >> But overall, I'm more inclined to just go with the more simple "add a
> >> cheap unordered startup append path if considering cheap startup
> >> plans" version. I see your latest patch does both. So, I'd suggest two
> >> patches as I do see the merit in keeping this simple and cheap.  If we
> >> can get the first part in and you still find cases where you're not
> >> getting the most appropriate startup plan based on the tuple fraction,
> >> then we can reconsider what extra complexity we should endure in the
> >> code based on the example query where we've demonstrated the planner
> >> is not choosing the best startup path appropriate to the given tuple
> >> fraction.
> >
> > I think this is a fair point,  I agree that your first part is good
> enough to be
> > committed first.   Actually I tried a lot to make a test case which can
> prove
> > the value of cheapest fractional cost but no gain so far:(
>
> I've attached a patch with the same code as the previous patch but
> this time including a regression test.
>
> I see no reason to not commit this so if anyone feels differently
> please let me know.
>
> David
>

Patch LGTM.

-- 
Best Regards
Andy Fan


Re: make add_paths_to_append_rel aware of startup cost

2023-10-03 Thread David Rowley
On Sun, 1 Oct 2023 at 21:26, Andy Fan  wrote:
>> But overall, I'm more inclined to just go with the more simple "add a
>> cheap unordered startup append path if considering cheap startup
>> plans" version. I see your latest patch does both. So, I'd suggest two
>> patches as I do see the merit in keeping this simple and cheap.  If we
>> can get the first part in and you still find cases where you're not
>> getting the most appropriate startup plan based on the tuple fraction,
>> then we can reconsider what extra complexity we should endure in the
>> code based on the example query where we've demonstrated the planner
>> is not choosing the best startup path appropriate to the given tuple
>> fraction.
>
> I think this is a fair point,  I agree that your first part is good enough to 
> be
> committed first.   Actually I tried a lot to make a test case which can  prove
> the value of cheapest fractional cost but no gain so far:(

I've attached a patch with the same code as the previous patch but
this time including a regression test.

I see no reason to not commit this so if anyone feels differently
please let me know.

David


consider_cheapest_startup_appendpath_v2.patch
Description: Binary data


Re: make add_paths_to_append_rel aware of startup cost

2023-10-01 Thread Andy Fan
 Hi David,

But overall, I'm more inclined to just go with the more simple "add a
> cheap unordered startup append path if considering cheap startup
> plans" version. I see your latest patch does both. So, I'd suggest two
> patches as I do see the merit in keeping this simple and cheap.  If we
> can get the first part in and you still find cases where you're not
> getting the most appropriate startup plan based on the tuple fraction,
> then we can reconsider what extra complexity we should endure in the
> code based on the example query where we've demonstrated the planner
> is not choosing the best startup path appropriate to the given tuple
> fraction.
>

I think this is a fair point,  I agree that your first part is good enough
to be
committed first.   Actually I tried a lot to make a test case which can
prove
the value of cheapest fractional cost but no gain so far:(

-- 
Best Regards
Andy Fan


Re: make add_paths_to_append_rel aware of startup cost

2023-09-27 Thread David Rowley
On Mon, 18 Sept 2023 at 22:55, Andy Fan  wrote:
> Here is an updated version to show what's in my mind.

My current thoughts on this are that the fractional cost part adds
quite a bit more complexity than the minimalistic approach of just
also considering the cheapest startup path.

There's also quite a bit I don't like about the extra code you've added:

1. RelOptInfo.tuple_fraction is not given a default value in locations
where we do makeNode(RelOptInfo);

2. This is very poorly documented and badly named. Also seems to have
a typo "stopper"

+ /* Like order by, group by, distinct and so. */
+ bool has_stoper_op;

With that, nobody has a hope of knowing if some new operation should
set that value to true or false.

I think it needs to define the meaning, which I think (very roughly)
is "does the query require any additional upper-planner operations
which could require having to read more tuples from the final join
relation than the number of tuples which are read from the final upper
rel."

3. get_fractional_path_cost() goes to the trouble of setting
total_rows then does not use it.

4. I don't see why it's ok to take the total_rows from the first Path
in the list in get_cheapest_fractional_path_ext(). What if another
Path has some other value?

But overall, I'm more inclined to just go with the more simple "add a
cheap unordered startup append path if considering cheap startup
plans" version. I see your latest patch does both. So, I'd suggest two
patches as I do see the merit in keeping this simple and cheap.  If we
can get the first part in and you still find cases where you're not
getting the most appropriate startup plan based on the tuple fraction,
then we can reconsider what extra complexity we should endure in the
code based on the example query where we've demonstrated the planner
is not choosing the best startup path appropriate to the given tuple
fraction.

David




Re: make add_paths_to_append_rel aware of startup cost

2023-09-18 Thread Andy Fan
Hi,


> What do you think about this in my last reply?  "If my above
> analysis is correct,  I think the best way to  handle this is if there
> is no more tables to join, we use cheapest  fraction cost for all
> the kinds of relations,  including plain relation,  append rel,
> subquery and so on, If we have more tables to join,  we use
> cheapest startup cost.".  This is what is in my mind now.
>
>
Here is an updated version to show what's in my mind.

-- 
Best Regards
Andy Fan


v3-0001-make-add_paths_to_append_rel-aware-of-startup_cos.patch
Description: Binary data


Re: make add_paths_to_append_rel aware of startup cost

2023-09-18 Thread Andy Fan
On Mon, Sep 18, 2023 at 11:58 AM David Rowley  wrote:

> On Mon, 18 Sept 2023 at 01:42, Andy Fan  wrote:
> > On Fri, Sep 15, 2023 at 3:15 PM David Rowley 
> wrote:
> >> Instead of doing that, why don't you just create a completely new
> >> AppendPath containing all the cheapest_startup_paths and add that to
> >> the append rel. You can optimise this and only do it when
> >> rel->consider_startup is true.
> >>
> >> Does the attached do anything less than what your patch does?
> >
> >
> > We can work like this,  but  there is one difference from what
> > my current patch does. It is cheapest_startup_path vs cheapest
> > fraction path.  For example if we have the following 3 paths with
> > all of the estimated rows is 100 and the tuple_fraction is 10.
>
> Yeah, it's true that the version I wrote didn't consider the
> fractional part, but I didn't really see it as worse than what you
> did. It looks like you're assuming that every append child will have

the same number of tuples read from it, but it seems to me that it
> would only be valid to use the fractional part for the first child.

The path chosen for subsequent child paths would, if done correctly,
> need to account for the estimated rows from the previous child paths.
>

Actually this is  consistent with what generate_union_paths does now.

 /*
* If plain UNION, tell children to fetch all tuples.
*
* Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
* each arm of the UNION ALL.  One could make a case for reducing the
* tuple fraction for later arms (discounting by the expected size of the
* earlier arms' results) but it seems not worth the trouble. The normal
* case where tuple_fraction isn't already zero is a LIMIT at top level,
* and passing it down as-is is usually enough to get the desired result
* of preferring fast-start plans.
*/
if (!op->all)
root->tuple_fraction = 0.0;

UNION ALL is pretty like append rel.


It's not valid here to copy the code in generate_orderedappend_paths()
> as MergeAppend won't necessarily exhaust the first child subpath first
> like Append will.
>

Not sure which code you are referring to,  but the code I refer to
much is generate_union_paths and set_subquery_pathlist.


> > Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost.
> > Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
> > Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost
> >
> > So with the patch you propose,  Path 1 & Path 3 are chosen to build
> > append path.  but with my current patch,  Only path 3 is kept.  IIUC,
> > path 3 should be the best one in this case.
>
> I assume you mean mine would build AppendPaths for 1+2, not 1+3.


Yes, it should be 1+2.

>
>
You mentioned:
>
> > I just find it is hard to build the case for Identifier 2/3/4.
>
> I wonder if this is because generate_orderedappend_paths() handles
> startup paths and most cases will that need a cheap startup plan will
> require some sort of pathkeys.
>

Probably yes.


> The example you mentioned of:
>
> (select * from tenk1 order by hundred)
>  UNION ALL
>  (select * from tenk1 order by hundred)
>  limit 3;
>
> I don't find this to be a compellingly real-world case.  The planner
> is under no obligation to output rows from the 1st branch of the UNION
> ALL before the 2nd one. If the user cared about that then they'd have
> instead added a top-level ORDER BY, in which case the planner seems
> happy to use the index scan:
>

Sorry about the test case,  here is the one with more compelling
real-world.

with s1 as (select * from tenk1 join tenk2 using (hundred)),
s2 as (select * from tenk1 join tenk2 using (hundred))
select * from s1
union all
select * from s2
limit 3;

It would be good if you could provide a bit more detail on the cases
> you want to improve here.  For example, if your #4 case, you have
> "WHERE xxx". I don't know if "xxx" is just a base qual or if there's a
> correlation to the outer query in there.
>

for the #4, the quickest test case is

select * from tenk1 where exists
(
with s1 as (select * from tenk1 join tenk2 using (hundred)),
s2 as (select * from tenk1 join tenk2 using (hundred))
select * from s1
union all
select * from s2
where random() > 0.4);

random() is used to make it can't be pull-up.   and exists implies
LIMIT 1;


Another concern I have with your patch is that it seems to be under
> the impression that there being further joins to evaluate at this
> query level is the only reason that we would have to pull more than
> the tuple fraction number of rows from the query. What gives you the
> confidence that's the only reason we may want to pull more than the
> tuple fraction of tuples from the append child?
>

I think you are talking about something like ORDER BY, GROUP BY
clause, I  do overlook it.  but if we admit cheapest fractional cost
is a right factor to consider,  this issue is not unresolvable since parse
is at hand.

At last, you ignore the part of 

Re: make add_paths_to_append_rel aware of startup cost

2023-09-17 Thread David Rowley
On Mon, 18 Sept 2023 at 01:42, Andy Fan  wrote:
> On Fri, Sep 15, 2023 at 3:15 PM David Rowley  wrote:
>> Instead of doing that, why don't you just create a completely new
>> AppendPath containing all the cheapest_startup_paths and add that to
>> the append rel. You can optimise this and only do it when
>> rel->consider_startup is true.
>>
>> Does the attached do anything less than what your patch does?
>
>
> We can work like this,  but  there is one difference from what
> my current patch does. It is cheapest_startup_path vs cheapest
> fraction path.  For example if we have the following 3 paths with
> all of the estimated rows is 100 and the tuple_fraction is 10.

Yeah, it's true that the version I wrote didn't consider the
fractional part, but I didn't really see it as worse than what you
did.  It looks like you're assuming that every append child will have
the same number of tuples read from it, but it seems to me that it
would only be valid to use the fractional part for the first child.
The path chosen for subsequent child paths would, if done correctly,
need to account for the estimated rows from the previous child paths.
It's not valid here to copy the code in generate_orderedappend_paths()
as MergeAppend won't necessarily exhaust the first child subpath first
like Append will.

> Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost.
> Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
> Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost
>
> So with the patch you propose,  Path 1 & Path 3 are chosen to build
> append path.  but with my current patch,  Only path 3 is kept.  IIUC,
> path 3 should be the best one in this case.

I assume you mean mine would build AppendPaths for 1+2, not 1+3.

You mentioned:

> I just find it is hard to build the case for Identifier 2/3/4.

I wonder if this is because generate_orderedappend_paths() handles
startup paths and most cases will that need a cheap startup plan will
require some sort of pathkeys.

The example you mentioned of:

(select * from tenk1 order by hundred)
 UNION ALL
 (select * from tenk1 order by hundred)
 limit 3;

I don't find this to be a compellingly real-world case.  The planner
is under no obligation to output rows from the 1st branch of the UNION
ALL before the 2nd one. If the user cared about that then they'd have
instead added a top-level ORDER BY, in which case the planner seems
happy to use the index scan:

regression=# explain (costs off) (select * from tenk1) UNION ALL
(select * from tenk1) order by hundred limit 3;
 QUERY PLAN
-
 Limit
   ->  Merge Append
 Sort Key: tenk1.hundred
 ->  Index Scan using tenk1_hundred on tenk1
 ->  Index Scan using tenk1_hundred on tenk1 tenk1_1

It would be good if you could provide a bit more detail on the cases
you want to improve here.  For example, if your #4 case, you have
"WHERE xxx". I don't know if "xxx" is just a base qual or if there's a
correlation to the outer query in there.

Another concern I have with your patch is that it seems to be under
the impression that there being further joins to evaluate at this
query level is the only reason that we would have to pull more than
the tuple fraction number of rows from the query. What gives you the
confidence that's the only reason we may want to pull more than the
tuple fraction of tuples from the append child?

David




Re: make add_paths_to_append_rel aware of startup cost

2023-09-17 Thread Andy Fan
Hi David,
  Thanks for taking a look at this!

On Fri, Sep 15, 2023 at 3:15 PM David Rowley  wrote:

> On Thu, 7 Sept 2023 at 04:37, Andy Fan  wrote:
> > Currently add_paths_to_append_rel overlooked the startup cost for
> creating
> > append path, so it may have lost some optimization chances.  After a
> glance,
> > the following 4 identifiers can be impacted.
>
> > - We shouldn't do the optimization if there are still more tables to
> join,
> >   the reason is similar to has_multiple_baserels(root) in
> >   set_subquery_pathlist. But the trouble here is we may inherit multiple
> >   levels to build an appendrel, so I have to keep the 'top_relids' all
> the
> >   time and compare it with PlannerInfo.all_baserels. If they are the
> same,
> >   then it is the case we want to optimize.
>
> I think you've likely gone to the trouble of trying to determine if
> there are any joins pending because you're considering using a cheap
> startup path *instead* of the cheapest total path and you don't want
> to do that when some join will cause all the rows to be read thus
> making the plan more expensive if a cheap startup path was picked.
>

Yes, that's true.  However it is not something we can't resolve, one
of the solutions is just like what I did in the patch.  but currently the
main stuff which confuses me is if it is the right thing to give up the
optimization if it has more tables to join (just like set_subquery_pathlist
did).


> Instead of doing that, why don't you just create a completely new
> AppendPath containing all the cheapest_startup_paths and add that to
> the append rel. You can optimise this and only do it when
> rel->consider_startup is true.
>
> Does the attached do anything less than what your patch does?
>

We can work like this,  but  there is one difference from what
my current patch does. It is cheapest_startup_path vs cheapest
fraction path.  For example if we have the following 3 paths with
all of the estimated rows is 100 and the tuple_fraction is 10.

Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost.
Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost

So with the patch you propose,  Path 1 & Path 3 are chosen to build
append path.  but with my current patch,  Only path 3 is kept.  IIUC,
path 3 should be the best one in this case.

We might also argue why Path 3 is kept in the first place (the children
level), I think pathkey might be one option.  and even path 3 is
discarded somehow, I think only if it is the best one, we should
keep it ideally.

Another tiny factor of this is your propose isn't consistent with
what set_subquery_pathlist which uses cheapest fractional cost
and my proposal isn't consistent plain rel which uses cheapest
startup cost.  We can't say which one is better, though.

If my above analysis is correct,  I think the best way to handle this
is if there is no more tables to join, we use cheapest fraction cost
for all the kinds of relations, including plain relation, append rel,
subquery and so on.  If we have more tables to join,  we use
cheapest startup cost.  On the implementation side, I want to use
RelOptInfo.tuple_fraction instead of RelOptInfo.consider_startup.
tuple_fraction = -1 means startup cost should not be considered.
tuple_fraction = 0 means cheapest startup cost should be used.
tuple_franction > 0 means cheapest fraction cost should be used.

I still don't pay enough attention to consider_param_startup in
RelOptInfo,  I'm feeling the above strategy will not generate
too much overhead to the planner for now while it can provides
a better plan sometimes.

-- 
Best Regards
Andy Fan


Re: make add_paths_to_append_rel aware of startup cost

2023-09-15 Thread David Rowley
On Thu, 7 Sept 2023 at 04:37, Andy Fan  wrote:
> Currently add_paths_to_append_rel overlooked the startup cost for creating
> append path, so it may have lost some optimization chances.  After a glance,
> the following 4 identifiers can be impacted.

> - We shouldn't do the optimization if there are still more tables to join,
>   the reason is similar to has_multiple_baserels(root) in
>   set_subquery_pathlist. But the trouble here is we may inherit multiple
>   levels to build an appendrel, so I have to keep the 'top_relids' all the
>   time and compare it with PlannerInfo.all_baserels. If they are the same,
>   then it is the case we want to optimize.

I think you've likely gone to the trouble of trying to determine if
there are any joins pending because you're considering using a cheap
startup path *instead* of the cheapest total path and you don't want
to do that when some join will cause all the rows to be read thus
making the plan more expensive if a cheap startup path was picked.

Instead of doing that, why don't you just create a completely new
AppendPath containing all the cheapest_startup_paths and add that to
the append rel. You can optimise this and only do it when
rel->consider_startup is true.

Does the attached do anything less than what your patch does?

David


consider_cheapest_startup_appendpath.patch
Description: Binary data


Re: make add_paths_to_append_rel aware of startup cost

2023-09-13 Thread Andy Fan
> - We shouldn't do the optimization if there are still more tables to join,
>   the reason is similar to has_multiple_baserels(root) in
>   set_subquery_pathlist.
>

After some internal discussion, we have 2 different choices here. Let's
call the current choice as method-a, and the new choice as method-b.
Method-b will just ignore the "no more tables to join "limitation
and build the append path with both cheapest startup cost and cheapest
total cost, this is pretty like the method we joins a plain relation with
another relation. The uneasy part is it is the cheapest start up cost
rather than the cheapest fractional cost.

method-a is pretty same as what set_subquery_pathlist is doing, which has
a limitation on "no more tables to join" and has no the "cheapest startup
cost" part.

Ideally we can apply both strategies if we don't consider the effort. If
there are no more tables to join, we use method-a. otherwise use
method-b. With this thinking, we can even apply the same strategy to plain
relations as well.

However, I am not sure if the "cheapest startup cost" is a real problem.
If it is not, we can apply method-b directly and modify
set_subquery_pathlist to do the same for consistency.


-- 
Best Regards
Andy Fan


make add_paths_to_append_rel aware of startup cost

2023-09-06 Thread Andy Fan
Hi:

This thread is a refactor of thread [1] for easier communication.

Currently add_paths_to_append_rel overlooked the startup cost for creating
append path, so it may have lost some optimization chances.  After a
glance,
the following 4 identifiers can be impacted.

Identifier 1:

SELECT .. FROM v1
UNION ALL
SELECT .. FROM v2
LIMIT 3;

Identifier 2:

SELECT * FROM p .. LIMIT 3;  p is a partitioned table.

Identifier 3:
SELECT * FROM p JOIN q using (partkey) LIMIT 3;

If we did the partition-wise-join, then we lost the chances for a better
plan.

Identifier 4:  -- EXISTS implies LIMIT 1;
SELECT * FROM foo
WHERE EXISTS
(SELECT 1 FROM append_rel_v_not_pullup_able WHERE xxx);

However, after I completed my patch and wanted to build some real
queries to prove my idea,  I just find it is hard to build the case for
Identifier 2/3/4.  But the Improvement for Identifier 1 is easy and
my real user case in work is Identifier 1 as well.

So a patch is attached for this case, it will use fractional costs
rather than total costs if needed. The following items needs more
attention during development.

- We shouldn't do the optimization if there are still more tables to join,
  the reason is similar to has_multiple_baserels(root) in
  set_subquery_pathlist. But the trouble here is we may inherit multiple
  levels to build an appendrel, so I have to keep the 'top_relids' all the
  time and compare it with PlannerInfo.all_baserels. If they are the same,
  then it is the case we want to optimize.

- add_partial_path doesn't consider the startup costs by design, I didn't
  rethink too much about this, but the idea of "choose a path which
  let each worker produces the top-N tuples as fast as possible" looks
  reasonable, and even though add_partial_path doesn't consider startup
  cost, it is still possible that childrel keeps more than 1 partial paths
  due to any reasons except startup_cost, for example PathKey. then we
  still have chances to choose the cheapest fractional path among
  them. The same strategy also applies to mixed partial and non-partial
  append paths.

- Due to the complexity of add_paths_to_append_rel, 3 arguments have
 to be added to get_cheapest_fractional_path...

  Path *
get_cheapest_fractional_path(RelOptInfo *rel, double tuple_fraction,
bool allow_parameterized, bool look_partial,
bool must_parallel_safe)


Cases can be improved.

Here is the simplest test case,  but it will not be hard to provide more
cases for Identifier 1.

 (select * from tenk1 order by hundred)
 UNION ALL
 (select * from tenk1 order by hundred)
 limit 3;

master: 8.096ms.
patched: 0.204ms.

The below user case should be more reasonable for real business.

with a as (select * from t1 join t2..),
b as (select * from t1 join t3 ..)
select * from a union all select * from b
limit 3;

The patch would also have impacts on identifier 2/3/4, even though I can't
make a demo sql which can get benefits from this patch,  I also  added
some test cases for code coverage purposes.

Any feedback is welcome!

[1]
https://www.postgresql.org/message-id/flat/CAKU4AWqEnzhUTxopVhENC3vs6NnYV32+e6GSBtp1rAv0ZNX=m...@mail.gmail.com


-- 
Best Regards
Andy Fan


v1-0001-make-add_paths_to_append_rel-aware-of-startup-cos.patch
Description: Binary data