On Fri, Dec 5, 2025 at 11:48 PM Robert Haas <[email protected]> wrote: > > On Wed, Dec 3, 2025 at 9:53 PM Richard Guo <[email protected]> wrote: > > But I admit that I am unsure if this addresses > > any real problems, so the effort might not be justified. I agree that > > maybe we should just go ahead with your current patch and see what > > happens. > > Thanks, I have committed that patch. So far the buildfarm seems OK...
Thanks a lot Robert and Richard. Sorry for responding so late. I had lost the context required to respond to the thread. I have fully caught up now and reviewed the commit. I repeated the experiments mentioned in [1] by Robert. I see the same results as Robert. I ran the queries multiple times with assert enabled dev binary as well as release binary with assertions disabled. The results are consistent across the runs and in both the cases the query was slower after enabling partitionwise join. This is a good reproduction, required as per [2]. I looked again at the queries I tried to reproduce this regression. Back then, I had tried many variations of the query mentioned in [4] using many partitions as well as large data sets. What stands out in Robert's query is it has 10000 * 10000 join, has an index on the column in join condition and that column is the sole column in the table; all factors contributing to the reproduction. I didn't go to that extreme. Thanks Robert for coming up with elusive reproduction. Generally the reworded comment is better than mine. However, I feel the connection between "rel only has Append and MergeAppend paths" and the condition IS_SIMPLE_REL(rel) is obscure; something that my wording made explicit. However, I don't think we need to change that now. > Another test case failure that would have happened had Ashutosh not > compensated for it in the patch is with this query: > > EXPLAIN (COSTS OFF) > SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER > BY x.id ASC LIMIT 10; > > Now, currently, this chooses a partitionwise-join. The Limit node at > the top of the plan tree has a cost of 2.22 and the underlying Merge > Append node has a cost of 223.11. But if you apply the fix and not > Ashutosh's adjustments to the test case, then you get a > non-partitionwise join, where the Limit at the top of the plan tree > has a cost of 2.21 and the underlying Merge Append node has a cost of > 223.10. Since we're just trying to decide whether to Append some Merge > Joins or Merge Join some Appends, it's not surprising that the costs > are very close. However, I also find it slightly discouraging in terms > of accepting Ashutosh's proposed fix. In the Q1 case, above, we > apparently reduce the cost specifically by not flushing the path list. > But here, we just end up picking a nearly equivalent path with a > nearly-equivalent cost. At least, that means the test case isn't > likely to be stable, and we could just patch around that, as Ashutosh > did, by suppressing partitionwise join (it is not clear whether this > compromises the goals of the test case, but it's not obvious that it > does). But it might also be taken as a worrying indication that plans > of this form are going to come out as either partitionwise or not > based on essentially random factors, which could be viewed as a flaw > in the approach. I'm not really sure which way to view it, and if is a > flaw in the approach, then I'm not sure what to do instead. This was discussed earlier in the thread around [5]. According to Arne those queries were added to make sure that the code added by 6b94e7a6da2f1c6df1a42efe64251f32a444d174 was getting exercised even in partitionwise join planning. Even with my adjustment to the test, it was getting exercised, so I thought it was ok to adjust the expected output instead of trying to make the query use partitionwise join. However, your fix is better since the query still uses partitionwise join and also exercises the required code. Thanks again. I verified that the code gets exercised under a debugger. My patch removed a redundant SET enable_partitionwise_join = on. That change is not included in your commit, I believe, because it's irrelevant to the fix. However, it's better to avoid the redundancy to avoid confusion. PFA patch for the same. [1] https://www.postgresql.org/message-id/CA%2BTgmoY9egbm1qehv%3DaSp%2BcwckOdbbdRePaXpBRSdL6PWDjvuQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/303774.1735851504%40sss.pgh.pa.us [3] https://www.postgresql.org/message-id/786.1565541557%40sss.pgh.pa.us [4] https://www.postgresql.org/message-id/CAKZiRmyaFFvxyEYGG_hu0F-EVEcqcnveH23MULhW6UY_jwykGw%40mail.gmail.com [5] https://www.postgresql.org/message-id/CAExHW5uZh5fJ2siyYpwVWUq%2Br5EW_gatE1WRZ9H56x%2B115CUCg%40mail.gmail.com -- Best Wishes, Ashutosh Bapat
From 6e17012cfd7995e44f5a699d82dcc784c2337fd0 Mon Sep 17 00:00:00 2001 From: Ashutosh Bapat <[email protected]> Date: Fri, 12 Dec 2025 17:43:01 +0530 Subject: [PATCH v20251212] Remove redundant SET enable_partitionwise_join = on statement partition_join.sql test enables partitionwise join using SET enable_partitionwise_join = on as the first statement. It is reset only at the end of the test. Hence another SET enable_partitionwise_join = on in between is not required. Author: Ashutosh Bapat <[email protected]> --- src/test/regress/expected/partition_join.out | 1 - src/test/regress/sql/partition_join.sql | 1 - 2 files changed, 2 deletions(-) diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out index 17d27ef3d46..290fe52902d 100644 --- a/src/test/regress/expected/partition_join.out +++ b/src/test/regress/expected/partition_join.out @@ -5198,7 +5198,6 @@ ANALYZE fract_t; -- (avoid merge joins, because the costs of partitionwise and non-partitionwise -- merge joins tend to be almost equal, and we want this test to be stable) SET max_parallel_workers_per_gather = 0; -SET enable_partitionwise_join = on; SET enable_mergejoin = off; EXPLAIN (COSTS OFF) SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10; diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql index d153297acba..19a6d19a99c 100644 --- a/src/test/regress/sql/partition_join.sql +++ b/src/test/regress/sql/partition_join.sql @@ -1225,7 +1225,6 @@ ANALYZE fract_t; -- (avoid merge joins, because the costs of partitionwise and non-partitionwise -- merge joins tend to be almost equal, and we want this test to be stable) SET max_parallel_workers_per_gather = 0; -SET enable_partitionwise_join = on; SET enable_mergejoin = off; EXPLAIN (COSTS OFF) base-commit: 493eb0da31be4520252e1af723342dc7ead0c3e5 -- 2.34.1
