On Thu, 1 Aug 2024 at 04:23, Robert Haas <robertmh...@gmail.com> wrote: > OK, here's a new patch version.
I think we're going down the right path here. I've reviewed both patches, here's what I noted down during my review: 0. I've not seen any mention so far about postgres_fdw's use_remote_estimate. Maybe changing the costs is fixing an issue that existed before. I'm just not 100% sure on that. Consider: CREATE EXTENSION postgres_fdw; DO $d$ BEGIN EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (use_remote_estimate 'true', dbname '$$||current_database()||$$', port '$$||current_setting('port')||$$' )$$; END; $d$; CREATE USER MAPPING FOR CURRENT_USER SERVER loopback; create table t (a int); create foreign table ft (a int) server loopback OPTIONS (table_name 't'); alter system set enable_seqscan=0; select pg_Reload_conf(); set enable_seqscan=1; explain select * from ft; patched: Foreign Scan on ft (cost=100.00..671.00 rows=2550 width=4) master: Foreign Scan on ft (cost=10000000100.00..10000000671.00 rows=2550 width=4) I kinda think that might be fixing an issue that I don't recall being reported before. I think we shouldn't really care that much about what nodes are disabled on the remote server and not having disabled_cost applied to that gives us that. 1. The final sentence of the function header comment needs to be updated in estimate_path_cost_size(). 2. Does cost_tidscan() need to update the header comment to say tidquals must not be empty? 3. final_cost_nestloop() seems to initially use the disabled_nodes from initial_cost_nestloop() but then it goes off and calculates it again itself. One of these seems redundant. The "We could include disable_cost in the preliminary estimate" comment explains why it was originally left to final_cost_nestloop(), so maybe worth sticking to that? I don't quite know the full implications, but it does not seem worth risking a behaviour change here. 4. I wonder if it's worth doing a quick refactor of the code in initial_cost_mergejoin() to get rid of the duplicate code in the "if (outersortkeys)" and "if (innersortkeys)" branches. It seems ok to do outer_path = &sort_path. Likewise for inner_path. 5. final_cost_hashjoin() does the same thing as #3 6. createplan.c adds #include "nodes/print.h" but doesn't seem to add any code that might use anything in there. 7. create_lockrows_path() needs to propagate disabled_nodes. create table a (a int); set enable_seqscan=0; explain select * from a for update limit 1; Limit (cost=0.00..0.02 rows=1 width=10) -> LockRows (cost=0.00..61.00 rows=2550 width=10) -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=10) Disabled Nodes: 1 (4 rows) explain select * from a limit 1; Limit (cost=0.00..0.01 rows=1 width=4) Disabled Nodes: 1 -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4) Disabled Nodes: 1 (4 rows) 8. There's something weird with CTEs too. create table b(a int); set enable_sort=0; Patched: explain with cte as materialized (select * from b order by a) select * from cte order by a desc; Sort (cost=381.44..387.82 rows=2550 width=4) Disabled Nodes: 1 Sort Key: cte.a DESC CTE cte -> Sort (cost=179.78..186.16 rows=2550 width=4) Disabled Nodes: 1 Sort Key: b.a -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=4) -> CTE Scan on cte (cost=0.00..51.00 rows=2550 width=4) (9 rows) master: explain with cte as materialized (select * from a order by a) select * from cte order by a desc; Sort (cost=20000000381.44..20000000387.82 rows=2550 width=4) Sort Key: cte.a DESC CTE cte -> Sort (cost=10000000179.78..10000000186.16 rows=2550 width=4) Sort Key: a.a -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4) -> CTE Scan on cte (cost=0.00..51.00 rows=2550 width=4) (7 rows) I'd expect the final sort to have disabled_nodes == 2 since disabled_cost has been added twice in master. 9. create_set_projection_path() needs to propagate disabled_nodes too: explain select b from (select a,generate_series(1,2) as b from b) a limit 1; Limit (cost=0.00..0.03 rows=1 width=4) -> Subquery Scan on a (cost=0.00..131.12 rows=5100 width=4) -> ProjectSet (cost=0.00..80.12 rows=5100 width=8) -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=0) Disabled Nodes: 1 10. create_setop_path() needs to propagate disabled_nodes. explain select * from b except select * from b limit 1; Limit (cost=0.00..0.80 rows=1 width=8) -> HashSetOp Except (cost=0.00..160.25 rows=200 width=8) -> Append (cost=0.00..147.50 rows=5100 width=8) Disabled Nodes: 2 -> Subquery Scan on "*SELECT* 1" (cost=0.00..61.00 rows=2550 width=8) Disabled Nodes: 1 -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=4) Disabled Nodes: 1 -> Subquery Scan on "*SELECT* 2" (cost=0.00..61.00 rows=2550 width=8) Disabled Nodes: 1 -> Seq Scan on b b_1 (cost=0.00..35.50 rows=2550 width=4) Disabled Nodes: 1 (12 rows) 11. create_modifytable_path() needs to propagate disabled_nodes. explain with cte as (update b set a = a+1 returning *) select * from cte limit 1; Limit (cost=41.88..41.90 rows=1 width=4) CTE cte -> Update on b (cost=0.00..41.88 rows=2550 width=10) -> Seq Scan on b (cost=0.00..41.88 rows=2550 width=10) Disabled Nodes: 1 -> CTE Scan on cte (cost=0.00..51.00 rows=2550 width=4) (6 rows) 12. For the 0002 patch, I do agree that having this visible in EXPLAIN is a must. I'd much rather see: Disabled: true/false. And just display this when the disabled_nodes is greater than the sum of the subpaths. That might be much more complex to implement, but it's going to make it much easier to track down the disabled nodes in very large plans. David