Possible documentation inaccuracy in optimizer README

Zeyuan Hu Mon, 07 Apr 2025 19:32:56 -0700

Hello,

In https://github.com/postgres/postgres/tree/master/src/backend/optimizer,
there are two examples on the dynamic programming (DP) algorithm used in
the optimizer, which I think
have some inaccuracy:


SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
        tab2.col = tab3.col AND
        tab3.col = tab4.col

    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{2 3},{3 4}
    {1 2 3},{2 3 4}
    {1 2 3 4}
    (other possibilities will be excluded for lack of join clauses)

    SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
        tab1.col = tab3.col AND
        tab1.col = tab4.col

    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{1 3},{1 4}
    {1 2 3},{1 3 4},{1 2 4}
    {1 2 3 4}

Both examples seem to ignore the transitivity imposed by the join clauses,
which creates a gap between the documentation here and what the actual
optimizer does.
For the first example, in the actual optimizer, {1,3},{1,4},{2,4} are also
considered. For the second example, {2,3}, {2,4}, {3,4} are also
considered.
The reason is that, for example 1, tab1.col = tab2.col AND tab2.col =
tab3.col imply that tab1.col = tab3.col due to transitivity, which leads to
the consideration of {1,3}.
The rest of the missing entries follow the same reasoning.

best
regards,


Zeyuan

Possible documentation inaccuracy in optimizer README

Reply via email to