On 2015/09/09 3:53, Robert Haas wrote:
On Tue, Sep 8, 2015 at 5:35 AM, Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
On 2015/09/04 0:33, Robert Haas wrote:
I'm worried that trawling through that
SpecialJoinInfo data will end up needing to duplicate much of
make_join_rel and add_paths_to_joinrel.  For example, consider:

SELECT * FROM verysmall v JOIN (bigft1 FULL JOIN bigft2 ON bigft1.x =
bigft2.x) ON v.q = bigft1.q AND v.r = bigft2.r;

The best path for this plan is presumably something like this:

Nested Loop
-> Seq Scan on verysmall v
-> Foreign Scan on bigft1 and bigft2
      Remote SQL: SELECT * FROM bigft1 FULL JOIN bigft2 ON bigft1.x =
bigft2.x AND bigft1.q = $1 AND bigft2.r = $2

Now, how is the FDW going to figure out that it needs to generate this
parameterized path without duplicating this code from
add_paths_to_joinrel?

      /*
       * Decide whether it's sensible to generate parameterized paths for
this
       * joinrel, and if so, which relations such paths should require.
There
       * is usually no need to create a parameterized result path unless
there
...

Maybe there's a very simple answer to this question and I'm just not
seeing it, but I really don't see how that's going to work.

Why don't you look at the "regular" (local join execution) paths that were
already generated.  I think that if we called the FDW at a proper hook
location, the FDW could probably find a regular path in rel->pathlist of the
join rel (bigft1, bigft2) that possibly generates something like:

Nested Loop
   -> Seq Scan on verysmall v
   -> Nested Loop
        Join Filter: (bigft1.a = bigft2.a)
        -> Foreign Scan on bigft1
             Remote SQL: SELECT * FROM bigft1 WHERE bigft1.q = $1
        -> Foreign Scan on bigft2
             Remote SQL: SELECT * FROM bigft2 WHERE bigft2.r = $2

 From the parameterization of the regular nestloop path for joining bigft1
and bigft2 locally, I think that the FDW could find that it's sensible to
generate the foreign-join path for (bigft1, bigft2) with the
parameterization.

But that path might have already been discarded on the basis of cost.
I think Tom's idea is better: let the FDW consult some state cached
for this purpose in the RelOptInfo.

Do you have an idea of what information would be collected into the state and how the FDW would derive parameterizations to consider producing pushed-down joins with from that information? What I'm concerned about that is to reduce the number of parameterizations to consider, to reduce overhead in costing the corresponding queries. I'm missing something, though.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to