Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

Etsuro Fujita Thu, 22 Oct 2015 01:12:09 -0700

On 2015/10/20 9:36, Kouhei Kaigai wrote:

Even if we fetch whole-row images of both sides, join pushdown still
pays off because we receive fewer rows than a local join over two
foreign scans would. (If the planner works well, we can expect any
join path that increases the number of rows to be dropped.)


One downside of my proposition is the growth in width of individual
rows. It is a trade-off. The above approach requires no changes to the
existing EPQ infrastructure, so its implementation design is clear.
On the other hand, your approach will reduce traffic over the network,
but it is still unclear how we integrate scanrelid==0 with the EPQ
infrastructure.

I agree with KaiGai-san that his proposition (or my proposition based on
secondary plans) is still a performance improvement over the current
implementation, which joins locally with early row locking, since it
wouldn't have to transfer useless data that didn't satisfy the join
conditions at all!

On the other hand, in the case of a custom scan that takes underlying
local scan nodes, any kind of ROW_MARK_* except ROW_MARK_COPY can
occur. I think the width of the joined tuples is a relatively minor
issue compared with the FDW case. However, we cannot expect the fetched
rows to be protected by the early row-locking mechanism, so re-fetching
rows and reconstructing the joined tuple is likelier and matters
relatively more.


I see.

There is also some possible loss of efficiency with this approach.
Suppose that we have two tables ft1 and ft2 which are being joined,
and we push down the join.  They are being joined on an integer
column, and the join needs to select several other columns as well.
However, ft1 and ft2 are very wide tables that also contain some text
columns.   The query is like this:

SELECT localtab.a, ft1.p, ft2.p FROM localtab LEFT JOIN (ft1 JOIN ft2
ON ft1.x = ft2.x AND ft1.huge ~ 'stuff' AND ft2.huge2 ~ 'nonsense') ON
localtab.q = ft1.q;

If we refetch each row individually, we will need a wholerow image of
ft1 and ft2 that includes all columns, or at least ft1.huge and
ft2.huge2.  If we just fetch a wholerow image of the join output, we
can exclude those.  The only thing we need to recheck is that it's
still the case that localtab.q = ft1.q (because the value of
localtab.q might have changed).
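
To put rough numbers on the width argument above, here is a
back-of-envelope sketch. The column widths are hypothetical (4 bytes
for x, q and p, 2000 bytes on average for the text columns) and are
not taken from any real schema; the helper names are made up for
illustration only.

```python
# Hypothetical average column widths in bytes (assumptions, not measurements).
WIDTHS = {
    "ft1": {"x": 4, "q": 4, "p": 4, "huge": 2000},
    "ft2": {"x": 4, "p": 4, "huge2": 2000},
}


def wholerow_bytes(table):
    """Bytes in a whole-row image of one foreign table."""
    return sum(WIDTHS[table].values())


def join_output_bytes(columns):
    """Bytes in a whole-row image of the join output.

    `columns` is a list of (table, column) pairs the join emits.
    """
    return sum(WIDTHS[table][column] for table, column in columns)


# Refetching each side individually needs both whole-row images.
per_table = wholerow_bytes("ft1") + wholerow_bytes("ft2")

# The join output only needs ft1.p, ft2.p, and ft1.q for the local recheck.
joined = join_output_bytes([("ft1", "p"), ("ft2", "p"), ("ft1", "q")])

print(per_table, joined)
```

Under these assumed widths the per-table images are dominated by the
two text columns, while the join-output image carries only the three
narrow columns that the local side still needs.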

As KaiGai-san mentioned above, what we need to discuss more about with
Robert's proposition is how to integrate it into the existing EPQ
machinery. For example, when, where, and how should we refetch the
whole-row image of the join output in the case of late row locking? I
think that would need a new FDW API different from RefetchForeignRow,
say RefetchForeignJoinRow.

I think that another benefit of the proposition from KaiGai-san (or me)
would be that it could provide the whole functionality for row locking
in remote joins without an additional development burden on an FDW
author; the author only has to write GetForeignRowMarkType and
RefetchForeignRow, which I think is relatively easy. With that
proposition, using rowmark types such as ROW_MARK_SHARE or
ROW_MARK_EXCLUSIVE for foreign tables in remote joins would be quite
inefficient, but using ROW_MARK_REFERENCE instead of ROW_MARK_COPY
would be an option for workloads where EPQ rechecks are rarely invoked,
because we just need to transfer ctids, not whole-row images.
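
As a rough illustration of the ROW_MARK_REFERENCE versus ROW_MARK_COPY
trade-off, the sketch below counts only the extra row-locking payload
sent over the network. It is a simplified model under stated
assumptions: the row count, row width, and recheck fraction are
hypothetical, the function names are made up, and the per-refetch
round-trip cost is ignored. The only fixed figure is the 6-byte size
of a ctid.

```python
CTID_BYTES = 6  # a PostgreSQL ctid is a 4-byte block number plus a 2-byte offset


def copy_cost(rows, row_bytes):
    """ROW_MARK_COPY: every fetched row carries a whole-row image."""
    return rows * row_bytes


def reference_cost(rows, row_bytes, recheck_fraction):
    """ROW_MARK_REFERENCE: every row carries a ctid, and a whole row is
    refetched only for the rows that actually hit an EPQ recheck."""
    rechecked = round(rows * recheck_fraction)
    return rows * CTID_BYTES + rechecked * row_bytes


# Hypothetical workload: 100,000 joined rows, 2000-byte whole-row images,
# and an EPQ recheck on 0.1% of the rows.
print(copy_cost(100_000, 2000))
print(reference_cost(100_000, 2000, 0.001))
```

In this model the ctid-based approach wins as long as rechecks are
rare, and loses its advantage as the recheck fraction approaches one,
where every row is both identified by ctid and refetched in full.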


Best regards,
Etsuro Fujita


