Re: [HACKERS] PoC: full merge join on comparison clause
Hi, I am attaching the updated patch, rebased to 820c03. On 09.10.2017 13:47, Ashutosh Bapat wrote: Hi Alexander, Commit c7a9fa399 has added another test on mergeopfamilies. I think your patch will need to take care of that test. On Wed, Oct 4, 2017 at 6:38 PM, Alexander Kuzmenkovwrote: As discussed earlier, I changed the way we work with mergeopfamilies. I use the "is_equality" flag to indicate whether the clause is an equality one, and fill mergeopfamilies for both equality and inequality operators. The updated patch is attached (rebased to 20b6552242). -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c index 925b4cf553..73e6a4ca74 100644 --- a/src/backend/executor/nodeMergejoin.c +++ b/src/backend/executor/nodeMergejoin.c @@ -172,31 +172,32 @@ typedef enum * to the opfamily and collation, with nulls at the indicated end of the range. * This allows us to obtain the needed comparison function from the opfamily. */ -static MergeJoinClause +static void MJExamineQuals(List *mergeclauses, Oid *mergefamilies, Oid *mergecollations, int *mergestrategies, bool *mergenullsfirst, - PlanState *parent) + MergeJoinState *parent) { - MergeJoinClause clauses; int nClauses = list_length(mergeclauses); int iClause; ListCell *cl; - clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_UseEqual = (bool *) palloc0(nClauses * sizeof(bool)); + parent->mj_UseLesser = (bool *) palloc0(nClauses * sizeof(bool)); iClause = 0; foreach(cl, mergeclauses) { OpExpr *qual = (OpExpr *) lfirst(cl); - MergeJoinClause clause = [iClause]; + MergeJoinClause clause = >mj_Clauses[iClause]; Oid opfamily = mergefamilies[iClause]; Oid collation = mergecollations[iClause]; - StrategyNumber opstrategy = mergestrategies[iClause]; + StrategyNumber sort_op_strategy = mergestrategies[iClause]; bool nulls_first = mergenullsfirst[iClause]; - int op_strategy; + int join_op_strategy; Oid op_lefttype; Oid op_righttype; Oid sortfunc; @@ -207,28 +208,55 @@ MJExamineQuals(List *mergeclauses, /* * Prepare the input expressions for execution. */ - clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), parent); - clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), parent); + clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), (PlanState *) parent); + clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), (PlanState *) parent); /* Set up sort support data */ clause->ssup.ssup_cxt = CurrentMemoryContext; clause->ssup.ssup_collation = collation; - if (opstrategy == BTLessStrategyNumber) + if (sort_op_strategy == BTLessStrategyNumber) clause->ssup.ssup_reverse = false; - else if (opstrategy == BTGreaterStrategyNumber) + else if (sort_op_strategy == BTGreaterStrategyNumber) clause->ssup.ssup_reverse = true; else /* planner screwed up */ - elog(ERROR, "unsupported mergejoin strategy %d", opstrategy); + elog(ERROR, "unsupported mergejoin strategy %d", sort_op_strategy); clause->ssup.ssup_nulls_first = nulls_first; /* Extract the operator's declared left/right datatypes */ get_op_opfamily_properties(qual->opno, opfamily, false, - _strategy, + _op_strategy, _lefttype, _righttype); - if (op_strategy != BTEqualStrategyNumber) /* should not happen */ - elog(ERROR, "cannot merge using non-equality operator %u", - qual->opno); + + /* + * Determine whether we accept lesser and/or equal tuples of the inner + * relation. + */ + switch (join_op_strategy) + { + case BTEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +break; + + case BTLessEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +/* fall through */ + + case BTLessStrategyNumber: +parent->mj_UseLesser[iClause] = true; +break; + + case BTGreaterEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +/* fall through */ + + case BTGreaterStrategyNumber: +parent->mj_UseLesser[iClause] = true; +break; + + default: +elog(ERROR, "unsupported join strategy %d", join_op_strategy); + } /* * sortsupport routine must know if abbreviation optimization is @@ -265,8 +293,6 @@ MJExamineQuals(List *mergeclauses, iClause++; } - - return clauses; } /* @@ -378,6 +404,14 @@ MJEvalInnerValues(MergeJoinState *mergestate, TupleTableSlot *innerslot) return result; } +/* Tuple comparison result */ +typedef enum +{ + MJCR_NextInner = 1, + MJCR_NextOuter = -1, + MJCR_Join = 0 +}
Re: [HACKERS] PoC: full merge join on comparison clause
Hi Alexander, Commit c7a9fa399 has added another test on mergeopfamilies. I think your patch will need to take care of that test. On Wed, Oct 4, 2017 at 6:38 PM, Alexander Kuzmenkovwrote: > As discussed earlier, I changed the way we work with mergeopfamilies. I use > the "is_equality" flag to indicate whether the clause is an equality one, > and fill mergeopfamilies for both equality and inequality operators. > The updated patch is attached (rebased to 20b6552242). > > > -- > Alexander Kuzmenkov > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: full merge join on comparison clause
As discussed earlier, I changed the way we work with mergeopfamilies. I use the "is_equality" flag to indicate whether the clause is an equality one, and fill mergeopfamilies for both equality and inequality operators. The updated patch is attached (rebased to 20b6552242). -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c index 925b4cf553..73e6a4ca74 100644 --- a/src/backend/executor/nodeMergejoin.c +++ b/src/backend/executor/nodeMergejoin.c @@ -172,31 +172,32 @@ typedef enum * to the opfamily and collation, with nulls at the indicated end of the range. * This allows us to obtain the needed comparison function from the opfamily. */ -static MergeJoinClause +static void MJExamineQuals(List *mergeclauses, Oid *mergefamilies, Oid *mergecollations, int *mergestrategies, bool *mergenullsfirst, - PlanState *parent) + MergeJoinState *parent) { - MergeJoinClause clauses; int nClauses = list_length(mergeclauses); int iClause; ListCell *cl; - clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_UseEqual = (bool *) palloc0(nClauses * sizeof(bool)); + parent->mj_UseLesser = (bool *) palloc0(nClauses * sizeof(bool)); iClause = 0; foreach(cl, mergeclauses) { OpExpr *qual = (OpExpr *) lfirst(cl); - MergeJoinClause clause = [iClause]; + MergeJoinClause clause = >mj_Clauses[iClause]; Oid opfamily = mergefamilies[iClause]; Oid collation = mergecollations[iClause]; - StrategyNumber opstrategy = mergestrategies[iClause]; + StrategyNumber sort_op_strategy = mergestrategies[iClause]; bool nulls_first = mergenullsfirst[iClause]; - int op_strategy; + int join_op_strategy; Oid op_lefttype; Oid op_righttype; Oid sortfunc; @@ -207,28 +208,55 @@ MJExamineQuals(List *mergeclauses, /* * Prepare the input expressions for execution. */ - clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), parent); - clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), parent); + clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), (PlanState *) parent); + clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), (PlanState *) parent); /* Set up sort support data */ clause->ssup.ssup_cxt = CurrentMemoryContext; clause->ssup.ssup_collation = collation; - if (opstrategy == BTLessStrategyNumber) + if (sort_op_strategy == BTLessStrategyNumber) clause->ssup.ssup_reverse = false; - else if (opstrategy == BTGreaterStrategyNumber) + else if (sort_op_strategy == BTGreaterStrategyNumber) clause->ssup.ssup_reverse = true; else /* planner screwed up */ - elog(ERROR, "unsupported mergejoin strategy %d", opstrategy); + elog(ERROR, "unsupported mergejoin strategy %d", sort_op_strategy); clause->ssup.ssup_nulls_first = nulls_first; /* Extract the operator's declared left/right datatypes */ get_op_opfamily_properties(qual->opno, opfamily, false, - _strategy, + _op_strategy, _lefttype, _righttype); - if (op_strategy != BTEqualStrategyNumber) /* should not happen */ - elog(ERROR, "cannot merge using non-equality operator %u", - qual->opno); + + /* + * Determine whether we accept lesser and/or equal tuples of the inner + * relation. + */ + switch (join_op_strategy) + { + case BTEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +break; + + case BTLessEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +/* fall through */ + + case BTLessStrategyNumber: +parent->mj_UseLesser[iClause] = true; +break; + + case BTGreaterEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +/* fall through */ + + case BTGreaterStrategyNumber: +parent->mj_UseLesser[iClause] = true; +break; + + default: +elog(ERROR, "unsupported join strategy %d", join_op_strategy); + } /* * sortsupport routine must know if abbreviation optimization is @@ -265,8 +293,6 @@ MJExamineQuals(List *mergeclauses, iClause++; } - - return clauses; } /* @@ -378,6 +404,14 @@ MJEvalInnerValues(MergeJoinState *mergestate, TupleTableSlot *innerslot) return result; } +/* Tuple comparison result */ +typedef enum +{ + MJCR_NextInner = 1, + MJCR_NextOuter = -1, + MJCR_Join = 0 +} MJCompareResult; + /* * MJCompare * @@ -388,10 +422,10 @@ MJEvalInnerValues(MergeJoinState *mergestate, TupleTableSlot *innerslot) * MJEvalOuterValues and MJEvalInnerValues must already have been called * for the current outer and inner tuples, respectively. */ -static int +static MJCompareResult MJCompare(MergeJoinState *mergestate) { - int result = 0; + MJCompareResult result = MJCR_Join; bool nulleqnull = false;
Re: [HACKERS] PoC: full merge join on comparison clause
On Thu, Sep 28, 2017 at 8:57 PM, Alexander Kuzmenkovwrote: > Hi Ashutosh, > > Thanks for the review. > > Jeff, I'm copying you because this is relevant to our discussion about what > to do with mergeopfamilies when adding new merge join types. > > You have renamed RestrictInfo member mergeopfamilies as > equivopfamilies. I don't think that's a good name; it doesn't convey > that these are opfamilies containing merge operators. The changes in > check_mergejoinable() suggest that an operator may act as equality > operator in few operator families and comparison operator in others. > That looks odd. Actually an operator family contains operators other > than equality operators, so you may want to retain this member and add > a new member to specify whether the clause is an equality clause or > not. > > > For mergeopfamilies, I'm not sure what is the best thing to do. I'll try to > explain my understanding of the situation, please correct me if I'm wrong. > > Before the patch, mergeopfamilies was used for two things: creating > equivalence classes and performing merge joins. > > For equivalence classes: we look at the restriction clauses, and if they > have mergeopfamilies set, it means that these clause are based on an > equality operator, and the left and right variables must be equal. To record > this fact, we create an equivalence class. The variables might be equal for > one equality operator and not equal for another, so we record the particular > operator families to which our equality operator belongs. > > For merge join: we look at the join clauses, and if they have > mergeopfamilies set, it means that these clauses are based on an equality > operator, and we can try performing this particular join as merge join. > These opfamilies are also used beforehand to create the equivalence classes > for left and right variables. The equivalence classes are used to match the > join clauses to pathkeys describing the ordering of join inputs. > > So, if we want to start doing merge joins for operators other than equality, > we still need to record their opfamilies, but only use them for the second > case and not the first. I chose to put these opfamilies to different > variables, and > name the one used for equivalence classes 'equivopfamilies' and the one used > for merge joins 'mergeopfamilies'. The equality operators are used for both > cases, so we put their opfamilies into both of these variables. > > I agree this might look confusing. Indeed, we could keep a single variable > for opfamilies, and add separate flags that show how they can be used, be > that for equivalence classes, merge joins, range joins or some combination > of them. This is similar to what Jeff did in his range merge join patch [1]. > I will think more about this and try to produce an updated patch. > I think we have (ab?)used mergeopfamilies to indicate equality condition, which needs some changes. May be these two patches are where we can do those changes. > > In mergejoinscansel() you have just removed Assert(op_strategy == > BTEqualStrategyNumber); Probably this function is written considering > on equality operators. But now that we are using this for all other > operators, we will need more changes to this function. That may be the > reason why INNER join in your earlier example doesn't choose right > costing. > > > I changed mergejoinscansel() slightly to reflect the fact that the inner > relation is scanned from the beginning if we have an inequality merge > clause. > > > The comment change in final_cost_mergejoin() needs more work. n1, n2, > n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 + > n2 + n3 + ... = size of inner relation is correct. In that context I > am not able to understand your change > +* If the merge clauses contain inequality, (n1 + n2 + ...) ~= > +* (size of inner relation)^2. > > > I extended the comment in final_cost_mergejoin(). Not sure if that > approximation makes any sense, but this is the best I could think of. > > Style problems are fixed. > > Attached please find the new version of the patch that addresses all the > review comments except mergeopfamilies. > > The current commitfest is ending, but I'd like to continue working on this > patch, so I am moving it to the next one. > > Thanks for working on the comments. I am interested to continue reviewing it in the next commitfest. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: full merge join on comparison clause
Hi Ashutosh, Thanks for the review. *Jeff*, I'm copying you because this is relevant to our discussion about what to do with mergeopfamilies when adding new merge join types. You have renamed RestrictInfo member mergeopfamilies as equivopfamilies. I don't think that's a good name; it doesn't convey that these are opfamilies containing merge operators. The changes in check_mergejoinable() suggest that an operator may act as equality operator in few operator families and comparison operator in others. That looks odd. Actually an operator family contains operators other than equality operators, so you may want to retain this member and add a new member to specify whether the clause is an equality clause or not. For mergeopfamilies, I'm not sure what is the best thing to do. I'll try to explain my understanding of the situation, please correct me if I'm wrong. Before the patch, mergeopfamilies was used for two things: creating equivalence classes and performing merge joins. For equivalence classes: we look at the restriction clauses, and if they have mergeopfamilies set, it means that these clause are based on an equality operator, and the left and right variables must be equal. To record this fact, we create an equivalence class. The variables might be equal for one equality operator and not equal for another, so we record the particular operator families to which our equality operator belongs. For merge join: we look at the join clauses, and if they have mergeopfamilies set, it means that these clauses are based on an equality operator, and we can try performing this particular join as merge join. These opfamilies are also used beforehand to create the equivalence classes for left and right variables. The equivalence classes are used to match the join clauses to pathkeys describing the ordering of join inputs. So, if we want to start doing merge joins for operators other than equality, we still need to record their opfamilies, but only use them for the second case and not the first. I chose to put these opfamilies to different variables, and name the one used for equivalence classes 'equivopfamilies' and the one used for merge joins 'mergeopfamilies'. The equality operators are used for both cases, so we put their opfamilies into both of these variables. I agree this might look confusing. Indeed, we could keep a single variable for opfamilies, and add separate flags that show how they can be used, be that for equivalence classes, merge joins, range joins or some combination of them. This is similar to what Jeff did in his range merge join patch [1]. I will think more about this and try to produce an updated patch. In mergejoinscansel() you have just removed Assert(op_strategy == BTEqualStrategyNumber); Probably this function is written considering on equality operators. But now that we are using this for all other operators, we will need more changes to this function. That may be the reason why INNER join in your earlier example doesn't choose right costing. I changed mergejoinscansel() slightly to reflect the fact that the inner relation is scanned from the beginning if we have an inequality merge clause. The comment change in final_cost_mergejoin() needs more work. n1, n2, n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 + n2 + n3 + ... = size of inner relation is correct. In that context I am not able to understand your change +* If the merge clauses contain inequality, (n1 + n2 + ...) ~= +* (size of inner relation)^2. I extended the comment in final_cost_mergejoin(). Not sure if that approximation makes any sense, but this is the best I could think of. Style problems are fixed. Attached please find the new version of the patch that addresses all the review comments except mergeopfamilies. The current commitfest is ending, but I'd like to continue working on this patch, so I am moving it to the next one. [1] https://www.postgresql.org/message-id/flat/CAMp0ubfwAFFW3O_NgKqpRPmm56M4weTEXjprb2gP_NrDaEC4Eg%40mail.gmail.com#camp0ubfwaffw3o_ngkqprpmm56m4wetexjprb2gp_nrdaec...@mail.gmail.com -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 32dc4e6301..2958a9e53d 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -722,19 +722,19 @@ get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel) { RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(lc); - /* Consider only mergejoinable clauses */ - if (restrictinfo->mergeopfamilies == NIL) + /* Consider only mergejoinable equality clauses */ + if (restrictinfo->equivopfamilies == NIL) continue; /* Make sure we've got canonical ECs. */ - update_mergeclause_eclasses(root, restrictinfo); + update_equivclause_eclasses(root, restrictinfo); /* - *
Re: [HACKERS] PoC: full merge join on comparison clause
> On 19 Sep 2017, at 15:18, Ashutosh Bapat> wrote: > > On Fri, Aug 25, 2017 at 10:11 PM, Alexander Kuzmenkov > wrote: >> Here is a new version of the patch, rebased to 749c7c41 and with some >> cosmetic changes. >> > > I looked at this patch briefly. This is a useful feature. This isn't a > design level review of the patch. I may get back to that later. But > here are some assorted comments > .. Looking forward to further review on this patch, but based on this feedback I’m moving this to Waiting for author. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: full merge join on comparison clause
Hi Alexander, On Fri, Aug 25, 2017 at 10:11 PM, Alexander Kuzmenkovwrote: > Here is a new version of the patch, rebased to 749c7c41 and with some > cosmetic changes. > I looked at this patch briefly. This is a useful feature. This isn't a design level review of the patch. I may get back to that later. But here are some assorted comments The patch applies cleanly, but there are some whitespace errors. src/backend/executor/nodeMergejoin.c:231: trailing whitespace. + /* src/backend/executor/nodeMergejoin.c:232: trailing whitespace. +* Determine whether we accept lesser and/or equal tuples src/backend/optimizer/path/joinpath.c:499: trailing whitespace. + * a merge join. A merge join executor can only choose inner values that are src/backend/optimizer/path/joinpath.c:501: trailing whitespace. + * have at most one non-equality clause. The implementation may change, so fixing the white space errors may not be priority now. The patch compiles cleanly. You have renamed RestrictInfo member mergeopfamilies as equivopfamilies. I don't think that's a good name; it doesn't convey that these are opfamilies containing merge operators. The changes in check_mergejoinable() suggest that an operator may act as equality operator in few operator families and comparison operator in others. That looks odd. Actually an operator family contains operators other than equality operators, so you may want to retain this member and add a new member to specify whether the clause is an equality clause or not. In mergejoinscansel() you have just removed Assert(op_strategy == BTEqualStrategyNumber); Probably this function is written considering on equality operators. But now that we are using this for all other operators, we will need more changes to this function. That may be the reason why INNER join in your earlier example doesn't choose right costing. The comment change in final_cost_mergejoin() needs more work. n1, n2, n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 + n2 + n3 + ... = size of inner relation is correct. In that context I am not able to understand your change +* If the merge clauses contain inequality, (n1 + n2 + ...) ~= +* (size of inner relation)^2. Some stylistic comments + switch (join_op_strategy) + { + case BTEqualStrategyNumber: + parent->mj_UseEqual[iClause] = true; + break; + case BTLessEqualStrategyNumber: + parent->mj_UseEqual[iClause] = true; + /* fall through */ + case BTLessStrategyNumber: + parent->mj_UseLesser[iClause] = true; + break; + case BTGreaterEqualStrategyNumber: + parent->mj_UseEqual[iClause] = true; + /* fall through */ + case BTGreaterStrategyNumber: + parent->mj_UseLesser[iClause] = true; + break; + default: + Assert(false); Add blank lines between different cases and you may want to replace Assert in default case into an elog(). See for example exprType() or get_jointype_name(). + if (sort_result < 0) + { + result = MJCR_NextOuter; + } We usually do not add {} around a single statement block. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: full merge join on comparison clause
Here is a new version of the patch, rebased to 749c7c41 and with some cosmetic changes. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index d77c2a70e4..19bc90aa32 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -722,19 +722,19 @@ get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel) { RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(lc); - /* Consider only mergejoinable clauses */ - if (restrictinfo->mergeopfamilies == NIL) + /* Consider only mergejoinable equality clauses */ + if (restrictinfo->equivopfamilies == NIL) continue; /* Make sure we've got canonical ECs. */ - update_mergeclause_eclasses(root, restrictinfo); + update_equivclause_eclasses(root, restrictinfo); /* - * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee + * restrictinfo->equivopfamilies != NIL is sufficient to guarantee * that left_ec and right_ec will be initialized, per comments in * distribute_qual_to_rels. * - * We want to identify which side of this merge-joinable clause + * We want to identify which side of this merge-joinable equality clause * contains columns from the relation produced by this RelOptInfo. We * test for overlap, not containment, because there could be extra * relations on either side. For example, suppose we've got something diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c index 925b4cf553..8eb5c8fd1d 100644 --- a/src/backend/executor/nodeMergejoin.c +++ b/src/backend/executor/nodeMergejoin.c @@ -172,31 +172,32 @@ typedef enum * to the opfamily and collation, with nulls at the indicated end of the range. * This allows us to obtain the needed comparison function from the opfamily. */ -static MergeJoinClause +static void MJExamineQuals(List *mergeclauses, Oid *mergefamilies, Oid *mergecollations, int *mergestrategies, bool *mergenullsfirst, - PlanState *parent) + MergeJoinState *parent) { - MergeJoinClause clauses; int nClauses = list_length(mergeclauses); int iClause; ListCell *cl; - clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData)); + parent->mj_UseEqual = (bool *) palloc0(nClauses * sizeof(bool)); + parent->mj_UseLesser = (bool *) palloc0(nClauses * sizeof(bool)); iClause = 0; foreach(cl, mergeclauses) { OpExpr *qual = (OpExpr *) lfirst(cl); - MergeJoinClause clause = [iClause]; + MergeJoinClause clause = >mj_Clauses[iClause]; Oid opfamily = mergefamilies[iClause]; Oid collation = mergecollations[iClause]; - StrategyNumber opstrategy = mergestrategies[iClause]; + StrategyNumber sort_op_strategy = mergestrategies[iClause]; bool nulls_first = mergenullsfirst[iClause]; - int op_strategy; + int join_op_strategy; Oid op_lefttype; Oid op_righttype; Oid sortfunc; @@ -207,28 +208,50 @@ MJExamineQuals(List *mergeclauses, /* * Prepare the input expressions for execution. */ - clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), parent); - clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), parent); + clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), (PlanState *) parent); + clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), (PlanState *) parent); /* Set up sort support data */ clause->ssup.ssup_cxt = CurrentMemoryContext; clause->ssup.ssup_collation = collation; - if (opstrategy == BTLessStrategyNumber) + if (sort_op_strategy == BTLessStrategyNumber) clause->ssup.ssup_reverse = false; - else if (opstrategy == BTGreaterStrategyNumber) + else if (sort_op_strategy == BTGreaterStrategyNumber) clause->ssup.ssup_reverse = true; else /* planner screwed up */ - elog(ERROR, "unsupported mergejoin strategy %d", opstrategy); + elog(ERROR, "unsupported mergejoin strategy %d", sort_op_strategy); clause->ssup.ssup_nulls_first = nulls_first; /* Extract the operator's declared left/right datatypes */ get_op_opfamily_properties(qual->opno, opfamily, false, - _strategy, + _op_strategy, _lefttype, _righttype); - if (op_strategy != BTEqualStrategyNumber) /* should not happen */ - elog(ERROR, "cannot merge using non-equality operator %u", - qual->opno); + + /* + * Determine whether we accept lesser and/or equal tuples + * of the inner relation. + */ + switch (join_op_strategy) + { + case BTEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +break; + case BTLessEqualStrategyNumber: +parent->mj_UseEqual[iClause] = true; +/* fall through */ + case BTLessStrategyNumber: +parent->mj_UseLesser[iClause] = true; +break;
Re: [HACKERS] PoC: full merge join on comparison clause
On 16.05.2017 18:57, Robert Haas wrote: Interesting. I suggest adding this to the next CommitFest. Thank you, added: https://commitfest.postgresql.org/14/1141/ -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: full merge join on comparison clause
On Fri, May 12, 2017 at 7:09 AM, Alexander Kuzmenkovwrote: > As you know, at this time Postgres cannot perform a full join on a > comparison clause. For example, if we have two tables with numeric columns > and run a query like 'select * from t1 full join t2 on t1.a > t2.a', we get > an error: "FULL JOIN is only supported with merge-joinable or hash-joinable > join conditions". Such queries are legitimate SQL and sometimes arise when > porting applications from different DBMS, so it would be good to support > them in Postgres. They can be rewritten as union of right and left joins, > but it requires manual work and runs somewhat slower (see the benchmark at > the end of the letter). This proof-of-concept patch explores the possibility > of performing such queries as merge joins. Interesting. I suggest adding this to the next CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers