Re: [HACKERS] PoC: full merge join on comparison clause

2018-11-21 Thread Alexander Kuzmenkov

On 11/19/18 04:46, Tom Lane wrote:

In short, proceeding like the above when we can't find another plan
type for a full join seems like it fixes a far wider variety of cases.
The possibility that maybe we could do some of those cases a bit faster
isn't sufficiently attractive to me to justify also putting in a
mechanism like this patch proposes.  We only rarely see complaints at
all about can't-do-a-full-join problems, and I do not think this patch
would fix enough of those complaints to be worthwhile.



I agree, the automated UNION substitutions seems to be a better 
approach. I'll mark this patch as rejected then.



--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] PoC: full merge join on comparison clause

2018-11-18 Thread Tom Lane
Alexander Kuzmenkov  writes:
> [ Inequality-merge-join-v10.patch ]

Just thinking about this patch a bit ... I wonder why you were so quick to
reject the UNION approach at the outset.  This patch is pretty messy, and
it complicates a lot of stuff that is quite fundamental to the planner,
and you still end up that the only functionality gain is now we can handle
full joins whose conditions include a single btree inequality clause.
Nor are we doing that remarkably efficiently ... it's pretty much
impossible to do it efficiently, in fact, since if the inputs have M and N
rows respectively then the output will have something like (M*N)/2 rows.

So it seems to me that if we're going to put sweat into this area at all,
our ambition ought to be "we'll successfully perform a FULL JOIN with any
join clause whatsoever, though it might take O(M*N) time".

Now as far as I can tell, the UNION substitution you proposed is
completely valid, although it'd be better to phrase the second step
as an antijoin.  That is, I believe

select * from t1 full join t2 on (anything)

is exactly equal to

select t1.*, t2.* from t1 left join t2 on (anything)
union all
select t1.*, t2.* from t2 anti join t1 on (anything)

There is one fly in the ointment, which is that we will have to run the
join clause twice, so it can't contain volatile functions --- but the
merge join approach wouldn't handle that case either.

Having to read the inputs twice is not good, but we could put them
into CTEs, which fixes any problems with volatility below the join
and at least alleviates the performance problem.  Since we can't
currently do any meaningful qual pushdown through full joins, the
optimization-fence aspect of a CTE doesn't seem like an issue either.

In short, proceeding like the above when we can't find another plan
type for a full join seems like it fixes a far wider variety of cases.
The possibility that maybe we could do some of those cases a bit faster
isn't sufficiently attractive to me to justify also putting in a
mechanism like this patch proposes.  We only rarely see complaints at
all about can't-do-a-full-join problems, and I do not think this patch
would fix enough of those complaints to be worthwhile.

regards, tom lane



Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-30 Thread Alexander Kuzmenkov

El 18/07/18 a las 16:58, Ashutosh Bapat escribió:


Thanks for the commit messages. I would use word "in-equality" instead
of "comparison" since equality is also a comparison.


Fixed.


Comparing this with the original code, I think, is_mj_equality should be true
if restrictinfo->mergeopfamilies is not NIL.


My mistake, fixed.


With this work the meaning of oprcanmerge (See pg_operator catalog and also
CREATE OPERATOR syntax) changes. Every btree operator can now be used to
perform a merge join. oprcanmerge however only indicates whether an operator is
an equality or not. Have you thought about that? Do we require to re-define
oprcanmerge?


For now we can test with old oprcanmerge meaning, not to bump the 
catalog version. Merge join needs only BTORDER_PROC function, which is 
required for btree opfamilies. This means that it should be always 
possible to merge join on operators that correspond to standard btree 
strategies. We could set oprcanmerge to true for all built-in btree 
comparison operators, and leave the possibility to disable it for custom 
operators.



I think, it should be possible to use this technique with more than one
inequality clauses as long as all the operators require the input to be ordered
in the same direction and the clauses are ANDed. In that case the for a given
outer tuple the matching inner tuples form a contiguous interval.


Consider a table "t(a int, b int)", the value of each column can be 1, 
2, 3, 4 and the table contains all possible combinations. If merge 
condition is "a < 2 and b < 2", for each of the four possible sorting 
directions, the result set won't be contiguous. Generally speaking, this 
happens when we have several groups with the same value of first column, 
and the first column matches the join condition. But inside each group, 
for some rows the second column doesn't match.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From c817ac1f93b83bcf43afac4af2dbaed37403a4a2 Mon Sep 17 00:00:00 2001
From: Alexander Kuzmenkov 
Date: Tue, 10 Apr 2018 12:31:21 +0300
Subject: [PATCH 2/2] Inequality merge join.

Perform merge joins on inequality clause. The current merge join
algorithm requires minimal modification to support one inequality clause
at the final position. This has performance benefits in some cases, and also
allows to perform full joins on inequality, which was not possible
before.
This commit modifies the merge join path generation logic and cost functions to
account for inequality clauses, and adds some tests.
---
 src/backend/executor/nodeMergejoin.c | 136 +--
 src/backend/optimizer/path/costsize.c|  27 ++-
 src/backend/optimizer/path/joinpath.c|  27 ++-
 src/backend/optimizer/path/pathkeys.c| 218 ---
 src/backend/optimizer/plan/initsplan.c   |  28 ++-
 src/backend/utils/adt/selfuncs.c |  40 +++--
 src/backend/utils/cache/lsyscache.c  |  40 +
 src/include/nodes/execnodes.h|   5 +
 src/include/optimizer/paths.h|   3 +-
 src/include/utils/lsyscache.h|   1 +
 src/test/regress/expected/join.out   | 250 +++
 src/test/regress/expected/partition_join.out |   4 +
 src/test/regress/sql/join.sql|  66 ++-
 src/test/regress/sql/partition_join.sql  |   5 +
 14 files changed, 750 insertions(+), 100 deletions(-)

diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 7298e1c..d6e5556 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -89,6 +89,45 @@
  *		proceed to another state.  This state is stored in the node's
  *		execution state information and is preserved across calls to
  *		ExecMergeJoin. -cim 10/31/89
+ *
+ * 		This algorithm can work almost as-is when the last join clause
+ * 		is an inequality. This introduces an additional restriction to
+ * 		the ordering of the inputs: when moving to the next outer tuple,
+ * 		the beginning of the matching interval of inner tuples must not
+ * 		change. For example, if the join operator is ">=", inputs must
+ * 		be in ascending order.
+ *
+ * 		Consider this example:
+ * 			outer  >= innner
+ * 1		0 - first match for outer = 1, 2
+ * 2		1 - last match for outer = 1
+ * 		2 - last match for outer = 2
+ *
+ * 		And if the inputs were sorted in descending order:
+ * 			outer  >= inner
+ * 2		2 - first match for outer = 2
+ * 1		1 - first match for outer = 1
+ * 		0 - last match for outer = 1, 2
+ *
+ * 		It can be seen that the beginning of the matching interval of
+ * 		inner tuples changes when we move to the next outer tuple.
+ * 		Supporting this, i.e. testing and advancing the marked tuple,
+ * 		would complicate the join algorithm. Instead of that, we have
+ * 		the planner ensure that the inputs are suitably ordered, and
+ * 		recheck this on 

Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-18 Thread Ashutosh Bapat
On Tue, Jul 10, 2018 at 10:06 PM, Alexander Kuzmenkov
 wrote:
> I tried to fix the things you mentioned and improve the comments. Among
> other changes, there is now a description of how merge join works with
> inequalities at the top of nodeMergejoin.c. It also explains why we only
> support one inequality clause.

Thanks for the commit messages. I would use word "in-equality" instead
of "comparison" since equality is also a comparison.

>
> Some particular points:
>
> On 07/06/2018 04:01 PM, Ashutosh Bapat wrote:
>>
>> -StrategyNumber opstrategy = mergestrategies[iClause];
>> +StrategyNumber sort_strategy = mergestrategies[iClause];
>> -intop_strategy;
>> +intjoin_strategy;
>> I don't see a reason why should we change the name of variable here. These
>> are
>> operator strategies and there's no need to change their names. The name
>> change
>> is introducing unnecessary diffs.
>
>
> These variables have different meaning but their names differ only with an
> underscore. When I had to change this function, I made mistakes because of
> this. I'd keep the descriptive names to avoid further confusion. Should this
> be a separate patch?

No, 0001 suffice. But I am still not sure that the variable name
change is worth the trouble. Anyway, will leave this for a committer
to judge.



>
> This is just a cross-check for the planner. Added a comment. We should
> probably use a separate error code for internal errors as opposed to user
> errors, but I'm not sure if we have one, I see just elog(ERROR) being used
> everywhere.

elog(ERROR) is fine. Thanks for the comments.

-if (op_mergejoinable(opno, exprType(leftarg)) &&
-!contain_volatile_functions((Node *) clause))
-restrictinfo->mergeopfamilies = get_mergejoin_opfamilies(opno);
+if (!contain_volatile_functions((Node *) clause))
+{
+if (op_mergejoinable_equality(opno, exprType(leftarg)))

Why is this condition split. Also why is the change in the order of conditions?

+{
+restrictinfo->mergeopfamilies =
get_btree_equality_opfamilies(opno);
+restrictinfo->is_mj_equality = true;

Comparing this with the original code, I think, is_mj_equality should be true
if restrictinfo->mergeopfamilies is not NIL. There is no way that a clause can
act as an equality clause when there are no families in which the operator is
an equality operator. If restrictinfo->mergeopfamilies can not be NIL here,
probably we should add an Assert and a bit of explanation as to why
is_mj_equality is true.

With this work the meaning of oprcanmerge (See pg_operator catalog and also
CREATE OPERATOR syntax) changes. Every btree operator can now be used to
perform a merge join. oprcanmerge however only indicates whether an operator is
an equality or not. Have you thought about that? Do we require to re-define
oprcanmerge?

+ *
+ * If the inequality clause is not the last one, or if there
are several
+ * of them, this algorithm doesn't work, because it is not possible to
+ * sort the inputs in such a way that given an outer tuple,
the matching
+ * inner tuples form a contiguous interval.

I think, it should be possible to use this technique with more than one
inequality clauses as long as all the operators require the input to be ordered
in the same direction and the clauses are ANDed. In that case the for a given
outer tuple the matching inner tuples form a contiguous interval.

I think it's better to straighten out these things before diving
further into the 2nd patch.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-10 Thread Alexander Kuzmenkov
I tried to fix the things you mentioned and improve the comments. Among 
other changes, there is now a description of how merge join works with 
inequalities at the top of nodeMergejoin.c. It also explains why we only 
support one inequality clause.


Some particular points:

On 07/06/2018 04:01 PM, Ashutosh Bapat wrote:

-StrategyNumber opstrategy = mergestrategies[iClause];
+StrategyNumber sort_strategy = mergestrategies[iClause];
-intop_strategy;
+intjoin_strategy;
I don't see a reason why should we change the name of variable here. These are
operator strategies and there's no need to change their names. The name change
is introducing unnecessary diffs.


These variables have different meaning but their names differ only with 
an underscore. When I had to change this function, I made mistakes 
because of this. I'd keep the descriptive names to avoid further 
confusion. Should this be a separate patch?



I think we should write a wrapper around MJCompare which interprets the result 
rather
than changing MJCompare itself. OR at least change the name of MJCompare.


Renamed the function to MJTestTuples to reflect that it decides whether 
we join tuples or advance either side.




- * for get_mergejoin_opfamilies().
+ * for get_equality_opfamilies().

I think we should leave mergejoin word in there or at least indicate that these
are btree opfamilies so that we don't confuse it with hash equality operator
families.


Renamed these to get_btree_equality_opfamilies() and 
get_btree_comparison_opfamilies().





+if (parent->mj_Ineq_Present)
+elog(ERROR, "inequality mergejoin clause must be the last one");
+

IIUC, this never happens. If it really happens, we have created a path which
can not be used practically. That should never happen. It will help to add a
comment here clarifying that situation.


This is just a cross-check for the planner. Added a comment. We should 
probably use a separate error code for internal errors as opposed to 
user errors, but I'm not sure if we have one, I see just elog(ERROR) 
being used everywhere.





+boolhave_inequality = have_inequality_mergeclause(mergeclauses);

There will be many paths created with different ordering of pathkeys. So,
instead of calling have_inequality_mergeclause() for each of those paths, it's
better to save this status in the path itself when creating the path.


I removed this function altogether, because we can just check the last 
merge clause. When we cost the path, we already have a proper 
mergejoinable list of clauses, so if there is an inequality clause, it's 
the last one.




  /* Make a pathkey list with this guy first */
  if (l != list_head(all_pathkeys))
+{
+if (have_inequality && l == list_tail(all_pathkeys))
+/* Inequality merge clause must be the last, we can't
move it */
+break;
+

I am kind of baffled by this change. IIUC the way we create different orderings
of pathkeys here, we are just rotating the pathkeys in circular order. This
means there is exactly one ordering of pathkeys where the pathkey corresponding
to the inequality clause is the last one.


This code does not rotate the pathkeys circularly, but puts each of them 
in the first position, and keeps the rest in the original order.
Say, if we have three equality pathkeys, and one inequality pathkey at 
the end (let's denote them as E1, E2, E3, IE), the permutations it tries 
will be like this:

E1 E2 E3 IE
E2 E1 E3 IE
E3 E1 E2 IE
Does this sound right?



  /* Might have no mergeclauses */
  if (nClauses == 0)
  return NIL;

+{
+List *ineq_clauses = find_inequality_clauses(mergeclauses);
+
+if (list_length(ineq_clauses) > 1)
+return NIL;

Without this patch, when there is an inequality clause with FULL JOIN, we will
not create a merge join path because select_mergejoin_clauses() will set
mergejoin_allowed to false. This means that we won't call
sort_inner_and_outer(). I think this patch also has to do the same i.e. when
there are more than one inequality clauses, select_mergejoin_clauses() should
set mergejoin_allowed to false in case of a FULL JOIN since merge join
machinary won't be able to handle that case.

If we do that, we could arrange extra.mergeclause_list such that the inequality
clause is always at the end thus finding inequality clause would be easy.


I changed select_mergejoin_clauses() to filter multiple inequality 
clauses and disable join if needed. Now we can use extra inequalities as 
join filter, if it's not full join. I didn't reorder 
extra.mergeclause_list there, because this order is ignored later. 
select_outer_pathkeys_for_merge() chooses the order of pathkeys using 
some heuristics, and then find_mergeclauses_for_outer_pathkeys() 
reorders the clauses accordingly.


--
Alexander Kuzmenkov
Postgres Professional: 

Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-10 Thread Ashutosh Bapat
On Tue, Jul 10, 2018 at 12:05 AM, Alexander Kuzmenkov
 wrote:
> On 07/09/2018 04:12 PM, Ashutosh Bapat wrote:
>>
>> On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
>>  wrote:
>>>
>>> I will continue reviewing the patches.
>>>
>> Here are some more review comments
>
>
>
> Ashutosh,
>
> Many thanks for the review, I'm glad that we are continuing with this patch.
> I'm working on your comments now, will post the updated version this week.

While updating the patches, please consider adding some comments as to
why only single inequality clause supported. I didn't see comments in
the patch explaining that.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-09 Thread Alexander Kuzmenkov

On 07/09/2018 04:12 PM, Ashutosh Bapat wrote:

On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
 wrote:

I will continue reviewing the patches.


Here are some more review comments



Ashutosh,

Many thanks for the review, I'm glad that we are continuing with this 
patch. I'm working on your comments now, will post the updated version 
this week.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-09 Thread Ashutosh Bapat
On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
 wrote:
>
> I will continue reviewing the patches.
>

Here are some more review comments

- * sort ordering for each merge key.  The mergejoinable operator is an
- * equality operator in the opfamily, and the two inputs are guaranteed to be
+ * sort ordering for each merge key.  The mergejoinable operator is a
+ * comparison operator in the opfamily, and the two inputs are guaranteed to be

I think this prologue has to change substantially. At the beginning of the
prologue it explicitly mentions clauses like leftexpr = rightexpr. That
needs to be changed.

  * ordered in either increasing or decreasing (respectively) order according

It looks like the order of inputs is constrained by the in-equality operator.
That too needs to be specified here.

  * This allows us to obtain the needed comparison function from the opfamily.
@@ -200,6 +200,9 @@ MJExamineQuals(List *mergeclauses,
 Oidop_righttype;
 Oidsortfunc;

+if (parent->mj_Ineq_Present)
+elog(ERROR, "inequality mergejoin clause must be the last one");
+

IIUC, this never happens. If it really happens, we have created a path which
can not be used practically. That should never happen. It will help to add a
comment here clarifying that situation.

+boolhave_inequality = have_inequality_mergeclause(mergeclauses);

There will be many paths created with different ordering of pathkeys. So,
instead of calling have_inequality_mergeclause() for each of those paths, it's
better to save this status in the path itself when creating the path.

 /* Make a pathkey list with this guy first */
 if (l != list_head(all_pathkeys))
+{
+if (have_inequality && l == list_tail(all_pathkeys))
+/* Inequality merge clause must be the last, we can't
move it */
+break;
+

I am kind of baffled by this change. IIUC the way we create different orderings
of pathkeys here, we are just rotating the pathkeys in circular order. This
means there is exactly one ordering of pathkeys where the pathkey corresponding
to the inequality clause is the last one. It's only that ordering which will be
retained and all other ordering will be discarded. Instead of that, I think we
should keep the pathkey corresponding to the inequality clause at the end (or
track in separately) and create different orderings of pathkeys by rotating
other pathkeys. This will allow us to cost the orderings as intended by this
fucntion.

 /* Might have no mergeclauses */
 if (nClauses == 0)
 return NIL;

+{
+List *ineq_clauses = find_inequality_clauses(mergeclauses);
+
+if (list_length(ineq_clauses) > 1)
+return NIL;

Without this patch, when there is an inequality clause with FULL JOIN, we will
not create a merge join path because select_mergejoin_clauses() will set
mergejoin_allowed to false. This means that we won't call
sort_inner_and_outer(). I think this patch also has to do the same i.e. when
there are more than one inequality clauses, select_mergejoin_clauses() should
set mergejoin_allowed to false in case of a FULL JOIN since merge join
machinary won't be able to handle that case.

If we do that, we could arrange extra.mergeclause_list such that the inequality
clause is always at the end thus finding inequality clause would be easy.

Again, this is not full review, but I am diving deeper into the
patch-set and understanding it better. Sorry.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] PoC: full merge join on comparison clause

2018-07-06 Thread Ashutosh Bapat
Hi,
I have started reviewing these patches. I haven't grasped the design
yet. But here are some comments on the first patch.


-clauses = (MergeJoinClause) palloc0(nClauses *
sizeof(MergeJoinClauseData));
+parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses *
sizeof(MergeJoinClauseData));

crosses 80 characters.

-StrategyNumber opstrategy = mergestrategies[iClause];
+StrategyNumber sort_strategy = mergestrategies[iClause];
-intop_strategy;
+intjoin_strategy;
I don't see a reason why should we change the name of variable here. These are
operator strategies and there's no need to change their names. The name change
is introducing unnecessary diffs.

+clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args),
(PlanState *) parent);
+clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args),
(PlanState *) parent);

cross 80 characters.

 /*
@@ -378,20 +375,29 @@ MJEvalInnerValues(MergeJoinState *mergestate,
TupleTableSlot *innerslot)
 return result;
 }

+/* Tuple comparison result */
+typedef enum
+{
+MJCR_NextInner = 1,
+MJCR_NextOuter = -1,
+MJCR_Join = 0
+} MJCompareResult;
+
 /*
  * MJCompare
  *
- * Compare the mergejoinable values of the current two input tuples
- * and return 0 if they are equal (ie, the mergejoin equalities all
- * succeed), >0 if outer > inner, <0 if outer < inner.
+ * Compare the mergejoinable values of the current two input tuples.
+ * If they are equal, i.e., the mergejoin equalities all succeed,
+ * return MJCR_Join, if outer > inner, MJCR_NextInner, and else
+ * MJCR_NextOuter.
  *
  * MJEvalOuterValues and MJEvalInnerValues must already have been called
  * for the current outer and inner tuples, respectively.
  */
-static int
+static MJCompareResult
 MJCompare(MergeJoinState *mergestate)
 {

I am not sure about this change as well. MJCompare()'s job is to compare given
keys in the two tuples and return the comparison result. The result was used as
it is to decide which side to advance in an equality based merge join. But for
inequality based merge join the result needs to be interpreted further. I think
we should write a wrapper around MJCompare which interprets the result rather
than changing MJCompare itself. OR at least change the name of MJCompare. The
first option is better in case we use MJCompare for purposes other than merge
join in future. I am not sure what those could be, but say a merge based union
or something like that.

 /*
  * Sweep through the existing EquivalenceClasses looking for matches to
@@ -273,7 +274,7 @@ process_equivalence(PlannerInfo *root,
 /*
  * A "match" requires matching sets of btree opfamilies.  Use of
  * equal() for this test has implications discussed in the comments
- * for get_mergejoin_opfamilies().
+ * for get_equality_opfamilies().

I think we should leave mergejoin word in there or at least indicate that these
are btree opfamilies so that we don't confuse it with hash equality operator
families.

It will be good if you can write something about why these changes are
required in the file. If you are using git format-patch, you could
write a commit message that gets added to the patch. That way, it
leaves there for anybody to review.

I am having a difficult time reading the next patch. There are various
changes in the second patch, which I don't understand the reason
behind. I think some comments will help, in as commit message or in
the code.

I will continue reviewing the patches.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] PoC: full merge join on comparison clause

2018-03-06 Thread Alexander Kuzmenkov

On 05.03.2018 08:30, Ashutosh Bapat wrote:

But creating such patches using git format-patch (with -v as some suggest) 
really
helps in general.



Thanks for the advice. I heard about this workflow, but never used it 
myself. Perhaps it's time to try it.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] PoC: full merge join on comparison clause

2018-03-04 Thread Ashutosh Bapat
On Fri, Mar 2, 2018 at 8:02 PM, Alexander Kuzmenkov
 wrote:
> On 22.02.2018 21:42, Alexander Kuzmenkov wrote:
>>
>>
>> Some basic joins work, but I couldn't properly test all the corner cases
>> with different orderings, because they depend on a bug in vanilla merge
>> joins [1].
>>
>
> The bug was fixed, so here is the rebased patch. The planner part of the
> patch is stable now and can be reviewed, too.
>

Both the patches are named 01. Their names tell the order in which
they need to be applied, so it's ok for these patches. But creating
such patches using git format-patch (with -v as some suggest) really
helps in general. All you need to do is prepare commits in your
repository, one per patch, including changes in each patch in separate
commits and then run git format-patch on that repository. I use git
format-patch @{upstream}, but there are other ways also. Then you can
use git rebase to rebase your patches periodically. If you are already
doing that, sorry for the noise.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] PoC: full merge join on comparison clause

2018-03-02 Thread Alexander Kuzmenkov

On 22.02.2018 21:42, Alexander Kuzmenkov wrote:


Some basic joins work, but I couldn't properly test all the corner 
cases with different orderings, because they depend on a bug in 
vanilla merge joins [1].




The bug was fixed, so here is the rebased patch. The planner part of the 
patch is stable now and can be reviewed, too.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index f50205ec8a..861327b928 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -166,8 +166,8 @@ typedef enum
  * In addition to the expressions themselves, the planner passes the btree
  * opfamily OID, collation OID, btree strategy number (BTLessStrategyNumber or
  * BTGreaterStrategyNumber), and nulls-first flag that identify the intended
- * sort ordering for each merge key.  The mergejoinable operator is an
- * equality operator in the opfamily, and the two inputs are guaranteed to be
+ * sort ordering for each merge key.  The mergejoinable operator is a
+ * comparison operator in the opfamily, and the two inputs are guaranteed to be
  * ordered in either increasing or decreasing (respectively) order according
  * to the opfamily and collation, with nulls at the indicated end of the range.
  * This allows us to obtain the needed comparison function from the opfamily.
@@ -200,6 +200,9 @@ MJExamineQuals(List *mergeclauses,
 		Oid			op_righttype;
 		Oid			sortfunc;
 
+		if (parent->mj_Ineq_Present)
+			elog(ERROR, "inequality mergejoin clause must be the last one");
+
 		if (!IsA(qual, OpExpr))
 			elog(ERROR, "mergejoin clause is not an OpExpr");
 
@@ -225,9 +228,40 @@ MJExamineQuals(List *mergeclauses,
    _strategy,
    _lefttype,
    _righttype);
-		if (join_strategy != BTEqualStrategyNumber)	/* should not happen */
-			elog(ERROR, "cannot merge using non-equality operator %u",
- qual->opno);
+
+		/*
+		 * Determine whether we accept lesser and/or equal tuples of the inner
+		 * relation.
+		 */
+		if (join_strategy != BTEqualStrategyNumber)
+		{
+			parent->mj_Ineq_Present = true;
+			switch (join_strategy)
+			{
+case BTLessEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTLessStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTGreaterStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+case BTGreaterEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTGreaterStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTLessStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+default:
+	elog(ERROR, "unsupported join strategy %d", join_strategy);
+			}
+		}
 
 		/*
 		 * sortsupport routine must know if abbreviation optimization is
@@ -415,6 +449,19 @@ MJCompare(MergeJoinState *mergestate)
 	{
 		MergeJoinClause clause = >mj_Clauses[i];
 		int			sort_result;
+		bool join_equal = true;
+		bool join_lesser = false;
+
+		if (mergestate->mj_Ineq_Present && i == mergestate->mj_NumClauses - 1)
+		{
+			/*
+			 * If the last merge clause is an inequality, check whether
+			 * we have to join the inner tuples that are less than outer
+			 * and/or equal to outer.
+			 */
+			join_equal = mergestate->mj_Ineq_JoinEqual;
+			join_lesser = mergestate->mj_Ineq_JoinLesser;
+		}
 
 		/*
 		 * Special case for NULL-vs-NULL, else use standard comparison.
@@ -429,8 +476,22 @@ MJCompare(MergeJoinState *mergestate)
 		  clause->rdatum, clause->risnull,
 		  >ssup);
 
-		result = sort_result == 0 ? MJCR_Join
-	: sort_result < 0 ? MJCR_NextOuter : MJCR_NextInner;
+		if (sort_result < 0)
+			result = MJCR_NextOuter;
+		else if (sort_result == 0)
+		{
+			if (join_equal)
+result = MJCR_Join;
+			else
+result = MJCR_NextOuter;
+		}
+		else	/* sort_result > 0 */
+		{
+			if (join_lesser)
+result = MJCR_Join;
+			else
+result = MJCR_NextInner;
+		}
 
 		if (result != MJCR_Join)
 			break;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d8db0b29e1..9d3f177622 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -2797,6 +2797,24 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 }
 
 /*
+ * Check whether there is an inequality clause in the list
+ */
+static bool
+have_inequality_mergeclause(List *mergeclauses)
+{
+	ListCell   *lc;
+
+	foreach(lc, mergeclauses)
+	{
+		RestrictInfo *rinfo = castNode(RestrictInfo, lfirst(lc));
+		Assert(rinfo->mergeopfamilies != NIL);
+		if (!rinfo->is_mj_equality)
+			return true;
+	}
+	return 

Re: [HACKERS] PoC: full merge join on comparison clause

2018-02-22 Thread Alexander Kuzmenkov

Here are some updates on this patch.

I split it into two parts. The preparatory part contains some mechanical 
changes to prepare for the main part. Most importantly, a new field is 
added, `RestrictInfo.is_mj_equality`. It is a marker of mergejoinable 
equality clauses, and `RestrictInfo.mergeopfamilies` is a more general 
marker of clauses that are mergejoinable but not necessarily equality. 
The usages are changed accordingly.


The main part consists of executor and planner changes required to 
support inequality merge joins.


The executor changes are as described in the original post.

The planner part has changed significantly since the last version. It 
used to apply some shady hacks to ensure we have the required sort 
orders of inner and outer paths. Now I think I found a reasonable way to 
generate the pathkeys we need. When we sort outer relation in 
`sort_inner_and_outer()`, the pathkeys are generated by 
`select_outer_pathkeys_for_merge()`. When we use the pathkeys we already 
have for the outer relation in `match_unsorted_outer()`, mergeclauses 
are selected by `find_mergeclauses_for_pathkeys()`. I changed these 
functions to select the right pathkey direction for merge clauses, and 
also ensure that we only have a single inequality merge clause and it is 
the last one. Also, to use the index paths, I changed 
`pathkeys_useful_for_merging()` to keep both pathkey directions for 
inequality merge clauses.


Some basic joins work, but I couldn't properly test all the corner cases 
with different orderings, because they depend on a bug in vanilla merge 
joins [1].


To sum up, the preparatory and executor changes are stable, and the 
planner part is WIP.


1. 
https://www.postgresql.org/message-id/flat/5dad9160-4632-0e47-e120-8e2082000...@postgrespro.ru


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index f50205ec8a..861327b928 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -166,8 +166,8 @@ typedef enum
  * In addition to the expressions themselves, the planner passes the btree
  * opfamily OID, collation OID, btree strategy number (BTLessStrategyNumber or
  * BTGreaterStrategyNumber), and nulls-first flag that identify the intended
- * sort ordering for each merge key.  The mergejoinable operator is an
- * equality operator in the opfamily, and the two inputs are guaranteed to be
+ * sort ordering for each merge key.  The mergejoinable operator is a
+ * comparison operator in the opfamily, and the two inputs are guaranteed to be
  * ordered in either increasing or decreasing (respectively) order according
  * to the opfamily and collation, with nulls at the indicated end of the range.
  * This allows us to obtain the needed comparison function from the opfamily.
@@ -200,6 +200,9 @@ MJExamineQuals(List *mergeclauses,
 		Oid			op_righttype;
 		Oid			sortfunc;
 
+		if (parent->mj_Ineq_Present)
+			elog(ERROR, "inequality mergejoin clause must be the last one");
+
 		if (!IsA(qual, OpExpr))
 			elog(ERROR, "mergejoin clause is not an OpExpr");
 
@@ -225,9 +228,40 @@ MJExamineQuals(List *mergeclauses,
    _strategy,
    _lefttype,
    _righttype);
-		if (join_strategy != BTEqualStrategyNumber)	/* should not happen */
-			elog(ERROR, "cannot merge using non-equality operator %u",
- qual->opno);
+
+		/*
+		 * Determine whether we accept lesser and/or equal tuples of the inner
+		 * relation.
+		 */
+		if (join_strategy != BTEqualStrategyNumber)
+		{
+			parent->mj_Ineq_Present = true;
+			switch (join_strategy)
+			{
+case BTLessEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTLessStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTGreaterStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+case BTGreaterEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTGreaterStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTLessStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+default:
+	elog(ERROR, "unsupported join strategy %d", join_strategy);
+			}
+		}
 
 		/*
 		 * sortsupport routine must know if abbreviation optimization is
@@ -415,6 +449,19 @@ MJCompare(MergeJoinState *mergestate)
 	{
 		MergeJoinClause clause = >mj_Clauses[i];
 		int			sort_result;
+		bool join_equal = true;
+		bool join_lesser = false;
+
+		if (mergestate->mj_Ineq_Present && i == mergestate->mj_NumClauses - 1)
+		{
+			/*
+			 * If the last merge clause is an inequality, check whether
+			 * we have to join the inner tuples that are less 

Re: [HACKERS] PoC: full merge join on comparison clause

2018-01-23 Thread Stephen Frost
Greetings Alexander,

* Alexander Kuzmenkov (a.kuzmen...@postgrespro.ru) wrote:
> Here is the patch rebased to a852cfe9.

Thanks for updating it.  This would definitely be nice to have.
Ashutosh, thanks for your previous review, would you have a chance to
look at it again?  Would be great to at least get this to ready for
committer before the end of this CF.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: [HACKERS] PoC: full merge join on comparison clause

2017-12-04 Thread Alexander Kuzmenkov

Here is the patch rebased to a852cfe9.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index ef9e1ee471..c842ed2968 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -172,31 +172,32 @@ typedef enum
  * to the opfamily and collation, with nulls at the indicated end of the range.
  * This allows us to obtain the needed comparison function from the opfamily.
  */
-static MergeJoinClause
+static void
 MJExamineQuals(List *mergeclauses,
 			   Oid *mergefamilies,
 			   Oid *mergecollations,
 			   int *mergestrategies,
 			   bool *mergenullsfirst,
-			   PlanState *parent)
+			   MergeJoinState *parent)
 {
-	MergeJoinClause clauses;
 	int			nClauses = list_length(mergeclauses);
 	int			iClause;
 	ListCell   *cl;
 
-	clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData));
+	parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses * sizeof(MergeJoinClauseData));
+	parent->mj_UseEqual = (bool *) palloc0(nClauses * sizeof(bool));
+	parent->mj_UseLesser = (bool *) palloc0(nClauses * sizeof(bool));
 
 	iClause = 0;
 	foreach(cl, mergeclauses)
 	{
 		OpExpr	   *qual = (OpExpr *) lfirst(cl);
-		MergeJoinClause clause = [iClause];
+		MergeJoinClause clause = >mj_Clauses[iClause];
 		Oid			opfamily = mergefamilies[iClause];
 		Oid			collation = mergecollations[iClause];
-		StrategyNumber opstrategy = mergestrategies[iClause];
+		StrategyNumber sort_op_strategy = mergestrategies[iClause];
 		bool		nulls_first = mergenullsfirst[iClause];
-		int			op_strategy;
+		int			join_op_strategy;
 		Oid			op_lefttype;
 		Oid			op_righttype;
 		Oid			sortfunc;
@@ -207,28 +208,55 @@ MJExamineQuals(List *mergeclauses,
 		/*
 		 * Prepare the input expressions for execution.
 		 */
-		clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), parent);
-		clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), parent);
+		clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args), (PlanState *) parent);
+		clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args), (PlanState *) parent);
 
 		/* Set up sort support data */
 		clause->ssup.ssup_cxt = CurrentMemoryContext;
 		clause->ssup.ssup_collation = collation;
-		if (opstrategy == BTLessStrategyNumber)
+		if (sort_op_strategy == BTLessStrategyNumber)
 			clause->ssup.ssup_reverse = false;
-		else if (opstrategy == BTGreaterStrategyNumber)
+		else if (sort_op_strategy == BTGreaterStrategyNumber)
 			clause->ssup.ssup_reverse = true;
 		else	/* planner screwed up */
-			elog(ERROR, "unsupported mergejoin strategy %d", opstrategy);
+			elog(ERROR, "unsupported mergejoin strategy %d", sort_op_strategy);
 		clause->ssup.ssup_nulls_first = nulls_first;
 
 		/* Extract the operator's declared left/right datatypes */
 		get_op_opfamily_properties(qual->opno, opfamily, false,
-   _strategy,
+   _op_strategy,
    _lefttype,
    _righttype);
-		if (op_strategy != BTEqualStrategyNumber)	/* should not happen */
-			elog(ERROR, "cannot merge using non-equality operator %u",
- qual->opno);
+
+		/*
+		 * Determine whether we accept lesser and/or equal tuples of the inner
+		 * relation.
+		 */
+		switch (join_op_strategy)
+		{
+			case BTEqualStrategyNumber:
+parent->mj_UseEqual[iClause] = true;
+break;
+
+			case BTLessEqualStrategyNumber:
+parent->mj_UseEqual[iClause] = true;
+/* fall through */
+
+			case BTLessStrategyNumber:
+parent->mj_UseLesser[iClause] = true;
+break;
+
+			case BTGreaterEqualStrategyNumber:
+parent->mj_UseEqual[iClause] = true;
+/* fall through */
+
+			case BTGreaterStrategyNumber:
+parent->mj_UseLesser[iClause] = true;
+break;
+
+			default:
+elog(ERROR, "unsupported join strategy %d", join_op_strategy);
+		}
 
 		/*
 		 * sortsupport routine must know if abbreviation optimization is
@@ -265,8 +293,6 @@ MJExamineQuals(List *mergeclauses,
 
 		iClause++;
 	}
-
-	return clauses;
 }
 
 /*
@@ -378,6 +404,14 @@ MJEvalInnerValues(MergeJoinState *mergestate, TupleTableSlot *innerslot)
 	return result;
 }
 
+/* Tuple comparison result */
+typedef enum
+{
+	MJCR_NextInner = 1,
+	MJCR_NextOuter = -1,
+	MJCR_Join = 0
+} MJCompareResult;
+
 /*
  * MJCompare
  *
@@ -388,10 +422,10 @@ MJEvalInnerValues(MergeJoinState *mergestate, TupleTableSlot *innerslot)
  * MJEvalOuterValues and MJEvalInnerValues must already have been called
  * for the current outer and inner tuples, respectively.
  */
-static int
+static MJCompareResult
 MJCompare(MergeJoinState *mergestate)
 {
-	int			result = 0;
+	MJCompareResult result = MJCR_Join;
 	bool		nulleqnull = false;
 	ExprContext *econtext = mergestate->js.ps.ps_ExprContext;
 	int			i;
@@ -408,6 +442,7 @@ MJCompare(MergeJoinState *mergestate)
 	for (i = 0; i < mergestate->mj_NumClauses; i++)
 	{
 		MergeJoinClause clause = >mj_Clauses[i];
+		int