Re: [HACKERS] Range Merge Join v1
On Thu, Aug 31, 2017 at 1:52 AM, Jeff Daviswrote: > Updated patch attached. Changelog: > > * Rebased > * Changed MJCompare to return an enum as suggested, but it has 4 > possible values rather than 3. > * Added support for joining on contains or contained by (@> or <@) and > updated tests. The current patch does not work well with multiple keys, and I believe it's important to solve because it will make it usable for multi-dimension spatial joins. The problem is this: the algorithm for a single key demands that the input is sorted by (lower(r) NULLS FIRST, upper(r) NULLS LAST). That's easy, because that's also the sort order for the range operator class, so everything just works. For multiple keys, the order of the input is: lower(r1) NULLS FIRST, lower(r2) NULLS FIRST, upper(r1) NULLS LAST, upper(r2) NULLS LAST But that can't match up with any opclass, because an opclass can only order one attribute at a time. In this case, the lower bound of r2 is more significant than the upper bound of r1. It's easy enough to adapt the execution to make this work as long as the input is properly sorted. The challenge is making the optimizer choose the sort keys properly. I have tried a few approaches. The current approach that I'm using is: * have a new, special range operator family with no operator classes * in check_mergejoinable(), detect the && operator and set the mergeopfamilies to contain only the special operator family * in try_mergejoin_path(), it will convert the pathkeys using that operator class into pathkeys over a "lower" expression over the same EC; and then add on additional pathkeys for the "upper" expressions onto the end of the pathkey list Any comments or alternative suggestions welcome. This will probably take a few days at least, so I put the patch in "waiting on author" state. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
On Fri, Aug 25, 2017 at 10:19 AM, Alexander Kuzmenkovwrote: > Hi Jeff, Hi, Thank you for the review and suggestions! > * At the moment, "mergejoinable clause" and "equality clause" mean the same > thing to the planner, and those clauses are used both to create equivalence > classes and to perform merge joins. If we start performing merge joins for > different kinds of clauses, such as comparison or range intersection, it > makes sense to separate these two meanings. I tried this in my patch and it > didn't require many changes. I use RestrictInfo.equivopfamilies to build > equivalence classes, and RestrictInfo.mergeopfamilies for mergejoinable > clauses. I like the idea. I will look in more detail and I can either change my patch or piggyback on yours. > * Semantically, MJCompare() is a function that determines whether you should > emit a join result or advance one of the pointers. This meaning is not > explicit in the code and is not well encapsulated: the function returns and > int and 'compareNoMatch' flag, and the calling code combines them in various > ways to derive the final result. This meaning can be made explicit by making > MJCompare return enum values {Join, NextInner, NextOuter}, and putting > inside it all the logic that decides whether we join or advance. > ExecMergeJoin looks intimidating already, and I think this change would help > readability. Again, you can find an illustration in my patch. I actually tried doing something similar in my patch, but there are four cases to consider in EXEC_MJ_SKIP_TEST: 1. Overlaps: mark and then join the tuples 2. left-of: SKIPOUTER_ADVANCE 3. right-of: SKIPINNER_ADVANCE 4. None of the above: mark and NEXTINNER However, you are right that the current code is ugly, and I should refactor into 4 explicit cases. I don't think I will call them by the same names as the join states, though, because there's not a direct mapping. Updated patch attached. Changelog: * Rebased * Changed MJCompare to return an enum as suggested, but it has 4 possible values rather than 3. * Added support for joining on contains or contained by (@> or <@) and updated tests. Regards, Jeff Davis diff --git a/doc/src/sgml/rangetypes.sgml b/doc/src/sgml/rangetypes.sgml index 9557c16..84578a7 100644 *** a/doc/src/sgml/rangetypes.sgml --- b/doc/src/sgml/rangetypes.sgml *** *** 522,525 INSERT 0 1 --- 522,595 + + Range Join + + + range type + join + + + + A Range Join is a join where one of the join conditions is the range + overlaps operator (see ). In other + words, rather than returning rows where corresponding fields are equal, it + returns rows where the corresponding fields overlap. + + + For instance, to find users who were active on a website at the same time + as they were using a mobile app, you might issue a query like: + + CREATE TABLE web_session( + web_session_id text primary key, + username text, + during tstzrange); + CREATE TABLE app_session( + app_session_id text primary key, + username text, + during tstzrange); + INSERT INTO web_session VALUES + ('0cc175b9c0f1b6a8', 'user1', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('92eb5ffee6ae2fec', 'user2', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('4a8a08f09d37b737', 'user3', '[2017-02-01 09:30, 2017-02-01 10:45)'); + INSERT INTO app_session VALUES + ('8277e0910d750195', 'user1', '[2017-02-01 10:30, 2017-02-01 11:45)'), + ('b448797616e091ad', 'user3', '[2017-02-01 09:00, 2017-02-01 11:00)'), + ('95649038408b5f33', 'user4', '[2017-02-01 09:30, 2017-02-01 10:45)'); + + SELECT w.username, +w.during * a.during AS both_during + FROM web_session w, app_session a + WHERE w.username = a.username + AND w.during && a.during; + username | both_during + --+- + user1| ["2017-02-01 10:30:00-08","2017-02-01 10:45:00-08") + user3| ["2017-02-01 09:30:00-08","2017-02-01 10:45:00-08") + (2 rows) + + + + In addition to the general execution strategies available, Postgres also + has special support for a variant of Merge Join called Range Merge Join: + + EXPLAIN (costs off) SELECT w.username, +w.during * a.during AS both_during + FROM web_session w, app_session a + WHERE w.username = a.username + AND w.during && a.during; + QUERY PLAN + -- + Range Merge Join +Merge Cond: ((w.during && a.during) AND (w.username = a.username)) +-> Sort + Sort Key: w.during, w.username + -> Seq Scan on web_session w +-> Sort + Sort Key: a.during, a.username + -> Seq Scan on app_session a + (8 rows) + + + diff --git
Re: [HACKERS] Range Merge Join v1
Hi Jeff, Fast range joins are very nice to have, thank you for working on this. I've been doing a somewhat related thing recently, trying to teach the merge join executor to perform full joins on comparison clauses [1]. I have some comments after reading your patch: * At the moment, "mergejoinable clause" and "equality clause" mean the same thing to the planner, and those clauses are used both to create equivalence classes and to perform merge joins. If we start performing merge joins for different kinds of clauses, such as comparison or range intersection, it makes sense to separate these two meanings. I tried this in my patch and it didn't require many changes. I use RestrictInfo.equivopfamilies to build equivalence classes, and RestrictInfo.mergeopfamilies for mergejoinable clauses. * Semantically, MJCompare() is a function that determines whether you should emit a join result or advance one of the pointers. This meaning is not explicit in the code and is not well encapsulated: the function returns and int and 'compareNoMatch' flag, and the calling code combines them in various ways to derive the final result. This meaning can be made explicit by making MJCompare return enum values {Join, NextInner, NextOuter}, and putting inside it all the logic that decides whether we join or advance. ExecMergeJoin looks intimidating already, and I think this change would help readability. Again, you can find an illustration in my patch. I hope you find these points helpful. [1] https://www.postgresql.org/message-id/flat/1d23ad41-a9d9-1d1e-1d79-83b002d6f776%40postgrespro.ru#1d23ad41-a9d9-1d1e-1d79-83b002d6f...@postgrespro.ru -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
2017-06-02 19:42 GMT+05:00 Jeff Davis: > On Tue, May 30, 2017 at 11:44 PM, Andrew Borodin wrote: >> 1. Are there any types, which could benefit from Range Merge and are >> not covered by this patch? > > I thought about this for a while, and the only thing I can think of > are range joins that don't explicitly use range types. Let me try to write && in SQL select * from a join b where (a.min<=b.max and a.min>=b.min) or (a.max<=b.max and a.max>=b.min); Quite complicated. Here user knows that min <= max, but DB don't. If we write it precisely we get hell lot of permutations. Here's what I think: 1. For me, this feature seems hard to implement. 2. This feature also will be hard to commit solely, since it's use case will be rare. 3. If this could yield unexpected performance for queries like select * from t where xc or a!=z or [other conditions] and optimizer could think: "Aha! I'll sort it and do it fast" it'd be cool. I do not think range joins that don't explicitly use range types are possible right now... >> 2. Can Range Merge handle merge of different ranges? Like int4range() >> && int8range() ? > > Right now, there aren't even casts between range types. I think the > best way to handle that at this point would be to add casts among the > numeric ranges. There may be advantages to supporting any two ranges > where the contained types are part of the same opfamily, but it seems > a little early to add that complication. Agree. > I think there are a couple more things that could be done if we want > to. Let me know if you think these things should be done now, or if > they should be a separate patch later when the need arises: > > * Support for r1 @> r2 joins (join on "contains" rather than "overlaps"). > * Better integration with the catalog so that users could add their > own types that support range merge join. 1. I believe this changes can be incremental. They constitute value for the end user, then they are committable. This patch is not overly complicated, but it's easier to do this stuff in small pieces. 2. The commitfest is 3 months away. If you will have new versions of the patch, I'll review again. Maybe will spot some new things :) Best regards, Andrey Borodin, Octonica. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
On Tue, May 30, 2017 at 11:44 PM, Andrew Borodinwrote: > Hi, Jeff! Hi! > Sorry for being late. Actually, I had several unsuccessful attempts to > find something wrong with the patch. > Here's my review. > > in pathkey.c > > ecs = (EquivalenceClass **) palloc(nClauses * sizeof(EquivalenceClass *)); > scores = (int *) palloc(nClauses * sizeof(int)); > range_ecs = palloc(nClauses * sizeof(bool)); > > Third assignment has no cast. Will fix. > And I have few questions: > 1. Are there any types, which could benefit from Range Merge and are > not covered by this patch? I thought about this for a while, and the only thing I can think of are range joins that don't explicitly use range types. > 2. Can Range Merge handle merge of different ranges? Like int4range() > && int8range() ? Right now, there aren't even casts between range types. I think the best way to handle that at this point would be to add casts among the numeric ranges. There may be advantages to supporting any two ranges where the contained types are part of the same opfamily, but it seems a little early to add that complication. > My perf test script from the previous message was broken, here's fixed > one in the attachment. > > This patch implements feature, contains new tests and passes old > tests, is documented and spec compliant. I do not see any reason why > not mark it "Ready for committer". Great! I think there are a couple more things that could be done if we want to. Let me know if you think these things should be done now, or if they should be a separate patch later when the need arises: * Support for r1 @> r2 joins (join on "contains" rather than "overlaps"). * Better integration with the catalog so that users could add their own types that support range merge join. Thank you for the review. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
Hi, Jeff! Sorry for being late. Actually, I had several unsuccessful attempts to find something wrong with the patch. Here's my review. in pathkey.c ecs = (EquivalenceClass **) palloc(nClauses * sizeof(EquivalenceClass *)); scores = (int *) palloc(nClauses * sizeof(int)); range_ecs = palloc(nClauses * sizeof(bool)); Third assignment has no cast. And I have few questions: 1. Are there any types, which could benefit from Range Merge and are not covered by this patch? 2. Can Range Merge handle merge of different ranges? Like int4range() && int8range() ? My perf test script from the previous message was broken, here's fixed one in the attachment. This patch implements feature, contains new tests and passes old tests, is documented and spec compliant. I do not see any reason why not mark it "Ready for committer". Best regards, Andrey Borodin. \timing create table r as select int4range(g, g+10) ir, g g from generate_series(1,100) g order by random(); create index r_idx on r using gist (ir); create table s as select int4range(g+5, g+15) ir,g g from generate_series(1,100) g order by random(); create index s_idx on s using gist (ir); vacuum analyze r; vacuum analyze s; --baseline performance using GiST set enable_mergejoin=false; explain analyze select * from r, s where r.ir && s.ir; explain analyze select * from r, s where r.ir && s.ir; --performance without GiST set enable_mergejoin=true; explain analyze select * from r, s where r.ir && s.ir; explain analyze select * from r, s where r.ir && s.ir; --performance in presence of expression index create index r_idx1 on r(int4range(r.g, r.g+10)); create index s_idx1 on s(int4range(s.g+5, s.g+15)); explain analyze select * from r, s where int4range(r.g, r.g+10) && int4range(s.g+5, s.g+15); explain analyze select * from r, s where int4range(r.g, r.g+10) && int4range(s.g+5, s.g+15); --performance in precence of direct B-tree index create index r_idx2 on r(ir); create index s_idx2 on s(ir); explain analyze select * from r, s where r.ir && s.ir; explain analyze select * from r, s where r.ir && s.ir; drop table r; drop table s; --here we test that performance is not affected by payload of other attributes in heap tuples create table r as select int4range(g, g+10) ir, g g, 1::float pl1,1::float pl2,1::float pl3,1::float pl4,1::float pl5,1::float pl6,1::float pl7,1::float pl8,1::float pl9,1::float pl0 from generate_series(1,100) g order by random(); create table s as select int4range(g+5, g+15) ir,g g, 1::float pl1,1::float pl2,1::float pl3,1::float pl4,1::float pl5,1::float pl6,1::float pl7,1::float pl8,1::float pl9,1::float pl0 from generate_series(1,100) g order by random(); explain analyze select r.ir,s.ir from r, s where r.ir && s.ir; explain analyze select r.ir,s.ir from r, s where r.ir && s.ir; drop table r; drop table s; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
Hi, Jeff! I'm planning to provide the review for this patch in future. I've done some performance tesing (see attachment). All in all, nothing important, everything works fine. Tests were executed under current HEAD on Ubuntu 16 over MacBook Air. I observe that: 1. Patch brings performance improvement for specified query 2. Performance improvement of Range Merge Join vs GiST Nested Loop Join (on Index Only Scan) is between 4x and 6x 3. Performance improvement of RMJ with B-tree index vs GiST is between 4x and 10x (due to lack of actually competing cases, here I omit T-tests, they are not that important when speaking about 4x difference) 4. Range Merge Join is capable to consume expressional index with exactly same effect as direct index 5. Other attributes residing in joined tables do not affect performance improvement I'll make code review some time later, probably next week. (I hope there is no urge in this, since commitfest is in July, or is it urgent?) BTW fixe typo in index specification of original performance test (both indexes were for same table) Best regards, Andrey Borodin. perf2.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Merge Join v1
On Thu, Apr 20, 2017 at 2:05 AM, Rafia Sabihwrote: > Looks like an interesting idea, however, in an attempt to test this patch I > found following error when compiling, > selfuncs.c: In function ‘mergejoinscansel’: > selfuncs.c:2901:12: error: ‘op_strategy’ undeclared (first use in this > function) >_strategy, > Looks like filterdiff destroyed my patch from git. Attaching unified version against master 3820c63d. Thanks! Jeff Davis diff --git a/doc/src/sgml/rangetypes.sgml b/doc/src/sgml/rangetypes.sgml index 9557c16..84578a7 100644 --- a/doc/src/sgml/rangetypes.sgml +++ b/doc/src/sgml/rangetypes.sgml @@ -522,4 +522,74 @@ INSERT 0 1 + + Range Join + + +range type +join + + + +A Range Join is a join where one of the join conditions is the range +overlaps operator (see ). In other +words, rather than returning rows where corresponding fields are equal, it +returns rows where the corresponding fields overlap. + + +For instance, to find users who were active on a website at the same time +as they were using a mobile app, you might issue a query like: + +CREATE TABLE web_session( +web_session_id text primary key, +username text, +during tstzrange); +CREATE TABLE app_session( +app_session_id text primary key, +username text, +during tstzrange); +INSERT INTO web_session VALUES +('0cc175b9c0f1b6a8', 'user1', '[2017-02-01 09:30, 2017-02-01 10:45)'), +('92eb5ffee6ae2fec', 'user2', '[2017-02-01 09:30, 2017-02-01 10:45)'), +('4a8a08f09d37b737', 'user3', '[2017-02-01 09:30, 2017-02-01 10:45)'); +INSERT INTO app_session VALUES +('8277e0910d750195', 'user1', '[2017-02-01 10:30, 2017-02-01 11:45)'), +('b448797616e091ad', 'user3', '[2017-02-01 09:00, 2017-02-01 11:00)'), +('95649038408b5f33', 'user4', '[2017-02-01 09:30, 2017-02-01 10:45)'); + +SELECT w.username, + w.during * a.during AS both_during +FROM web_session w, app_session a +WHERE w.username = a.username +AND w.during && a.during; + username | both_during +--+- + user1| ["2017-02-01 10:30:00-08","2017-02-01 10:45:00-08") + user3| ["2017-02-01 09:30:00-08","2017-02-01 10:45:00-08") +(2 rows) + + + +In addition to the general execution strategies available, Postgres also +has special support for a variant of Merge Join called Range Merge Join: + + EXPLAIN (costs off) SELECT w.username, + w.during * a.during AS both_during +FROM web_session w, app_session a +WHERE w.username = a.username +AND w.during && a.during; + QUERY PLAN +-- + Range Merge Join + Merge Cond: ((w.during && a.during) AND (w.username = a.username)) + -> Sort + Sort Key: w.during, w.username + -> Seq Scan on web_session w + -> Sort + Sort Key: a.during, a.username + -> Seq Scan on app_session a +(8 rows) + + + diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 9359d0a..1010668 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -909,7 +909,11 @@ ExplainNode(PlanState *planstate, List *ancestors, pname = sname = "Nested Loop"; break; case T_MergeJoin: - pname = "Merge"; /* "Join" gets added by jointype switch */ + if (((MergeJoin*)plan)->mergeRangeJoin) +pname = "Range Merge"; /* "Join" gets added by jointype switch */ + else +pname = "Merge"; /* "Join" gets added by jointype switch */ + sname = "Merge Join"; break; case T_HashJoin: diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c index 8c483bf..bd312e6 100644 --- a/src/backend/executor/nodeMergejoin.c +++ b/src/backend/executor/nodeMergejoin.c @@ -89,14 +89,67 @@ * proceed to another state. This state is stored in the node's * execution state information and is preserved across calls to * ExecMergeJoin. -cim 10/31/89 + * + * RANGE MERGE JOIN + * + * Range Merge Join is a generalization of merge join. When a join + * predicate involves the range overlaps operator (&&), a merge join + * becomes a range join. + * + * The input ranges must be ordered by (lower-bound, upper-bound), which + * is the ordinary range_ops operator class. MJCompare must not simply + * use the operator class's comparison semantics though; instead it + * follows these rules: + * + * - return 0 if the outer range overlaps the inner range + * - return <0 if the outer range is strictly left-of the inner range + * - return >0 if the outer range is strictly right-of the inner range + * + * NB: the above is a simplification considering only a single merge + * clause. When there are multiple merge clauses, it's possible that one + * tuple is
Re: [HACKERS] Range Merge Join v1
On Thu, Apr 6, 2017 at 2:13 PM, Jeff Daviswrote: > > Example: > > > Find different people using the same website at the same time: > > create table session(sessionid text, username text, during tstzrange); > SELECT s1.username, s2.username, s1.during * s2.during > FROM session s1, session s2 > WHERE s1.during && s2.during AND s1.username < s2.username > > > = > Brief summary of previous discussion: > = > > http://www.postgresql.org/message-id/1334554850.10878.52.camel@jdavis > > - can indexes solve it already (parameterized paths, etc.)? > - planner complexity (track path keys, mergejoinable equality, etc.) > - spatial join algorithm choice > - compare range merge join against existing indexes to see if any > indexing approach is viable > > == > Externals: > == > > No new syntax or other externals. Just improved execution strategy for > joining ranges using the overlaps operator (&&). > > New syntax is possible, but I don't see a strong reason to support > special range join syntax at this time. > > = > Optimizer Design > = > > I took the path of least resistance, so if I am doing something wrong > please let me know. I hardwired it to look for the range overlaps > operator when checking if it could do a merge join, and then add > range_ops to restrictinfo->mergeopfamilies. > > Then, I have to suppress adding it as an equivalence class, because > overlaps is not equality. It still adds single-member ECs to > restrictinfo->right_ec and left_ec, but (I think) that's OK. > > Also, I have to prevent generating paths for right/full join, because > the range join algorithm can't (currently) handle those. > > = > Costing > = > > Needs more consideration. Seems to do reasonable things in the few > examples I tried. > > = > Executor Design > = > > See detailed comments in nodeMergejoin.c > > = > Performance > = > > Seems much better than index nest loop join when the join is not > selective. I will post detailed numbers soon. > > If no index is available, range merge join is the only reasonable way > to execute a range join. For instance, if the inner is not a leaf in > the plan. > > = > Alternatives: > = > > It was suggested that I approach it as a general spatial-join > problem. That has merit, but I rejected it for now because the > algorithm that I think has the most promise is range-oriented > anyway. By that I mean we would need to extract all of the dimensions > into ranges first, and then perform the join. With that in mind, it > makes sense to implement range joins first; and then later provide the > APIs to get at the spatial dimensions so that we can implement other > spatial joins as range joins. > > Another suggestion was to base it on hash join, but I never understood > that proposal and it didn't seem to map very well to a spatial join. > > Yet another suggestion was to use some kind of temporary index. Some > brief experiments I did indicated that it would be fairly slow (though > more investigation might be useful here). Also, it doesn't provide any > alternative to the nestloop-with-inner-index we already offer at the > leaf level today. > > Regards, > Jeff Davis > Looks like an interesting idea, however, in an attempt to test this patch I found following error when compiling, selfuncs.c: In function ‘mergejoinscansel’: selfuncs.c:2901:12: error: ‘op_strategy’ undeclared (first use in this function) _strategy, -- Regards, Rafia Sabih EnterpriseDB: http://www.enterprisedb.com/
Re: [HACKERS] Range Merge Join v1
On Tue, Apr 11, 2017 at 12:17 AM, Jeff Daviswrote: > Version 2 attached. Fixed a few issues, expanded tests, added docs. It looks like the CF app only listed my perf test script. Re-attaching rangejoin-v2.patch so that it appears in the CF app. Identical to other rangejoin-v2.patch. Regards, Jeff Davis diff --git a/doc/src/sgml/rangetypes.sgml b/doc/src/sgml/rangetypes.sgml index 9557c16..84578a7 100644 *** a/doc/src/sgml/rangetypes.sgml --- b/doc/src/sgml/rangetypes.sgml *** *** 522,525 INSERT 0 1 --- 522,595 + + Range Join + + + range type + join + + + + A Range Join is a join where one of the join conditions is the range + overlaps operator (see ). In other + words, rather than returning rows where corresponding fields are equal, it + returns rows where the corresponding fields overlap. + + + For instance, to find users who were active on a website at the same time + as they were using a mobile app, you might issue a query like: + + CREATE TABLE web_session( + web_session_id text primary key, + username text, + during tstzrange); + CREATE TABLE app_session( + app_session_id text primary key, + username text, + during tstzrange); + INSERT INTO web_session VALUES + ('0cc175b9c0f1b6a8', 'user1', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('92eb5ffee6ae2fec', 'user2', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('4a8a08f09d37b737', 'user3', '[2017-02-01 09:30, 2017-02-01 10:45)'); + INSERT INTO app_session VALUES + ('8277e0910d750195', 'user1', '[2017-02-01 10:30, 2017-02-01 11:45)'), + ('b448797616e091ad', 'user3', '[2017-02-01 09:00, 2017-02-01 11:00)'), + ('95649038408b5f33', 'user4', '[2017-02-01 09:30, 2017-02-01 10:45)'); + + SELECT w.username, +w.during * a.during AS both_during + FROM web_session w, app_session a + WHERE w.username = a.username + AND w.during && a.during; + username | both_during + --+- + user1| ["2017-02-01 10:30:00-08","2017-02-01 10:45:00-08") + user3| ["2017-02-01 09:30:00-08","2017-02-01 10:45:00-08") + (2 rows) + + + + In addition to the general execution strategies available, Postgres also + has special support for a variant of Merge Join called Range Merge Join: + + EXPLAIN (costs off) SELECT w.username, +w.during * a.during AS both_during + FROM web_session w, app_session a + WHERE w.username = a.username + AND w.during && a.during; + QUERY PLAN + -- + Range Merge Join +Merge Cond: ((w.during && a.during) AND (w.username = a.username)) +-> Sort + Sort Key: w.during, w.username + -> Seq Scan on web_session w +-> Sort + Sort Key: a.during, a.username + -> Seq Scan on app_session a + (8 rows) + + + diff --git a/src/backend/commands/eindex 9359d0a..1010668 100644 *** a/src/backend/commands/explain.c --- b/src/backend/commands/explain.c *** *** 909,915 ExplainNode(PlanState *planstate, List *ancestors, pname = sname = "Nested Loop"; break; case T_MergeJoin: ! pname = "Merge"; /* "Join" gets added by jointype switch */ sname = "Merge Join"; break; case T_HashJoin: --- 909,919 pname = sname = "Nested Loop"; break; case T_MergeJoin: ! if (((MergeJoin*)plan)->mergeRangeJoin) ! pname = "Range Merge"; /* "Join" gets added by jointype switch */ ! else ! pname = "Merge"; /* "Join" gets added by jointype switch */ ! sname = "Merge Join"; break; case T_HashJoin: diff --git a/src/backend/executor/nodindex 8c483bf..bd312e6 100644 *** a/src/backend/executor/nodeMergejoin.c --- b/src/backend/executor/nodeMergejoin.c *** *** 89,102 --- 89,155 * proceed to another state. This state is stored in the node's * execution state information and is preserved across calls to * ExecMergeJoin. -cim 10/31/89 + * + * RANGE MERGE JOIN + * + * Range Merge Join is a generalization of merge join. When a join + * predicate involves the range overlaps operator (&&), a merge join + * becomes a range join. + * + * The input ranges must be ordered by (lower-bound, upper-bound), which + * is the ordinary range_ops operator class. MJCompare must not simply + * use the operator class's comparison semantics though; instead it + * follows these rules: + * + * - return 0 if the outer range overlaps the inner range + * - return <0 if the outer range is strictly left-of the inner range + * - return >0 if the outer range is strictly right-of the inner range + * + * NB: the above is a simplification considering only a single merge + * clause.
Re: [HACKERS] Range Merge Join v1
Version 2 attached. Fixed a few issues, expanded tests, added docs. A simple performance test (script attached) shows about a 5X improvement when comparing against a nested loop with an inner index-only scan over a gist index. Even better, this doesn't require an index, so it will work even if the input relations are subqueries. Regards, Jeff Davis On Thu, Apr 6, 2017 at 1:43 AM, Jeff Daviswrote: > > Example: > > > Find different people using the same website at the same time: > > create table session(sessionid text, username text, during tstzrange); > SELECT s1.username, s2.username, s1.during * s2.during > FROM session s1, session s2 > WHERE s1.during && s2.during AND s1.username < s2.username > > > = > Brief summary of previous discussion: > = > > http://www.postgresql.org/message-id/1334554850.10878.52.camel@jdavis > > - can indexes solve it already (parameterized paths, etc.)? > - planner complexity (track path keys, mergejoinable equality, etc.) > - spatial join algorithm choice > - compare range merge join against existing indexes to see if any > indexing approach is viable > > == > Externals: > == > > No new syntax or other externals. Just improved execution strategy for > joining ranges using the overlaps operator (&&). > > New syntax is possible, but I don't see a strong reason to support > special range join syntax at this time. > > = > Optimizer Design > = > > I took the path of least resistance, so if I am doing something wrong > please let me know. I hardwired it to look for the range overlaps > operator when checking if it could do a merge join, and then add > range_ops to restrictinfo->mergeopfamilies. > > Then, I have to suppress adding it as an equivalence class, because > overlaps is not equality. It still adds single-member ECs to > restrictinfo->right_ec and left_ec, but (I think) that's OK. > > Also, I have to prevent generating paths for right/full join, because > the range join algorithm can't (currently) handle those. > > = > Costing > = > > Needs more consideration. Seems to do reasonable things in the few > examples I tried. > > = > Executor Design > = > > See detailed comments in nodeMergejoin.c > > = > Performance > = > > Seems much better than index nest loop join when the join is not > selective. I will post detailed numbers soon. > > If no index is available, range merge join is the only reasonable way > to execute a range join. For instance, if the inner is not a leaf in > the plan. > > = > Alternatives: > = > > It was suggested that I approach it as a general spatial-join > problem. That has merit, but I rejected it for now because the > algorithm that I think has the most promise is range-oriented > anyway. By that I mean we would need to extract all of the dimensions > into ranges first, and then perform the join. With that in mind, it > makes sense to implement range joins first; and then later provide the > APIs to get at the spatial dimensions so that we can implement other > spatial joins as range joins. > > Another suggestion was to base it on hash join, but I never understood > that proposal and it didn't seem to map very well to a spatial join. > > Yet another suggestion was to use some kind of temporary index. Some > brief experiments I did indicated that it would be fairly slow (though > more investigation might be useful here). Also, it doesn't provide any > alternative to the nestloop-with-inner-index we already offer at the > leaf level today. > > Regards, > Jeff Davis perf.sql Description: application/sql diff --git a/doc/src/sgml/rangetypes.sgml b/doc/src/sgml/rangetypes.sgml index 9557c16..84578a7 100644 *** a/doc/src/sgml/rangetypes.sgml --- b/doc/src/sgml/rangetypes.sgml *** *** 522,525 INSERT 0 1 --- 522,595 + + Range Join + + + range type + join + + + + A Range Join is a join where one of the join conditions is the range + overlaps operator (see ). In other + words, rather than returning rows where corresponding fields are equal, it + returns rows where the corresponding fields overlap. + + + For instance, to find users who were active on a website at the same time + as they were using a mobile app, you might issue a query like: + + CREATE TABLE web_session( + web_session_id text primary key, + username text, + during tstzrange); + CREATE TABLE app_session( + app_session_id text primary key, + username text, + during tstzrange); + INSERT INTO web_session VALUES + ('0cc175b9c0f1b6a8', 'user1', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('92eb5ffee6ae2fec', 'user2', '[2017-02-01 09:30, 2017-02-01 10:45)'), + ('4a8a08f09d37b737', 'user3', '[2017-02-01 09:30, 2017-02-01
[HACKERS] Range Merge Join v1
Example: Find different people using the same website at the same time: create table session(sessionid text, username text, during tstzrange); SELECT s1.username, s2.username, s1.during * s2.during FROM session s1, session s2 WHERE s1.during && s2.during AND s1.username < s2.username = Brief summary of previous discussion: = http://www.postgresql.org/message-id/1334554850.10878.52.camel@jdavis - can indexes solve it already (parameterized paths, etc.)? - planner complexity (track path keys, mergejoinable equality, etc.) - spatial join algorithm choice - compare range merge join against existing indexes to see if any indexing approach is viable == Externals: == No new syntax or other externals. Just improved execution strategy for joining ranges using the overlaps operator (&&). New syntax is possible, but I don't see a strong reason to support special range join syntax at this time. = Optimizer Design = I took the path of least resistance, so if I am doing something wrong please let me know. I hardwired it to look for the range overlaps operator when checking if it could do a merge join, and then add range_ops to restrictinfo->mergeopfamilies. Then, I have to suppress adding it as an equivalence class, because overlaps is not equality. It still adds single-member ECs to restrictinfo->right_ec and left_ec, but (I think) that's OK. Also, I have to prevent generating paths for right/full join, because the range join algorithm can't (currently) handle those. = Costing = Needs more consideration. Seems to do reasonable things in the few examples I tried. = Executor Design = See detailed comments in nodeMergejoin.c = Performance = Seems much better than index nest loop join when the join is not selective. I will post detailed numbers soon. If no index is available, range merge join is the only reasonable way to execute a range join. For instance, if the inner is not a leaf in the plan. = Alternatives: = It was suggested that I approach it as a general spatial-join problem. That has merit, but I rejected it for now because the algorithm that I think has the most promise is range-oriented anyway. By that I mean we would need to extract all of the dimensions into ranges first, and then perform the join. With that in mind, it makes sense to implement range joins first; and then later provide the APIs to get at the spatial dimensions so that we can implement other spatial joins as range joins. Another suggestion was to base it on hash join, but I never understood that proposal and it didn't seem to map very well to a spatial join. Yet another suggestion was to use some kind of temporary index. Some brief experiments I did indicated that it would be fairly slow (though more investigation might be useful here). Also, it doesn't provide any alternative to the nestloop-with-inner-index we already offer at the leaf level today. Regards, Jeff Davis diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a18ab43..1110c1e 100644 *** a/src/backend/commands/explain.c --- b/src/backend/commands/explain.c *** *** 909,915 ExplainNode(PlanState *planstate, List *ancestors, pname = sname = "Nested Loop"; break; case T_MergeJoin: ! pname = "Merge"; /* "Join" gets added by jointype switch */ sname = "Merge Join"; break; case T_HashJoin: --- 909,919 pname = sname = "Nested Loop"; break; case T_MergeJoin: ! if (((MergeJoin*)plan)->mergeRangeJoin) ! pname = "Range Merge"; /* "Join" gets added by jointype switch */ ! else ! pname = "Merge"; /* "Join" gets added by jointype switch */ ! sname = "Merge Join"; break; case T_HashJoin: diff --git a/src/backend/executor/nodindex 62784af..f3375c3 100644 *** a/src/backend/executor/nodeMergejoin.c --- b/src/backend/executor/nodeMergejoin.c *** *** 89,102 --- 89,151 * proceed to another state. This state is stored in the node's * execution state information and is preserved across calls to * ExecMergeJoin. -cim 10/31/89 + * + * RANGE MERGE JOIN + * + * Range Merge Join is a generalization of merge join. When a join + * predicate involves the range overlaps operator (&&), a merge join + * becomes a range join. + * + * The input ranges must be ordered by (lower-bound, upper-bound), which + * is the ordinary range_ops operator class. MJCompare must not simply + * use the operator class's comparison semantics though; instead it + * follows these rules: + * + * - return 0 if the outer range overlaps the inner range + * - return <0 if the outer range is strictly left-of the inner range + * - return >0 if the outer range is strictly right-of the inner range +