Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2013-01-18 Thread Jeff Davis
On Thu, 2013-01-17 at 21:03 +0100, Stefan Keller wrote:
 Hi Jeff

 I'm perhaps really late to this discussion, but I was just made aware
 of it via Josh Berkus's tweet about PostgreSQL 9.3: Current
 Feature Status
 
 What is the reason to dig into spatial joins when PostGIS is already
 a bullet-proof and fast implementation?
 

Hi Stefan,

You are certainly not too late.

PostGIS uses the existing postgres infrastructure to do spatial joins.
That means it either does a Cartesian product and filters the results, or
it uses a nested loop with an inner index scan.

That isn't too bad, but it could be better. I am trying to introduce a
new way to do spatial joins which will perform better in more
circumstances. For instance, we can't use an inner index if the input
tables are actually subqueries, because we can't index a subquery.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2013-01-18 Thread Stefan Keller
Hi Jeff

2013/1/18 Jeff Davis pg...@j-davis.com:
 On Thu, 2013-01-17 at 21:03 +0100, Stefan Keller wrote:
 Hi Jeff

 I'm perhaps really late to this discussion, but I was just made aware
 of it via Josh Berkus's tweet about PostgreSQL 9.3: Current
 Feature Status

 What is the reason to dig into spatial joins when PostGIS is already
 a bullet-proof and fast implementation?


 Hi Stefan,

 You are certainly not too late.

 PostGIS uses the existing postgres infrastructure to do spatial joins.
 That means it either does a Cartesian product and filters the results, or
 it uses a nested loop with an inner index scan.

 That isn't too bad, but it could be better. I am trying to introduce a
 new way to do spatial joins which will perform better in more
 circumstances. For instance, we can't use an inner index if the input
 tables are actually subqueries, because we can't index a subquery.

 Regards,
 Jeff Davis

Sounds good.
Did you already have contact with, e.g., Paul (cc'ed just in case)?
And will this clever index also be available within all these hundreds
of PostGIS functions?

Regards, Stefan




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2013-01-18 Thread Jeff Davis
On Fri, 2013-01-18 at 12:25 +0100, Stefan Keller wrote:
 Sounds good.
 Did you already have contact with, e.g., Paul (cc'ed just in case)?
 And will this clever index also be available within all these hundreds
 of PostGIS functions?

Yes, I've brought the idea up to Paul before, but thank you.

It's not an index exactly, it's a new join algorithm that should be more
efficient for spatial joins. That being said, it's not done, so it could
be an index by the time I'm finished ;)

When it is done, it will probably need some minor work in PostGIS to
make use of it. But that work is done at the datatype level, and PostGIS
only has a couple data types, so I don't think it will be a lot of work.

Regards,
Jeff Davis





Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2013-01-17 Thread Stefan Keller
Hi Jeff

2012/4/19 Jeff Davis pg...@j-davis.com:
 On Wed, 2012-04-18 at 01:21 -0400, Tom Lane wrote:
(...)
 This is just handwaving of course.  I think some digging in the
 spatial-join literature would likely find ideas better than any of
 these.

 I will look in some more detail. The merge-like approach did seem to be
 represented in the paper referenced by Alexander (the external plane
 sweep), but it also refers to several methods based on partitioning.

 I'm beginning to think that more than one of these ideas has merit.

 Regards,
 Jeff Davis

I'm perhaps really late to this discussion, but I was just made aware
of it via Josh Berkus's tweet about PostgreSQL 9.3: Current
Feature Status

What is the reason to dig into spatial joins when PostGIS is already
a bullet-proof and fast implementation?

Yours, Stefan




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-19 Thread Jeff Davis
On Tue, 2012-04-17 at 14:24 -0400, Robert Haas wrote:
 I thought Jeff was parenthetically complaining about cases like A LEFT
 JOIN (B INNER JOIN C ON b.y = c.y) ON a.x && b.x.  That presumably
 would require the parameterized-path stuff to have any chance of doing
 partial index scans over B.  However, I understand that's not the main
 issue here.

To take the mystery out of it, I was talking about any case where an
index scan is impossible or impractical. For instance, let's say the
ranges are computed values. Just to make it really impossible, let's say
the ranges are computed from columns in two different tables joined in a
subquery.

But yes, the ability of the planner to find the plan is also an issue
(hopefully less of one with the recent improvements).

 One thing that I think needs some analysis is when the range join idea
 is better or worse than a nested loop with inner index-scan, because
 potentially those are the options the planner has to choose between,
 and the costing model had better know enough to make the right thing
 happen.  It strikes me that the nested loop with inner index-scan is
 likely to be a win when there are large chunks of the indexed relation
 that the nestloop never needs to visit at all - imagine small JOIN big
 ON small.a && big.a, for example.  I suppose the really interesting
 question is how much we can save when the entirety of both relations
 has to be visited anyway - it seems promising, but I guess we won't
 know for sure without testing it.

Right, I will need to come up with a prototype that can at least test
the executor piece. I suspect that the plan choice won't be all that
different from an ordinary index nestloop versus mergejoin case, but
with much worse cardinality estimates to work with.

Regards,
Jeff Davis





Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-19 Thread Jeff Davis
On Tue, 2012-04-17 at 14:03 -0400, Robert Haas wrote:
 I'm actually not sure these are equivalent formulations.  Suppose one
 side has [i,i] where i ranges from 1 to 1000, and the other side has
 the exact same thing plus [1,1000].  That one really big range
 will come up second on the right side, and you will not be able to
 discard it until you reach the end of the join.  If you just keep
 rewinding the right side, you're going to end up with O(n^2) behavior,
 whereas if you can discard tuples from the middle on the right side,
 then you will get O(n) behavior, which is a big difference.  In other
 words, in your original algorithm the tuples that you discard in step
 4 or 5 need not be the first remaining tuple on whichever side of the
 join we're talking about.

To illustrate the problem (slightly modified), I'll write the sets out
verbatim rather than use range syntax:

  {1,2,3}  {1,2,3,4,5,6,7,8,9}
  {  2,3,4}{  2,3,4}. . . . .
  {3,4,5}  {3,4,5}. . . .
  {  4,5,6}{  4,5,6}. . .
  {5,6,7}  {5,6,7}. .
  {  6,7,8}{  6,7,8}.
  {7,8,9}  {7,8,9}

The . are supposed to represent a shadow that the large range [1,9]
casts below it. This shadow prevents the discarding of [2,4] on the RHS
even when processing range [5,7] on the LHS, because we can't discard
out of the middle.

Note that, if you just have some large ranges and a lot of overlap,
that's not really a problem with the algorithm, it's just a large result
to compute. The problem comes when the ranges vary in size by quite a
lot, and there are many ranges that could be eliminated but can't
because of the shadow.

This problem can be mitigated substantially, I believe. Let me change
the algorithm (I don't like the way the pseudocode was structured
anyway), and word it a little more loosely:

1. Look at the top ranges on each side. Choose the one with the greater
upper bound, and call that Range1 from Side1, and the other range Range2
from Side2. If either side is exhausted, terminate.
2. Scan down Side2, discarding ranges that are strictly before, and
joining with ranges that overlap, stopping when you hit a range that is
strictly after.
3. Now, discard Range1, and reset Side2 to the first non-discarded
range. Goto 1.
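These three steps can be sketched in executable form. This is illustrative only, not from any patch: ranges are inclusive (lo, hi) pairs, both inputs must be pre-sorted by (lo, hi), and the function name is mine.

```python
# Illustrative sketch of the three steps above, not any actual patch.
# Ranges are inclusive (lo, hi) pairs; both inputs must be sorted by
# (lo, hi). Lists stand in for tuple streams, so "discarding" is just
# dropping list entries.

def range_merge_join(left, right):
    """Return every overlapping (left, right) pair of ranges."""
    sides = [list(left), list(right)]
    out = []
    while sides[0] and sides[1]:
        # Step 1: take the head with the greater upper bound as Range1.
        s1 = 0 if sides[0][0][1] >= sides[1][0][1] else 1
        s2 = 1 - s1
        r1 = sides[s1][0]
        # Step 2: scan Side2, discarding ranges strictly before Range1,
        # joining overlaps, and stopping at the first strictly-after one.
        survivors = []
        for i, r2 in enumerate(sides[s2]):
            if r2[1] < r1[0]:
                continue                         # strictly before: discard
            if r2[0] > r1[1]:
                survivors.extend(sides[s2][i:])  # strictly after: stop
                break
            out.append((r1, r2) if s1 == 0 else (r2, r1))
            survivors.append(r2)
        sides[s2] = survivors
        # Step 3: discard Range1 and start over.
        sides[s1] = sides[s1][1:]
    return out
```

On the shadowed example, the big range [1,9] is picked as Range1 as soon as it reaches the head of its side (its upper bound being the greatest), and is then discarded, so its shadow is short-lived.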

The benefit is, in step 1, we choose a side that will *always* discard
the top tuple. And if we choose the one with the greater upper bound,
then we are going to eliminate the largest shadow.

That doesn't eliminate the problem entirely, but it seems like it would
reduce it a lot.

Regarding the idea of discarding tuples in the middle, that might be an
interesting approach as well. It might be as simple as setting a flag in
the tuple header (like was done for full hash joins). Still not perfect,
but would make redundant checks very cheap. Combined with my strategy,
there's a good chance that we practically eliminate the problem.

Regards,
Jeff Davis




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-19 Thread Jeff Davis
On Wed, 2012-04-18 at 01:21 -0400, Tom Lane wrote:
 It would be a pretty weird implementation of mergejoin that could
 discard tuples from the middle of an input stream.  Or to be more
 specific, it wouldn't be the mergejoin itself that could do that at all
 --- you'd need the input plan node to be some sort of tweaked version of
 tuplestore or tuplesort that could respond to a request like that.

As I said in my reply to Robert, I think there are some ways we can make
this idea work.

 I can't escape the feeling that Jeff has chosen the wrong basis for his
 thought experiment, and that what he really ought to be thinking about
 is hashjoin, which keeps an in-memory table that it could easily modify
 on the fly if it chose to.  The multi-batch aspect of hybrid hashjoin
 could be useful too (IOW, when you're out of better ideas, throw the
 tuple back in the water to process later).

Obviously hashing is not going to be much use for anything but equality.
So I believe this approach is very similar to the temporary-index
method, except with batching, and always keeping the index in memory.

I don't think we would get the partitioning benefit of hashjoin, because
we'd have to put the same tuple in multiple partitions, so it's probably
better to just leave the outer side intact.

But in-memory indexes and multiple passes of the outer seems like a
reasonable alternative, particularly because an in-memory index might be
very fast (to build as well as query).
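As a rough illustration of that alternative: index the inner side once in memory, then probe it with each outer tuple. Here a sorted list plus binary search stands in for a real in-memory index (a GiST built on the fly would prune better), multi-batch processing of the outer side is omitted, and all names are hypothetical.

```python
import bisect

# Rough illustration of the in-memory-index alternative: index the inner
# side once, probe it with each outer tuple. A sorted list plus binary
# search stands in for a real in-memory index; multi-batch passes over
# the outer side are omitted. All names are hypothetical.

def build_index(inner):
    """Sort inner ranges (inclusive (lo, hi) pairs) by lower bound."""
    ranges = sorted(inner)
    lowers = [lo for lo, _ in ranges]
    return ranges, lowers

def probe(index, outer):
    """Return inner ranges overlapping the (lo, hi) range `outer`."""
    ranges, lowers = index
    lo, hi = outer
    # No range whose lower bound exceeds hi can overlap: cut there...
    end = bisect.bisect_right(lowers, hi)
    # ...then filter the remaining candidates by upper bound.
    return [r for r in ranges[:end] if r[1] >= lo]
```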

 This is just handwaving of course.  I think some digging in the
 spatial-join literature would likely find ideas better than any of
 these.

I will look in some more detail. The merge-like approach did seem to be
represented in the paper referenced by Alexander (the external plane
sweep), but it also refers to several methods based on partitioning.

I'm beginning to think that more than one of these ideas has merit.

Regards,
Jeff Davis




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-17 Thread Greg Stark
On Mon, Apr 16, 2012 at 10:20 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The thing I like most about temp indexes is that they needn't be temporary.

 I'd like to see something along the lines of demand-created optional
 indexes, that we reclaim space/maintenance overhead on according to
 some cache management scheme. More space you have, the more of the
 important ones hang around. The rough same idea applies to
 materialised views.

I find this a) really scary, b) a huge increase in scope, and c) kind
of pointless.

a) DDL and shared caches require all kinds of additional
synchronization that would be really annoying to be incurring in the
middle of DML queries unexpectedly. If these indexes are to be useful
for other transactions they would also impose a hidden and
uncontrollable cost on all updates and inserts on the table.

b) I think it would be a lot harder to get working cleanly. You would
need to invent a UI to control the lifetime and resources used by
these objects and deal with duplication between manually created and
dynamically created objects. It's fundamentally a huge shift in the
design of the database changing what things the user maintains as part
of their schema and what things the database maintains transparently.
This is a big project on its own aside from the technical side of
things.

c) If these indexes are useful for many queries then the user can
choose to create them themselves. The indexes you're proposing would
be just as expensive as any of those indexes and offer few advantages
aside from their automatic creation. The same could be accomplished with
some external daemon that just looked at the query logs and determined
which indexes to create or not.

The main advantage of creating dynamic indexes as part of the query
would be lost. Namely that these would be local data structures that
don't need to be synchronized with other transactions and don't need
to be updated by other transactions. They're just part of the query
execution the way spilled hash tables and tuplesort tapes are. You
could build indexes on materialized data resulting from earlier joins
or aggregates and so on.

The point is that if you make them equivalent to normal indexes just
dynamically maintained then all you've done is change the way normal
indexes work but haven't really changed the set of queries they're
useful for. That might be a neat UI feature, but if you want to change
the set of queries Postgres can handle efficiently at all, then you
need something that's fundamentally different from a table index.

As an aside I think some of what you're looking for could be better
handled with some kind of query result cache that could keep around
the materialized data from plan nodes until an event happens that
changes the data. This might be related to the underlying
infrastructure needed for materialized views but I haven't thought too
deeply about it. It seems like a lot of work and a big change from the
current very isolated per-process execution model.

-- 
greg



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-17 Thread Robert Haas
On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
   1. Order the ranges on both sides by the lower bound, then upper bound.
  Empty ranges can be excluded entirely.
   2. Left := first range on left, Right := first range on right
   3. If Left or Right is empty, terminate.
   4. If lower(Left) > upper(Right), discard Right, goto 2
   5. If lower(Right) > upper(Left), discard Left, goto 2
   6. return (Left, Right) as joined tuple
   7. Right := next range on right
   8. goto 3

 This is surely not correct in detail.  As written, it will be impossible
 for any tuple on the right side to be joined to more than one left-side
 tuple.  You will need something analogous to the mark-and-restore
 rewinding logic in standard mergejoin to make this work.

 Every time you discard a tuple on the left, you go to step 2, which
 rewinds the right side back to the first non-discarded tuple.

 So, implemented using mark and restore:

  * start off with the marks at the first tuple on each side
  * discard means move the mark down a tuple
  * setting it back to the first range means restoring to the mark
  * going to the next range (step 7) just means getting another
    tuple, without changing the mark

 Only one side really needs the mark and restore logic, but it was easier
 to write the pseudocode in a symmetrical way (except step 7).

I'm actually not sure these are equivalent formulations.  Suppose one
side has [i,i] where i ranges from 1 to 1000, and the other side has
the exact same thing plus [1,1000].  That one really big range
will come up second on the right side, and you will not be able to
discard it until you reach the end of the join.  If you just keep
rewinding the right side, you're going to end up with O(n^2) behavior,
whereas if you can discard tuples from the middle on the right side,
then you will get O(n) behavior, which is a big difference.  In other
words, in your original algorithm the tuples that you discard in step
4 or 5 need not be the first remaining tuple on whichever side of the
join we're talking about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-17 Thread Robert Haas
On Mon, Apr 16, 2012 at 7:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hmm. This sounds like something that Tom's recent work on
 parameterized plans ought to have fixed, or if not, it seems closely
 related.

 Not really.  It's still going to be a nestloop, and as such not terribly
 well suited for queries where there are a lot of matchable rows on both
 sides.  The work I've been doing is really about making nestloops usable
 in cases where join order restrictions formerly prevented it --- but
 Jeff's complaint has nothing to do with that.  (This thought also makes
 me a bit dubious about the nearby suggestions that more indexes will
 fix it.)

I thought Jeff was parenthetically complaining about cases like A LEFT
JOIN (B INNER JOIN C ON b.y = c.y) ON a.x && b.x.  That presumably
would require the parameterized-path stuff to have any chance of doing
partial index scans over B.  However, I understand that's not the main
issue here.

One thing that I think needs some analysis is when the range join idea
is better or worse than a nested loop with inner index-scan, because
potentially those are the options the planner has to choose between,
and the costing model had better know enough to make the right thing
happen.  It strikes me that the nested loop with inner index-scan is
likely to be a win when there are large chunks of the indexed relation
that the nestloop never needs to visit at all - imagine small JOIN big
ON small.a && big.a, for example.  I suppose the really interesting
question is how much we can save when the entirety of both relations
has to be visited anyway - it seems promising, but I guess we won't
know for sure without testing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis pg...@j-davis.com wrote:
 ... Only one side really needs the mark and restore logic, but it was easier
 to write the pseudocode in a symmetrical way (except step 7).

 I'm actually not sure these are equivalent formulations.  Suppose one
 side has [i,i] where i ranges from 1 to 1000, and the other side has
 the exact same thing plus [1,1000].  That one really big range
 will come up second on the right side, and you will not be able to
 discard it until you reach the end of the join.  If you just keep
 rewinding the right side, you're going to end up with O(n^2) behavior,
 whereas if you can discard tuples from the middle on the right side,
 then you will get O(n) behavior, which is a big difference.  In other
 words, in your original algorithm the tuples that you discard in step
 4 or 5 need not be the first remaining tuple on whichever side of the
 join we're talking about.

It would be a pretty weird implementation of mergejoin that could
discard tuples from the middle of an input stream.  Or to be more
specific, it wouldn't be the mergejoin itself that could do that at all
--- you'd need the input plan node to be some sort of tweaked version of
tuplestore or tuplesort that could respond to a request like that.

I can't escape the feeling that Jeff has chosen the wrong basis for his
thought experiment, and that what he really ought to be thinking about
is hashjoin, which keeps an in-memory table that it could easily modify
on the fly if it chose to.  The multi-batch aspect of hybrid hashjoin
could be useful too (IOW, when you're out of better ideas, throw the
tuple back in the water to process later).

This is just handwaving of course.  I think some digging in the
spatial-join literature would likely find ideas better than any of
these.

regards, tom lane



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Darren Duncan
Your proposal makes me think of something similar which might be useful, 
INclusion constraints.  Just as exclusion constraints can be thought of as a 
generalization of unique/key constraints, inclusion constraints are like a 
generalization of foreign key constraints.  Inclusion constraints 
basically allow some comparison operator other than is-equal to test whether 
values in one table match values in another table, and the constraint is 
satisfied if the test results in true.  An example of such an inclusion test is 
whether the range in one table is contained in a range in another table.  I 
assume/hope that, similarly, now that we have range types in 9.2, the existing 
exclusion constraints can be used with range comparison operators.


As to your actual proposal, it sounds like a generalization of the relational 
join or set intersection operator where, instead of comparing sets defined in 
terms of an enumeration of discrete values, we are comparing sets defined by a 
range, which conceptually have infinitely many values depending on the data type the 
range is defined over.  But if we're doing this, then it would seem to make 
sense to go further and see if we have set analogies for all of our relational 
or set operators, should we want to do work with non-discrete sets.


Now this sounds interesting in theory, but I would also assume that these could 
be implemented by an extension in terms of existing normal relational operators, 
where each range value is a discrete value, combined with operators for unioning 
or differencing etc ranges.  A relation of ranges effectively can represent a 
discontinuous range; in that case, the empty discontinuous range is also 
canonically representable by a relation with zero tuples.


Jeff, I get the impression your proposal is partly about helping performance by 
supporting this internally, rather than just defining it as a SQL function, 
am I right?


-- Darren Duncan

Jeff Davis wrote:

I hope this is not an inappropriate time for 9.3 discussions. The flip
side of asking for submissions in the first couple commitfests means
that I need to submit proposals now.

What is a Range Join?

See attached SQL for example. The key difference is that the join
condition is not equality, but overlaps (&&).

Problem statement: slow. Nested loops are the only option, although they
can benefit from an inner GiST index if available. But if the join is
happening up in the plan tree somewhere, then it's impossible for any
index to be available.

Proposed solution: a modified merge join that can handle ranges.

 1. Order the ranges on both sides by the lower bound, then upper bound.
Empty ranges can be excluded entirely.
 2. Left := first range on left, Right := first range on right
 3. If Left or Right is empty, terminate.
 4. If lower(Left) > upper(Right), discard Right, goto 2
 5. If lower(Right) > upper(Left), discard Left, goto 2
 6. return (Left, Right) as joined tuple
 7. Right := next range on right
 8. goto 3

If step 4 or step 5 keeps getting triggered, and a btree index is
available (ordered by lower bound), we can re-probe to go to the correct
position, and consider that the new top range on that side. This is an
optimization for the case where there are large numbers of ranges with
no match on the other side.
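A minimal executable sketch of the eight steps, using the mark-and-restore reading of "goto 2" given elsewhere in the thread (a discarded Right tuple means the mark advances past it; a non-matching tuple beyond the mark can only be skipped, which is exactly the shadow problem discussed later). Ranges are inclusive (lo, hi) pairs, inputs are pre-sorted by (lo, hi) with empties excluded, and all names are mine.

```python
# Illustrative sketch of the eight steps above (inclusive (lo, hi)
# pairs, inputs sorted by (lo, hi), empty ranges already excluded).
# "Discarding" Right means advancing a mark, per the mark-and-restore
# reading; mid-stream non-matches can only be skipped, not discarded.

def range_join(left, right):
    out = []
    li, mark = 0, 0       # step 2: start at the first range on each side
    ri = mark
    while li < len(left) and mark < len(right):      # step 3
        if ri == len(right):  # Right exhausted for this Left: next Left
            li, ri = li + 1, mark
            continue
        l, r = left[li], right[ri]
        if l[0] > r[1]:       # step 4: Right strictly before Left
            if ri == mark:
                mark += 1     # at the mark, discard it for good
            ri += 1           # past the mark, we can only skip it
        elif r[0] > l[1]:     # step 5: Right strictly after Left
            li, ri = li + 1, mark   # discard Left; "goto 2" rewinds Right
        else:
            out.append((l, r))      # step 6: overlapping pair
            ri += 1                 # step 7: advance, mark unchanged
    return out
```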

Thanks to Nathan Boley for helping me devise this algorithm. However,
any bugs are mine alone ;)

Weaknesses: I haven't thought through the optimization, but I suspect it
will be hard to be very accurate in the costing. That might be OK,
because there aren't very many options anyway, but I'll need to think it
through.

Questions:

 * Is this idea sane? -- that is, are ranges important enough that
people are willing to maintain a new operator?
 * The more general problem might be spatial joins which can operate
in N dimensions, and I doubt this would work very well in that case.
Does someone know of a spatial join algorithm (without IP claims) that
would be as good as this one for ranges?
 * Other thoughts?

Regards,
Jeff Davis




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Heikki Linnakangas

On 16.04.2012 08:40, Jeff Davis wrote:

Does someone know of a spatial join algorithm (without IP claims) that
would be as good as this one for ranges?


I'd be happy with an algorithm that's specific to ranges, too, but my 
gut feeling is that there has to be a lot of research on spatial join 
algorithms out there. Implementing one of them would be more widely 
applicable, so I think we should look into spatial join algorithms first 
and see how far that gets us, before we write something specific to ranges.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 Proposed solution: a modified merge join that can handle ranges.

  1. Order the ranges on both sides by the lower bound, then upper bound.
 Empty ranges can be excluded entirely.
  2. Left := first range on left, Right := first range on right
  3. If Left or Right is empty, terminate.
  4. If lower(Left) > upper(Right), discard Right, goto 2
  5. If lower(Right) > upper(Left), discard Left, goto 2
  6. return (Left, Right) as joined tuple
  7. Right := next range on right
  8. goto 3

This is surely not correct in detail.  As written, it will be impossible
for any tuple on the right side to be joined to more than one left-side
tuple.  You will need something analogous to the mark-and-restore
rewinding logic in standard mergejoin to make this work.

I can't escape the feeling that we could do this with more-or-less
standard mergejoin logic paying attention to some fuzzy notion of
equality, with the question of whether a given pair of ranges actually
overlap being deferred to a filter condition.  Not quite sure how to
make that work though.

  * Is this idea sane? -- that is, are ranges important enough that
 people are willing to maintain a new operator?

Dunno.  It might be easier to sell the idea of adding support for range
joins in a couple of years, after we've seen how much use ranges get.

I will note that mergejoin is not only one of the more complicated
executor modules, but it accounts for a huge percentage of the planner's
complexity; which doesn't exactly make me sympathize with the notion of
let's-copy-and-paste-all-that.  It'd be a lot better if we could build
it as a small tweak to mergejoin rather than an independent thing.

(Having said that, I have no idea how the planner support could work
at all, because its treatment of mergejoins is intimately tied to
mergejoinable equality and EquivalenceClasses and PathKeys, which don't
seem readily extensible to the range situation.)

regards, tom lane



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Simon Riggs
On Mon, Apr 16, 2012 at 7:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Dunno.  It might be easier to sell the idea of adding support for range
 joins in a couple of years, after we've seen how much use ranges get.

Once we've started the journey towards range types we must complete it
reasonably quickly.

Having partially implemented, unoptimised features is the same as not
having the feature at all, so it will remain unused until it really
works. We could say that about many features, but if Jeff is
championing this, I'd say go for it.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Alexander Korotkov
On Mon, Apr 16, 2012 at 10:29 AM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 On 16.04.2012 08:40, Jeff Davis wrote:

 Does someone know of a spatial join algorithm (without IP claims) that
 would be as good as this one for ranges?


 I'd be happy with an algorithm that's specific to ranges, too, but my gut
 feeling is that there has to be a lot of research on spatial join
 algorithms out there. Implementing one of them would be more widely
 applicable, so I think we should look into spatial join algorithms first
 and see how far that gets us, before we write something specific to ranges.


There is a good overview article about spatial joins.
http://www.cs.umd.edu/users/hjs/pubs/jacoxtods07.pdf
It shows that there are a lot of methods based on building temporary
indexes, which might mean that building a temporary GiST index is not a bad
idea.
There are also methods which look like extensions of Jeff Davis's proposal
to the multidimensional case: the Plane Sweep and External Plane Sweep
methods. However, they may require sophisticated data structures for the
sweep, and I believe they need them for good performance.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Jeff Davis
On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
   1. Order the ranges on both sides by the lower bound, then upper bound.
  Empty ranges can be excluded entirely.
   2. Left := first range on left, Right := first range on right
   3. If Left or Right is empty, terminate.
   4. If lower(Left) > upper(Right), discard Right, goto 2
   5. If lower(Right) > upper(Left), discard Left, goto 2
   6. return (Left, Right) as joined tuple
   7. Right := next range on right
   8. goto 3
 
 This is surely not correct in detail.  As written, it will be impossible
 for any tuple on the right side to be joined to more than one left-side
 tuple.  You will need something analogous to the mark-and-restore
 rewinding logic in standard mergejoin to make this work.

Every time you discard a tuple on the left, you go to step 2, which
rewinds the right side back to the first non-discarded tuple.

So, implemented using mark and restore:

  * start off with the marks at the first tuple on each side
  * discard means move the mark down a tuple
  * setting it back to the first range means restoring to the mark
  * going to the next range (step 7) just means getting another
tuple, without changing the mark

Only one side really needs the mark and restore logic, but it was easier
to write the pseudocode in a symmetrical way (except step 7).
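For concreteness, the algorithm plus the mark-and-restore reading can be sketched in a few lines of plain Python over pre-sorted lists. This is a hypothetical illustration, not PostgreSQL executor code; the function name, the closed-bound overlap test, and the use of a list index as the "mark" are assumptions made for the sketch:

```python
def range_merge_join(left, right):
    """Join two lists of closed ranges (lower, upper), both sorted by
    (lower, upper) with empty ranges excluded; return overlapping pairs."""
    results = []
    mark = 0  # index of the first non-discarded right-side range
    for l in left:
        # "Discard" right ranges that end before this left range begins;
        # since left is sorted by lower bound, they can never match later.
        while mark < len(right) and right[mark][1] < l[0]:
            mark += 1
        # "Restore" to the mark, then scan forward while overlap is possible.
        i = mark
        while i < len(right) and right[i][0] <= l[1]:
            if right[i][1] >= l[0]:  # closed-bound overlap test
                results.append((l, right[i]))
            i += 1
    return results
```

Advancing `mark` corresponds to discarding, and restarting the inner scan at `mark` corresponds to restoring; as noted, only the right side actually needs the mark.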

 I will note that mergejoin is not only one of the more complicated
 executor modules, but it accounts for a huge percentage of the planner's
 complexity; which doesn't exactly make me sympathize with the notion of
 let's-copy-and-paste-all-that.  It'd be a lot better if we could build
 it as a small tweak to mergejoin rather than an independent thing.
 
 (Having said that, I have no idea how the planner support could work
 at all, because its treatment of mergejoins is intimately tied to
 mergejoinable equality and EquivalenceClasses and PathKeys, which don't
 seem readily extensible to the range situation.)

I think this is the more important problem area. It's clear that I'll
need to do some more investigation into the planning.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Jeff Davis
On Mon, 2012-04-16 at 16:22 +0400, Alexander Korotkov wrote:

 There is a good overview article about spatial joins.
 http://www.cs.umd.edu/users/hjs/pubs/jacoxtods07.pdf 

Thank you, that's exactly the kind of overview I was looking for.

 It shows that there are a lot of methods based on building temporary
 indexes. It might mean that building a temporary GiST index is not a
 bad idea.

That had occurred to me, but I was hesitant to only use temp indexes. It
still doesn't really offer a good solution when both sides of the join
are relatively large (because of random I/O). Also the build speed of
the index would be more important than it is now.

On the other hand, if I tackle the problem using temp indexes, it would
be a more general solution useful for many problems (for instance, we
really need a better solution when the result of a subquery doesn't fit
in a work_mem-sized hashtable).

We may end up with some kind of combination (as the sweep seems to be;
see below).

 Also there are methods which look like extensions of Jeff Davis's
 proposal to the multidimensional case: the Plane Sweep and
 External Plane Sweep methods. However, they might use sophisticated data
 structures for the sweep, and I believe they should, for good
 performance.

That method looks closer to what I had in mind.

My Range Join proposal is essentially the same as the sweep method
except that it only joins on one dimension, and the rest of the
dimensions have to be checked with RECHECK. So, this still works for an
N-dimensional join, as long as a single dimension is fairly selective.

The sweep method seems to handle the first dimension with the sweep
itself, and maintains a changing index that it uses to do the lookups. In
other words, it's essentially just a way to do the RECHECK on the other
dimensions in less than O(n) time. So, if the sweep dimension is not
selective at all, this degenerates to the temporary index method (or
something close, anyway).

The fact that, in the sweep method, there is still one special
dimension, makes me think that it will be easier to make the API work
based on ranges (which is a big win because postgres already knows about
ranges). If so, that makes it easier to divide the project into stages,
because we could get it working for ranges and then extend it to support
other dimensions more efficiently later.
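A hedged sketch of that idea: join on one sweep dimension with the range merge, and RECHECK overlap on the remaining dimensions. This is hypothetical Python for illustration (boxes as lists of (lower, upper) pairs per dimension; `box_join` and `overlaps` are invented names, not anything in postgres):

```python
def overlaps(r1, r2):
    """Closed-bound overlap test for (lower, upper) ranges."""
    return r1[0] <= r2[1] and r2[0] <= r1[1]

def box_join(left, right, sweep_dim=0):
    """Join N-dimensional boxes: merge-join on one sweep dimension,
    then RECHECK overlap on every dimension."""
    left = sorted(left, key=lambda b: b[sweep_dim])
    right = sorted(right, key=lambda b: b[sweep_dim])
    out = []
    mark = 0
    for lb in left:
        lo, hi = lb[sweep_dim]
        # Discard right boxes whose sweep range ends before ours begins.
        while mark < len(right) and right[mark][sweep_dim][1] < lo:
            mark += 1
        i = mark
        while i < len(right) and right[i][sweep_dim][0] <= hi:
            rb = right[i]
            # RECHECK: all dimensions must overlap, not just the sweep one.
            if all(overlaps(lb[d], rb[d]) for d in range(len(lb))):
                out.append((lb, rb))
            i += 1
    return out
```

The less selective the sweep dimension is, the more pairs survive to the RECHECK, which is exactly the degenerate case where something index-like on the other dimensions starts to pay off.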

Regards,
Jeff Davis





Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Jeff Davis
On Sun, 2012-04-15 at 23:18 -0700, Darren Duncan wrote:
 Your proposal makes me think of something similar which might be useful,
 INclusion constraints.  As exclusion constraints might be thought of as a
 generalization of unique/key constraints, inclusion constraints are like
 a generalization of foreign key constraints.  Inclusion constraints
 basically allow some comparison operator other than is-equal to test
 whether values in one table match values in another table, and the
 constraint is satisfied if the test results in true.  An example of such
 an inclusion test is whether the range in one table is contained in a
 range in another table.  I assume/hope that, similarly, now that we have
 range types in 9.2, the existing exclusion constraints can be used with
 range comparison operators.

Yes, Range Foreign Keys are one of the loose ends for Range Types that
I'd like to wrap up.

 As to your actual proposal, it sounds like a generalization of the relational 
 join or set intersection operator where instead of comparing sets defined in 
 terms of an enumeration of discrete values we are comparing sets defined by a 
 range, which conceptually have infinite values depending on the data type the 
 range is defined over.  But if we're doing this, then it would seem to make 
 sense to go further and see if we have set analogies for all of our 
 relational 
 or set operators, should we want to do work with non-discrete sets.
 
 Now this sounds interesting in theory, but I would also assume that these 
 could 
 be implemented by an extension in terms of existing normal relational 
 operators, 
 where each range value is a discrete value, combined with operators for 
 unioning 
 or differencing etc ranges.  A relation of ranges effectively can represent a 
 discontinuous range; in that case, the empty discontinuous range is also 
 canonically representable by a relation with zero tuples.
 
 Jeff, I get the impression your proposal is partly about helping
 performance by supporting this internally, rather than just defining it
 as a SQL function, am I right?

Per my proposal, the problem statement is that it's slow, so speeding
it up is really the entire proposal ;)

Extending the syntax might be interesting as well, but I don't have a
proposal for that.
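Darren's point that a relation of ranges can represent a discontinuous range is easy to illustrate with a small hypothetical sketch (Python; `normalize` and `range_union` are invented names): a set of closed ranges normalizes to a sorted, disjoint list, and the empty discontinuous range is just the empty list, mirroring a relation with zero tuples.

```python
def normalize(ranges):
    """Merge overlapping or touching closed (lower, upper) ranges into a
    canonical sorted, disjoint list; drop malformed (empty) inputs."""
    out = []
    for lo, hi in sorted(r for r in ranges if r[0] <= r[1]):
        if out and lo <= out[-1][1]:
            # Overlaps or touches the previous range: extend it.
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

def range_union(a, b):
    """Union of two discontinuous ranges, each a list of closed ranges."""
    return normalize(a + b)
```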

Regards,
Jeff Davis




Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Alexander Korotkov
On Tue, Apr 17, 2012 at 12:12 AM, Jeff Davis pg...@j-davis.com wrote:

 On Mon, 2012-04-16 at 16:22 +0400, Alexander Korotkov wrote:

  There is a good overview article about spatial joins.
  http://www.cs.umd.edu/users/hjs/pubs/jacoxtods07.pdf

 Thank you, that's exactly the kind of overview I was looking for.

  It shows that there are a lot of methods based on building temporary
  indexes. It might mean that building a temporary GiST index is not a
  bad idea.

 That had occurred to me, but I was hesitant to only use temp indexes. It
 still doesn't really offer a good solution when both sides of the join
 are relatively large (because of random I/O). Also the build speed of
 the index would be more important than it is now.

Note that the amount of random I/O during GiST index build decreased
significantly after my GSoC project on buffered GiST index build. However,
GiST index build is still quite CPU-expensive. This problem could be
avoided by supporting other methods of index build (other than producing a
lot of penalty and picksplit calls). Hilbert curves and similar methods
could help here. Implementing such methods would require extending the
GiST interface.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Simon Riggs
On Mon, Apr 16, 2012 at 9:12 PM, Jeff Davis pg...@j-davis.com wrote:

 That had occurred to me, but I was hesitant to only use temp indexes. It
 still doesn't really offer a good solution when both sides of the join
 are relatively large (because of random I/O). Also the build speed of
 the index would be more important than it is now.

The thing I like most about temp indexes is that they needn't be temporary.

I'd like to see something along the lines of demand-created optional
indexes, whose space/maintenance overhead we reclaim according to
some cache management scheme. The more space you have, the more of the
important ones hang around. Roughly the same idea applies to
materialised views.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Jay Levitt

Simon Riggs wrote:

I'd like to see something along the lines of demand-created optional
indexes, whose space/maintenance overhead we reclaim according to
some cache management scheme. The more space you have, the more of the
important ones hang around. Roughly the same idea applies to
materialised views.


+10; this sort of demand-driven optimization could be the database 
equivalent of Java's HotSpot (which accomplished the amazing task of making 
Java kinda fastish sometimes).


Jay



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Robert Haas
On Apr 16, 2012, at 1:40 AM, Jeff Davis pg...@j-davis.com wrote:
 See attached SQL for example. The 
 Problem statement: slow. Nested loops are the only option, although they
 can benefit from an inner GiST index if available. But if the join is
 happening up in the plan tree somewhere, then it's impossible for any
 index to be available.

Hmm. This sounds like something that Tom's recent work on parameterized plans 
ought to have fixed, or if not, it seems closely related. And by this I mean 
specifically the ability to use a GiST index to drive a nested loop that is 
higher up in the plan tree than the immediate parent of the index scan.

This is not an argument against your proposal, just an observation.

...Robert


Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Apr 16, 2012, at 1:40 AM, Jeff Davis pg...@j-davis.com wrote:
 See attached SQL for example. The 
 Problem statement: slow. Nested loops are the only option, although they
 can benefit from an inner GiST index if available. But if the join is
 happening up in the plan tree somewhere, then it's impossible for any
 index to be available.

 Hmm. This sounds like something that Tom's recent work on
 parameterized plans ought to have fixed, or if not, it seems closely
 related.

Not really.  It's still going to be a nestloop, and as such not terribly
well suited for queries where there are a lot of matchable rows on both
sides.  The work I've been doing is really about making nestloops usable
in cases where join order restrictions formerly prevented it --- but
Jeff's complaint has nothing to do with that.  (This thought also makes
me a bit dubious about the nearby suggestions that more indexes will
fix it.)

regards, tom lane



Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join

2012-04-16 Thread Jeff Davis
On Mon, 2012-04-16 at 22:20 +0100, Simon Riggs wrote:
 On Mon, Apr 16, 2012 at 9:12 PM, Jeff Davis pg...@j-davis.com wrote:
 
  That had occurred to me, but I was hesitant to only use temp indexes. It
  still doesn't really offer a good solution when both sides of the join
  are relatively large (because of random I/O). Also the build speed of
  the index would be more important than it is now.
 
 The thing I like most about temp indexes is that they needn't be temporary.
 
 I'd like to see something along the lines of demand-created optional
 indexes, that we reclaim space/maintenance overhead on according to
 some cache management scheme. More space you have, the more of the
 important ones hang around. The rough same idea applies to
 materialised views.

I think to make an informed decision, the next thing I need to do is
hack up a prototype version of my merge join idea, and see how well it
performs against an index nestloop today.

It seems to me that both approaches may have merit independently.

Regards,
Jeff Davis




