Re: [PERFORM] GiST indexing tuples

2007-11-30 Thread Matthew
On Thu, 29 Nov 2007, Matthew T. O'Connor wrote:
 Matthew wrote:
  For instance, the normal B-tree index on (a, b) is able to answer queries
  like a = 5 AND b  1 or a  5. An R-tree would be able to index these,
  plus queries like a  5 AND b  1.

 Sorry in advance if this is a stupid question, but how is this better
 than two index, one on a and one on b?  I supposed there could be a
 space savings but beyond that?

Imagine you have a table with columns a and b. The table has a
bazillion rows, and the values of a and b both range from a negative
bazillion to a positive bazillion. (Note this is exactly the case in our
database, for some value of a bazillion.) Then, you run the query:

SELECT * FROM table WHERE a  5 AND b  1;

So, an index on a will return half a bazillion results for the
constraint a  5. Likewise, the index on b will return half a
bazillion results for the constraint b  1. However, the intersection of
these two constraints could be just a few rows. (Note this is exactly the
case in our database.)

Now, Postgres has two options. It could use just the one index and filter
half a bazillion rows (which is what it used to do), or it could create a
bitmap with a bazillion bits from each index, and do a logical AND
operation on them to create a new bitmap with just a few bits set (which
it now can do under some circumstances). Either way, it's going to be a
heavy operation.

An R-tree index on a, b would instantly return just those few rows,
without using significant amounts of memory or time.

Hope that helps,

Matthew

-- 
Patron: I am looking for a globe of the earth.
Librarian: We have a table-top model over here.
Patron: No, that's not good enough. Don't you have a life-size?
Librarian: (pause) Yes, but it's in use right now.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] GiST indexing tuples

2007-11-29 Thread Steinar H. Gunderson
On Thu, Nov 29, 2007 at 03:23:10PM -0500, Matthew T. O'Connor wrote:
 Sorry in advance if this is a stupid question, but how is this better than 
 two index, one on a and one on b?  I supposed there could be a space 
 savings but beyond that?

You could index on both columns simultaneously without a bitmap index scan.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] GiST indexing tuples

2007-11-29 Thread Matthew T. O'Connor

Matthew wrote:

For instance, the normal B-tree index on (a, b) is able to answer queries
like a = 5 AND b  1 or a  5. An R-tree would be able to index these,
plus queries like a  5 AND b  1.


Sorry in advance if this is a stupid question, but how is this better 
than two index, one on a and one on b?  I supposed there could be a 
space savings but beyond that?



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Matthew
On Tue, 27 Nov 2007, Steinar H. Gunderson wrote:
 On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote:
  SELECT * FROM table WHERE a  1 AND b  4;

 This sounds like something an R-tree can do.

I *know* that. However, Postgres (as far as I can see) doesn't provide a
simple R-tree index that will index two integers. This means I have to
write one myself. I'm asking whether it is possible to get two values into
a GiST index, which would allow me to implement this.

Matthew

-- 
It is better to keep your mouth closed and let people think you are a fool
than to open it and remove all doubt.  -- Mark Twain

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Gregory Stark
Matthew [EMAIL PROTECTED] writes:

 On Tue, 27 Nov 2007, Steinar H. Gunderson wrote:
 On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote:
  SELECT * FROM table WHERE a  1 AND b  4;

 This sounds like something an R-tree can do.

 I *know* that. However, Postgres (as far as I can see) doesn't provide a
 simple R-tree index that will index two integers. This means I have to
 write one myself. I'm asking whether it is possible to get two values into
 a GiST index, which would allow me to implement this.

The database is capable of determining that a1 and b4 are both conditions
which a single index can satisfy.

However GIST itself treats each column of the index independently applying the
first column then the second one and so on like a traditional btree index, so
it doesn't really do what you would want.

I did propose a while back that GIST should consider all columns
simultaneously in the same style as rtree. 

However this would require making GIST somewhat less general in another sense.
Currently page splits can be handled arbitrarily but if you have to be able to
combine different datatypes it would mean having to come up with a standard
algorithm which would work everywhere. (I suggested making everything work in
terms of distance and then using the n-space vector distance (ie
sqrt((a1-b1)^2+(a2-b2)^2+...).) This means GIST wouldn't be as general as
it is now but it would allow us to handle cases like yours automatically.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Tom Lane
Matthew [EMAIL PROTECTED] writes:
 This sounds like something an R-tree can do.

 I *know* that. However, Postgres (as far as I can see) doesn't provide a
 simple R-tree index that will index two integers. This means I have to
 write one myself. I'm asking whether it is possible to get two values into
 a GiST index, which would allow me to implement this.

Have you looked at contrib/seg/ ?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] GiST indexing tuples

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote:
 SELECT * FROM table WHERE a  1 AND b  4;

This sounds like something an R-tree can do.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster