Re: [PERFORM] GiST indexing tuples
On Thu, 29 Nov 2007, Matthew T. O'Connor wrote: Matthew wrote: For instance, the normal B-tree index on (a, b) is able to answer queries like a = 5 AND b 1 or a 5. An R-tree would be able to index these, plus queries like a 5 AND b 1. Sorry in advance if this is a stupid question, but how is this better than two index, one on a and one on b? I supposed there could be a space savings but beyond that? Imagine you have a table with columns a and b. The table has a bazillion rows, and the values of a and b both range from a negative bazillion to a positive bazillion. (Note this is exactly the case in our database, for some value of a bazillion.) Then, you run the query: SELECT * FROM table WHERE a 5 AND b 1; So, an index on a will return half a bazillion results for the constraint a 5. Likewise, the index on b will return half a bazillion results for the constraint b 1. However, the intersection of these two constraints could be just a few rows. (Note this is exactly the case in our database.) Now, Postgres has two options. It could use just the one index and filter half a bazillion rows (which is what it used to do), or it could create a bitmap with a bazillion bits from each index, and do a logical AND operation on them to create a new bitmap with just a few bits set (which it now can do under some circumstances). Either way, it's going to be a heavy operation. An R-tree index on a, b would instantly return just those few rows, without using significant amounts of memory or time. Hope that helps, Matthew -- Patron: I am looking for a globe of the earth. Librarian: We have a table-top model over here. Patron: No, that's not good enough. Don't you have a life-size? Librarian: (pause) Yes, but it's in use right now. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] GiST indexing tuples
On Thu, Nov 29, 2007 at 03:23:10PM -0500, Matthew T. O'Connor wrote: Sorry in advance if this is a stupid question, but how is this better than two index, one on a and one on b? I supposed there could be a space savings but beyond that? You could index on both columns simultaneously without a bitmap index scan. /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] GiST indexing tuples
Matthew wrote: For instance, the normal B-tree index on (a, b) is able to answer queries like a = 5 AND b 1 or a 5. An R-tree would be able to index these, plus queries like a 5 AND b 1. Sorry in advance if this is a stupid question, but how is this better than two index, one on a and one on b? I supposed there could be a space savings but beyond that? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] GiST indexing tuples
On Tue, 27 Nov 2007, Steinar H. Gunderson wrote: On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote: SELECT * FROM table WHERE a 1 AND b 4; This sounds like something an R-tree can do. I *know* that. However, Postgres (as far as I can see) doesn't provide a simple R-tree index that will index two integers. This means I have to write one myself. I'm asking whether it is possible to get two values into a GiST index, which would allow me to implement this. Matthew -- It is better to keep your mouth closed and let people think you are a fool than to open it and remove all doubt. -- Mark Twain ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] GiST indexing tuples
Matthew [EMAIL PROTECTED] writes: On Tue, 27 Nov 2007, Steinar H. Gunderson wrote: On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote: SELECT * FROM table WHERE a 1 AND b 4; This sounds like something an R-tree can do. I *know* that. However, Postgres (as far as I can see) doesn't provide a simple R-tree index that will index two integers. This means I have to write one myself. I'm asking whether it is possible to get two values into a GiST index, which would allow me to implement this. The database is capable of determining that a1 and b4 are both conditions which a single index can satisfy. However GIST itself treats each column of the index independently applying the first column then the second one and so on like a traditional btree index, so it doesn't really do what you would want. I did propose a while back that GIST should consider all columns simultaneously in the same style as rtree. However this would require making GIST somewhat less general in another sense. Currently page splits can be handled arbitrarily but if you have to be able to combine different datatypes it would mean having to come up with a standard algorithm which would work everywhere. (I suggested making everything work in terms of distance and then using the n-space vector distance (ie sqrt((a1-b1)^2+(a2-b2)^2+...).) This means GIST wouldn't be as general as it is now but it would allow us to handle cases like yours automatically. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] GiST indexing tuples
Matthew [EMAIL PROTECTED] writes: This sounds like something an R-tree can do. I *know* that. However, Postgres (as far as I can see) doesn't provide a simple R-tree index that will index two integers. This means I have to write one myself. I'm asking whether it is possible to get two values into a GiST index, which would allow me to implement this. Have you looked at contrib/seg/ ? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] GiST indexing tuples
On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote: SELECT * FROM table WHERE a 1 AND b 4; This sounds like something an R-tree can do. /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster