Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jonah H. Harris
As an Oracle internals person myself, I don't see how making a comparison between the specifics of Oracle's MVCC to PostgreSQL's MVCC is relevant to this discussion.As does *MOST* other commercial databases, Oracle's storage manager performs an update-in-place whereas PostgreSQL's (for the most

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 01:56:51AM -0500, Greg Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Christopher Kings-Lynne [EMAIL PROTECTED] writes: Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Greg Stark
Jonah H. Harris [EMAIL PROTECTED] writes: As an Oracle internals person myself, I don't see how making a comparison between the specifics of Oracle's MVCC to PostgreSQL's MVCC is relevant to this discussion. As does *MOST* other commercial databases, Oracle's storage manager performs an

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jonah H. Harris
On 19 Jan 2006 11:25:21 -0500, Greg Stark [EMAIL PROTECTED] wrote: Well it seems there were lots of facts posted. Yes you can avoid headachescaused by these issues, but we're not really talking about the headaches.Several were mentioned; some of which could generally be avoided by good tuning.

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Josh Berkus
Jonah, David has stated that the index to heap visibility check is slowing him down, so what are the possible options: - Visibility in indexes (-hackers archives cover the pros/cons) - True organized heaps - Block level index (Tom/Simon's earlier discussion) also - Frozen relations This

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread David Scott
Thanks for all the help and thought to our problem. Jonah H. Harris wrote: David has stated that the index to heap visibility check is slowing him down, so what are the possible options: - Visibility in indexes (-hackers archives cover the pros/cons) - True organized heaps - Block level

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread David Scott
Tom Lane wrote: What sort of problems are you dealing with exactly? There has been some discussion of changes that would improve certain scenarios. For instance it might be plausible to do joins using index information and only go back to the heap for entries that appear to pass the join

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 10:19:01AM -0800, Josh Berkus wrote: One of the other most valuable targets for index-only access is the many-to-many join table whose primary key consists of two (or more) foreign keys to two (or more) other tables. It's actually not necessary to check visibility

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 10:35:30AM -0800, David Scott wrote: Tom Lane wrote: What sort of problems are you dealing with exactly? There has been some discussion of changes that would improve certain scenarios. For instance it might be plausible to do joins using index information and only

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jeremy Drake
On Thu, 19 Jan 2006, Jim C. Nasby wrote: Do you still have that patch that folks could look at? ISTM that this technique would be rather dependant on your actual workload, and as such could result in a big win for certain types of queries. It is not a patch, per se. It is a c language

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Simon Riggs
On Wed, 2006-01-18 at 20:13 -0500, Tom Lane wrote: Come to think of it, the idea also seems to map nicely into bitmap index scans: the index will directly hand back a list of potential pages to look at, but they are all marked lossy because the index doesn't know exactly which tuple(s) on the

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 02:50:39PM -0800, Jeremy Drake wrote: On Thu, 19 Jan 2006, Jim C. Nasby wrote: Do you still have that patch that folks could look at? ISTM that this technique would be rather dependant on your actual workload, and as such could result in a big win for certain types

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: Basically, numbers talk. If there were convincing numbers for something that wasn't a corner-case that showed a marked improvement then there'd be much more interest in getting this into the backend in some fashion. I've got no doubt that there are *some*

Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jeremy Drake
On Thu, 19 Jan 2006, Jim C. Nasby wrote: Feel free to do whatever with this, it's pretty fast for tables where seeks to validate tuples would hurt, but you do get back dead things... How'd you then weed out the dead tuples? I didn't get that far with it. The purpose of this function was

[HACKERS] No heap lookups on index

2006-01-18 Thread David Scott
Allow me a brief introduction. I work in a company who contracts intelligence analysis software to the government. We are currently developing a product which is using PostgreSQL at it's core. Due to the licensing of the product and the integration with perl this is our first choice in

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
David Scott [EMAIL PROTECTED] writes: Is the additional overhead of keeping full tuple visibility information inside of the index so odious to the Postgres community as to prevent a patch with this solution from being applied back to the head? This has been discussed and rejected

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Jonah H. Harris
David,You can find some of this discussion in Much Ado About COUNT(*). Related to that discussion, I had written a patch which added visibility information to the indexes.If you're interested in the patch and/or consulting, contact me offline. -JonahOn 1/18/06, Tom Lane [EMAIL PROTECTED] wrote:

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Simon Riggs
On Wed, 2006-01-18 at 12:14 -0800, David Scott wrote: Is the additional overhead of keeping full tuple visibility information inside of the index so odious to the Postgres community as to prevent a patch with this solution from being applied back to the head? Maybe as an optional use

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Jim C. Nasby
On Wed, Jan 18, 2006 at 12:14:12PM -0800, David Scott wrote: Do commercial databases implement MVCC in a way that allows an efficient implementation of index lookups that can avoid heap lookups? Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Jim C. Nasby
On Wed, Jan 18, 2006 at 04:02:45PM -0500, Jonah H. Harris wrote: David, You can find some of this discussion in Much Ado About COUNT(*). Related to that discussion, I had written a patch which added visibility information to the indexes. If you're interested in the patch and/or

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: You might want to consider the thought of organised heaps as an alternative thought to index improvements. That way there is no heap to avoid visiting because the index is also the main data structure. This would offer performance, but would be one of the

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Simon Riggs
On Wed, 2006-01-18 at 18:27 -0500, Tom Lane wrote: Imagine an index that contains only the upper levels of a search tree --- links to what would be the leaf level point into the associated heap. In this design the heap is still a heap in the sense that you can seqscan it without any

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: Hopefully we could avoid trying to support GIST-heaps? Well, that would be an extra index AM that someone might or might not get around to writing someday. I was thinking that both btree and hash index AMs might be interesting for this, though. Hash in

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: We only need to index the row with the lowest value on any page so the main index would get 100 times smaller. The main part of the index would not need to be written to except when a block overflows. BTW, the above is equivalent to saying that the

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Christopher Kings-Lynne
Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some downsides: Rollbacks take *forever*, though this usually isn't much of an issue unless you need to abort a really big transaction. It's a good point

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 09:18:55AM +0800, Christopher Kings-Lynne wrote: Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some downsides: Rollbacks take *forever*, though this usually isn't much of an

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some downsides: Rollbacks take *forever*, though this usually isn't much of an issue unless you need to

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Jim C. Nasby
On Wed, Jan 18, 2006 at 08:13:59PM -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: We only need to index the row with the lowest value on any page so the main index would get 100 times smaller. The main part of the index would not need to be written to except when a block

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: Would this open the door for allowing tables to be maintained in CLUSTER order (at least at the block level if not within the blocks)? Though I have no idea how you'd handle page splits without a lot of pain I think the way you'd attack that is by

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Rod Taylor
On Thu, 2006-01-19 at 09:18 +0800, Christopher Kings-Lynne wrote: Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some downsides: Rollbacks take *forever*, though this usually isn't much of an

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Christopher Kings-Lynne [EMAIL PROTECTED] writes: Oracle does, but you pay in other ways. Instead of keeping dead tuples in the main heap, they shuffle them off to an 'undo log'. This has some downsides: Rollbacks take *forever*, though this usually

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Greg Stark
David Scott [EMAIL PROTECTED] writes: Since I am sure everyone is tired of the intro by now, I'll get to the questions: ... Is there any way to modify PostgreSQL to allow index lookups without heap validation that doesn't involve re-writing the MVCC implementation of keeping dead rows on the

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: You pay in Oracle when you read these records too. If there are pending updates you have to do a second read to the rollback segment to get the old record. This hits long-running batch queries especially hard since by the time they finish a large number of

Re: [HACKERS] No heap lookups on index

2006-01-18 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: I wonder if the bitmap can actually be one bit per page actually. Yeah, I think we'd agreed that per-page was the way to go. Per-tuple bitmaps are painful to manage because of the variable number of tuples per page. And really all you need to know is