Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas
Heikki Linnakangas wrote: If we want to keep the property that VACUUM doesn't re-evaluate index entries, any proposal that doesn't keep track of every heap tuple isn't going to work. I'll try to modify the design I had in mind so that it does keep track of every heap tuple in some form. After

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Martijn van Oosterhout
On Fri, Sep 29, 2006 at 10:51:32AM +0100, Heikki Linnakangas wrote: After some thought: Imagine a normal B-tree just like what we have now. But when there is more than one tuple on the same heap page with consecutive index keys, we represent all of them in a single index tuple that

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas
Martijn van Oosterhout wrote: On Fri, Sep 29, 2006 at 10:51:32AM +0100, Heikki Linnakangas wrote: After some thought: Imagine a normal B-tree just like what we have now. But when there is more than one tuple on the same heap page with consecutive index keys, we represent all of them in a

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Simon Riggs
On Fri, 2006-09-29 at 10:51 +0100, Heikki Linnakangas wrote: Heikki Linnakangas wrote: If we want to keep the property that VACUUM doesn't re-evaluate index entries, any proposal that doesn't keep track of every heap tuple isn't going to work. I'll try to modify the design I had in mind

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Gregory Stark
Csaba Nagy [EMAIL PROTECTED] writes: I think you build a whole new index named something like .temp-reindex and then as the last step of the second transaction delete the old idnex and rename the new index. That would require getting exclusive lock on the table. Just out of

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas
Simon Riggs wrote: On Fri, 2006-09-29 at 10:51 +0100, Heikki Linnakangas wrote: Heikki Linnakangas wrote: If we want to keep the property that VACUUM doesn't re-evaluate index entries, any proposal that doesn't keep track of every heap tuple isn't going to work. I'll try to modify the design I

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Imagine a normal B-tree just like what we have now. But when there is more than one tuple on the same heap page with consecutive index keys, we represent all of them in a single index tuple that contains the key of the first one of them, and a

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I'm not very interested in the case where you have a lot of equal keys, I think the bitmap index am is more suitable for that. that indeed you meant to write consecutive, and I've got a problem with that: define consecutive.

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Jan de Visser
On Friday 29 September 2006 10:55, Heikki Linnakangas wrote: Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I'm not very interested in the case where you have a lot of equal keys, I think the bitmap index am is more suitable for that. that indeed you meant to write

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: define consecutive. I don't mean consecutive as in 1, 2, 3, 4, ... without gaps but as in A and B are consecutive in the index, if there's no value X in the index so that A X B. Maybe there's a better word for that. Um, but

Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I don't mean consecutive as in 1, 2, 3, 4, ... without gaps but as in A and B are consecutive in the index, if there's no value X in the index so that A X B. Maybe there's a better word for that. Um, but how are you going

Re: [HACKERS] Block B-Tree concept

2006-09-28 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: AFAICS, we could disable the optimization and use a full-blown transaction when vacuuming a table with a functional block index. No, we couldn't, or at least it's not merely a matter of reversing a recent optimization. The

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Heikki Linnakangas
Jim C. Nasby wrote: Do I understand that to mean that you can no longer lazy vacuum a functional index? With the normal B-trees we have now, there's no problem vacuuming a functional index. But it would be a problem with the Block B-tree, because the proposed way of vacuuming a Block B-tree

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Heikki Linnakangas
Jim C. Nasby wrote: Couldn't vacuum just eliminate tuples marked dead? Heck, don't we do that anyway right now? You mean _index_ tuples marked dead? Sure, no problem there. Granted, you'd want to periodically ensure that you scan the entire index, but that shouldn't be horribly hard to set

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Bruce Momjian
Heikki Linnakangas wrote: Jim C. Nasby wrote: Couldn't vacuum just eliminate tuples marked dead? Heck, don't we do that anyway right now? You mean _index_ tuples marked dead? Sure, no problem there. Granted, you'd want to periodically ensure that you scan the entire index, but that

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Heikki Linnakangas
Bruce Momjian wrote: Heikki Linnakangas wrote: Jim C. Nasby wrote: Granted, you'd want to periodically ensure that you scan the entire index, but that shouldn't be horribly hard to set up. Well, it seems to be. A vacuum can't evaluate index expressions because it's not in a real transaction.

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: AFAICS, we could disable the optimization and use a full-blown transaction when vacuuming a table with a functional block index. No, we couldn't, or at least it's not merely a matter of reversing a recent optimization. The fundamental issue with

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: Heikki Linnakangas [EMAIL PROTECTED] writes: Also, now that we have concurrent CREATE INDEX, we could implement concurrent REINDEX as well, I believe. That's probably more easily said than done --- in particular, I don't understand what the committed

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: That's probably more easily said than done --- in particular, I don't understand what the committed state after the first transaction would look like. I think you build a whole new index named something like

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Csaba Nagy
I think you build a whole new index named something like .temp-reindex and then as the last step of the second transaction delete the old idnex and rename the new index. That would require getting exclusive lock on the table. Just out of curiosity, creating a new index concurrently (or

Re: [HACKERS] Block B-Tree concept

2006-09-27 Thread Jim C. Nasby
On Wed, Sep 27, 2006 at 05:38:38AM -0400, Bruce Momjian wrote: Heikki Linnakangas wrote: Jim C. Nasby wrote: Couldn't vacuum just eliminate tuples marked dead? Heck, don't we do that anyway right now? You mean _index_ tuples marked dead? Sure, no problem there. Granted, you'd

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Heikki Linnakangas
Teodor Sigaev wrote: Right now, if an index entry points to a dead tuple, we set a bit in the index so future lookups do not access the heap. We could set a bit for block index entries that point to a page that has no live rows, and have vacuum remove the index entry later. GIN don't

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Bruce Momjian
Heikki Linnakangas wrote: Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I've been experimenting with the idea of a so-called Block B-Tree. The basic idea is that instead of storing an index tuple for each heap tuple, we store an index tuple for each heap block. This

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Bruce Momjian
Teodor Sigaev wrote: Right now, if an index entry points to a dead tuple, we set a bit in the index so future lookups do not access the heap. We could set a bit for block index entries that point to a page that has no live rows, and have vacuum remove the index entry later. GIN don't

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I've been experimenting with the idea of a so-called Block B-Tree. The basic idea is that instead of storing an index tuple for each heap tuple, we store an index tuple for each heap block. This dramatically reduces the size of

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: VACUUM? There's a few options that I've thought of this far: 1. Whenever a tuple is found dead on page X, vacuum of the index will have to go to that page again to see if there's any matching tuples left. Anything that involves

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: I've been experimenting with the idea of a so-called Block B-Tree. The basic idea is that instead of storing an index tuple for each heap tuple, we store an index tuple for each heap block. This dramatically reduces the size of an index, leading

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Csaba Nagy
And we're back to routine REINDEX I guess :-(. This doesn't seem like a satisfactory answer. If the reindex works online, it could be a satisfactory solution. Cheers, Csaba. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet,

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Teodor Sigaev
Right now, if an index entry points to a dead tuple, we set a bit in the index so future lookups do not access the heap. We could set a bit for block index entries that point to a page that has no live rows, and have vacuum remove the index entry later. GIN don't support this feature... --

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Heikki Linnakangas
Tom Lane wrote: Anything that involves having VACUUM re-evaluate index expressions is a nonstarter ... or have you already forgotten the optimizations we put into 8.2 that assume, eg, no sub-transactions within a VACUUM? Umm, I'm afraid I have. Could you give me a clue? 3. Do nothing. Let

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Tom Lane wrote: Anything that involves having VACUUM re-evaluate index expressions is a nonstarter ... or have you already forgotten the optimizations we put into 8.2 that assume, eg, no sub-transactions within a VACUUM? Umm, I'm afraid I have. Could you give me a

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Heikki Linnakangas wrote: Tom Lane wrote: Anything that involves having VACUUM re-evaluate index expressions is a nonstarter ... or have you already forgotten the optimizations we put into 8.2 that assume, eg, no sub-transactions within a VACUUM?

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Also, now that we have concurrent CREATE INDEX, we could implement concurrent REINDEX as well, I believe. That's probably more easily said than done --- in particular, I don't understand what the committed state after the first transaction would

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Jim C. Nasby
On Tue, Sep 26, 2006 at 11:16:54AM +0100, Heikki Linnakangas wrote: To locate the actual matching items on the heap page, we have to scan the heap page because we don't have the item ids, so this is a tradeoff between CPU and I/O. However, we could have a hybrid where we initially store

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Jim C. Nasby
On Tue, Sep 26, 2006 at 05:27:56PM +0100, Heikki Linnakangas wrote: Heikki Linnakangas wrote: Tom Lane wrote: Anything that involves having VACUUM re-evaluate index expressions is a nonstarter ... or have you already forgotten the optimizations we put into 8.2 that assume, eg, no

Re: [HACKERS] Block B-Tree concept

2006-09-26 Thread Jim C. Nasby
On Tue, Sep 26, 2006 at 08:51:10AM -0400, Tom Lane wrote: 3. Do nothing. Let index scans mark the index tuple as dead when it's convenient. There's no correctness problem with just leaving dead index tuples there, because you have to check the index quals on each heap tuple anyway when