Re: [HACKERS] vacuum, performance, and MVCC

2006-06-29 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-06-27 kell 12:16, kirjutas Bruce Momjian: Hannu Krosing wrote: ?hel kenal p?eval, T, 2006-06-27 kell 10:38, kirjutas Hannu Krosing: ?hel kenal p?eval, E, 2006-06-26 kell 23:08, kirjutas Bruce Momjian: Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-29 Thread Bruce Momjian
Hannu Krosing wrote: But we still have to think about similar cases (index entries pointing inside CITC chains), unless we plan to disallow adding indexes to tables. CREATE INDEX has to undo any chains where the new indexed columns change in the chain, and add index entries to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 23:08, kirjutas Bruce Momjian: Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread PFC
My idea is that if an UPDATE places the new tuple on the same page as the old tuple, it will not create new index entries for any indexes where the key doesn't change. Basically the idea behind preventing index bloat by updates is to have one index tuple point to several actual tuples

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-06-27 kell 10:38, kirjutas Hannu Krosing: Ühel kenal päeval, E, 2006-06-26 kell 23:08, kirjutas Bruce Momjian: Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what you are suggesting, that

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 11:31, kirjutas Bruce Momjian: Hannu Krosing wrote: pass 3: clean heap based on ctid from pass 1 If yo do it this way, you dont need to invent new data structures to pass extra info about CITC internals to passes 2 and 3 On more

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Mark Woodward
Ühel kenal päeval, E, 2006-06-26 kell 09:10, kirjutas Mark Woodward: Ãœhel kenal päeval, R, 2006-06-23 kell 17:27, kirjutas Bruce Momjian: Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Mark Woodward
On Fri, Jun 23, 2006 at 06:37:01AM -0400, Mark Woodward wrote: While we all know session data is, at best, ephemeral, people still want some sort of persistence, thus, you need a database. For mcache I have a couple plugins that have a wide range of opitions, from read/write at startup and

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Martijn van Oosterhout
On Mon, Jun 26, 2006 at 11:29:27AM -0400, Bruce Momjian wrote: Yes, and for index_getmulti (which doesn't visit the heap at all) we'll have to change all the users of that (which aren't many, I suppose). It's probably worth making a utility function to expand them. I'm still confused

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Bruce Momjian
Hannu Krosing wrote: ?hel kenal p?eval, T, 2006-06-27 kell 10:38, kirjutas Hannu Krosing: ?hel kenal p?eval, E, 2006-06-26 kell 23:08, kirjutas Bruce Momjian: Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Bruce Momjian
PFC wrote: My idea is that if an UPDATE places the new tuple on the same page as the old tuple, it will not create new index entries for any indexes where the key doesn't change. Basically the idea behind preventing index bloat by updates is to have one index tuple point to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Jim C. Nasby
On Mon, Jun 26, 2006 at 11:08:24PM -0400, Bruce Momjian wrote: Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index access

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Bruce Momjian
Martijn van Oosterhout wrote: -- Start of PGP signed section. On Mon, Jun 26, 2006 at 11:29:27AM -0400, Bruce Momjian wrote: Yes, and for index_getmulti (which doesn't visit the heap at all) we'll have to change all the users of that (which aren't many, I suppose). It's probably worth

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Bruce Momjian
Jim C. Nasby wrote: Perhaps my point got lost... in the case where no index keys change during an update, SITC seems superior in every way to my proposal. My idea (let's call it Index Tuple Page Consolidation, ITPC) would be beneficial to UPDATEs that modify one or more index keys but

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Jim C. Nasby
On Tue, Jun 27, 2006 at 10:42:54AM +0200, PFC wrote: Also, I insist (again) that there is a lot to gain by using a bit of compression on the data pages, even if it's very simple compression like storing the new version of a row as a difference from the previous version (ie. only

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Greg Stark
Bruce Momjian [EMAIL PROTECTED] writes: PFC wrote: My idea is that if an UPDATE places the new tuple on the same page as the old tuple, it will not create new index entries for any indexes where the key doesn't change. Basically the idea behind preventing index bloat by

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Bruce Momjian
Greg Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: PFC wrote: My idea is that if an UPDATE places the new tuple on the same page as the old tuple, it will not create new index entries for any indexes where the key doesn't change. Basically the idea behind

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Heikki Linnakangas
On Mon, 26 Jun 2006, Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ... you mean

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ... you mean in the current

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Alvaro Herrera wrote: Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ...

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Zeugswetter Andreas DCP SD
On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ... you mean in the current code base we

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Zeugswetter Andreas DCP SD wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Heikki Linnakangas wrote: On Mon, 26 Jun 2006, Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Mark Woodward
Ühel kenal päeval, R, 2006-06-23 kell 17:27, kirjutas Bruce Momjian: Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Martijn van Oosterhout
On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a single-index tuple, and we don't create index entries for the new tuple. Index scan routines will need to be taught

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Mark Woodward
Heikki Linnakangas wrote: On Mon, 26 Jun 2006, Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Martijn van Oosterhout wrote: -- Start of PGP signed section. On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a single-index tuple, and we don't create index

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Bruce Momjian wrote: Martijn van Oosterhout wrote: -- Start of PGP signed section. On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a single-index tuple, and we

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Bruce Momjian wrote: Bruce Momjian wrote: Martijn van Oosterhout wrote: -- Start of PGP signed section. On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 14:56, kirjutas Martijn van Oosterhout: On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a single-index tuple, and we don't

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Hannu Krosing wrote: ?hel kenal p?eval, E, 2006-06-26 kell 14:56, kirjutas Martijn van Oosterhout: On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal UPDATEs, except we set a bit on the old tuple indicating it is a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 09:10, kirjutas Mark Woodward: Ãœhel kenal päeval, R, 2006-06-23 kell 17:27, kirjutas Bruce Momjian: Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Martijn van Oosterhout
On Mon, Jun 26, 2006 at 10:50:26AM -0400, Bruce Momjian wrote: I suppose we would also change the index_getmulti() function to return a set of ctids plus flags so the caller knows to follow the chains, right? It is probably better to always return the pointer to the head of CITC

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 10:50, kirjutas Bruce Momjian: Hannu Krosing wrote: ?hel kenal p?eval, E, 2006-06-26 kell 14:56, kirjutas Martijn van Oosterhout: On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: Correct! We use the same pointers used by normal

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Hannu Krosing
Ühel kenal päeval, E, 2006-06-26 kell 16:58, kirjutas Martijn van Oosterhout: On Mon, Jun 26, 2006 at 10:50:26AM -0400, Bruce Momjian wrote: I suppose we would also change the index_getmulti() function to return a set of ctids plus flags so the caller knows to follow the chains,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Martijn van Oosterhout wrote: -- Start of PGP signed section. On Mon, Jun 26, 2006 at 10:50:26AM -0400, Bruce Momjian wrote: I suppose we would also change the index_getmulti() function to return a set of ctids plus flags so the caller knows to follow the chains, right? It is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Hannu Krosing wrote: pass 3: clean heap based on ctid from pass 1 If yo do it this way, you dont need to invent new data structures to pass extra info about CITC internals to passes 2 and 3 On more thing - when should free space map be notified about free space in pages with

Re: [HACKERS] vacuum, performance, and MVCC, and compression

2006-06-26 Thread PFC
There were some talks lately about compression. With a bit of lateral thinking I guess this can be used to contain the bloat induced by updates. Of course this is just my hypothesis. Compression in indexes : Instead of storing (value, tuple identifier) keys in the

Re: [HACKERS] vacuum, performance, and MVCC, and compression

2006-06-26 Thread Bruce Momjian
PFC wrote: There were some talks lately about compression. With a bit of lateral thinking I guess this can be used to contain the bloat induced by updates. Of course this is just my hypothesis. Compression in indexes : Instead of storing (value, tuple

Re: [HACKERS] vacuum, performance, and MVCC, and compression

2006-06-26 Thread PFC
What about increasing the size of an existing index entry? Can that be done easily when a new row is added? I'd say it looks pretty much like inserting a new index tuple... Say value is the indexed column. Find first page in the index featuring value. 1 If

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Zeugswetter Andreas DCP SD
head of the chain yet. With an index scan, finding the head is easy, but for a sequential scan, it seems more difficult, and we don't have any free space in the tail of the chain to maintain a pointer to the head. Thinking some more, there will need to be a bit to uniquely identify

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Jim C. Nasby
On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote: If you can't expire the old row because one of the indexed columns was modified, I see no reason to try to reduce the additional index entries. It won't enable early expiration, but it means less work to do on update. If

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index access routines figure out if the index qualifications still hold, but that seems like a lot of overhead. Also, once there is only one visible row in the chain,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Jim C. Nasby
On Fri, Jun 23, 2006 at 06:37:01AM -0400, Mark Woodward wrote: While we all know session data is, at best, ephemeral, people still want some sort of persistence, thus, you need a database. For mcache I have a couple plugins that have a wide range of opitions, from read/write at startup and

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Zeugswetter Andreas DCP SD wrote: head of the chain yet. With an index scan, finding the head is easy, but for a sequential scan, it seems more difficult, and we don't have any free space in the tail of the chain to maintain a pointer to the head. Thinking some more, there

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Jim C. Nasby
On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index access routines figure out if the index qualifications still hold, but that seems like a lot of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Bruce Momjian
Jim C. Nasby wrote: On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index access routines figure out if the index qualifications still hold, but

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Hannu Krosing
Ühel kenal päeval, L, 2006-06-24 kell 19:36, kirjutas Bruce Momjian: Hannu Krosing wrote: ?hel kenal p?eval, R, 2006-06-23 kell 13:08, kirjutas Tom Lane: Bottom line: there's still lots of low-hanging fruit. Why are people feeling that we need to abandon or massively complicate our

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Mark Woodward
On 6/24/2006 9:23 AM, Mark Woodward wrote: On Sat, 24 Jun 2006, Mark Woodward wrote: I'm probably mistaken, but aren't there already forward references in tuples to later versions? If so, I'm only sugesting reversing the order and referencing the latest version. I thought I understood

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Hannu Krosing wrote: Maybe we could start from reusing the index tuples which point to invisible tuples ? The index is not MVCC anyway, so maybe it is easier to do in-place replacement there ? This probably has the same obstacles which have prevented us from removing those in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/25/2006 6:52 AM, Mark Woodward wrote: On 6/24/2006 9:23 AM, Mark Woodward wrote: On Sat, 24 Jun 2006, Mark Woodward wrote: I'm probably mistaken, but aren't there already forward references in tuples to later versions? If so, I'm only sugesting reversing the order and referencing the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/25/2006 12:27 PM, Bruce Momjian wrote: Hannu Krosing wrote: Maybe we could start from reusing the index tuples which point to invisible tuples ? The index is not MVCC anyway, so maybe it is easier to do in-place replacement there ? This probably has the same obstacles which

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/24/2006 4:10 PM, Hannu Krosing wrote: Ühel kenal päeval, L, 2006-06-24 kell 15:44, kirjutas Jan Wieck: That fixes the symptom, not the problem. The problem is performance steadily degrades over time. No, you got it backwards. The performance degradation is the symptom. The

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Heikki Linnakangas
On Sat, 24 Jun 2006, Bruce Momjian wrote: Because having them be on the same page is the only way you can update the page item pointer so when you recycle the row, you the indexes are now pointing to the new version. Pages look like:

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Jan Wieck wrote: Sure, but index reuse seems a lot easier, as there is nothing additional to remember or clean out when doing it. Yes, seems so. TODO added: * Reuse index tuples that point to heap tuples that are not visible to anyone? When reusing a heap tuple you

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Heikki Linnakangas wrote: On Sat, 24 Jun 2006, Bruce Momjian wrote: Because having them be on the same page is the only way you can update the page item pointer so when you recycle the row, you the indexes are now pointing to the new version. Pages look like:

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/25/2006 2:24 PM, Bruce Momjian wrote: Jan Wieck wrote: Sure, but index reuse seems a lot easier, as there is nothing additional to remember or clean out when doing it. Yes, seems so. TODO added: * Reuse index tuples that point to heap tuples that are not visible to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Hannu Krosing
Ühel kenal päeval, P, 2006-06-25 kell 14:24, kirjutas Bruce Momjian: Jan Wieck wrote: Sure, but index reuse seems a lot easier, as there is nothing additional to remember or clean out when doing it. Yes, seems so. TODO added: * Reuse index tuples that point to heap tuples

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Hannu Krosing
Ühel kenal päeval, P, 2006-06-25 kell 06:52, kirjutas Mark Woodward: I'm not sure why vacuum can't run similarly to the way it does now. What do you mean ? Currently vacuum runs a three-step process 1) runs a full scan over heap and collects all dead tuple ctids from heap 2) run full scans

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Jan Wieck wrote: An update that results in all the same values of every indexed column of a known deleted invisible tuple. This reused tuple can by definition not be the one currently updated. So unless it is a table without a primary key, this assumes that at least 3 versions of the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Hannu Krosing wrote: ?hel kenal p?eval, P, 2006-06-25 kell 14:24, kirjutas Bruce Momjian: Jan Wieck wrote: Sure, but index reuse seems a lot easier, as there is nothing additional to remember or clean out when doing it. Yes, seems so. TODO added: * Reuse

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
bruce wrote: Why three? I explained using only two heap tuples: [item1]...[tuple1] becomes on UPDATE: -- [item1]...[tuple1][tuple2] - on another UPDATE, if tuple1 is no longer visible: --

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/25/2006 5:18 PM, Bruce Momjian wrote: Jan Wieck wrote: An update that results in all the same values of every indexed column of a known deleted invisible tuple. This reused tuple can by definition not be the one currently updated. So unless it is a table without a primary key, this

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Bruce Momjian
Jan Wieck wrote: [item1]...[tuple1] becomes on UPDATE: -- [item1]...[tuple1][tuple2] - on another UPDATE, if tuple1 is no longer visible: -- [item1]...[tuple1][tuple2]

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Jan Wieck
On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ... you mean in the current code base we have heap

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Alvaro Herrera
Jan Wieck wrote: On 6/25/2006 10:12 PM, Bruce Momjian wrote: When you are using the update chaining, you can't mark that index row as dead because it actually points to more than one row on the page, some are non-visible, some are visible. Back up the truck ... you mean in the current code

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jan Wieck
On 6/23/2006 3:10 PM, Mark Woodward wrote: This is NOT an in-place update. The whole MVCC strategy of keeping old versions around doesn't change. The only thing that does change is one level of indirection. Rather than keep references to all versions of all rows in indexes, keep only a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jan Wieck
On 6/23/2006 9:56 PM, [EMAIL PROTECTED] wrote: On Fri, Jun 23, 2006 at 03:08:34PM -0400, Bruce Momjian wrote: Tom Lane wrote: ... suggesting. We're having a hard enough time debugging and optimizing *one* storage model. I think the correct path forward is to stick with the same basic

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread mark
On Sat, Jun 24, 2006 at 03:29:47AM -0400, Jan Wieck wrote: It sounds like everybody agrees that things need to be fixed, and genuinely motivated people are trying to offer what they have to the table. One singe core team member responds vaguely in a way, you feel being supportive of your

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Heikki Linnakangas
On Fri, 23 Jun 2006, Jonah H. Harris wrote: On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: Rather than keep references to all versions of all rows in indexes, keep only a reference to the first or key row of each row, and have the first version of a row form the head of a linked list to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
On 6/23/2006 3:10 PM, Mark Woodward wrote: This is NOT an in-place update. The whole MVCC strategy of keeping old versions around doesn't change. The only thing that does change is one level of indirection. Rather than keep references to all versions of all rows in indexes, keep only a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Martijn van Oosterhout
On Sat, Jun 24, 2006 at 08:14:10AM -0400, Mark Woodward wrote: On 6/23/2006 3:10 PM, Mark Woodward wrote: This is NOT an in-place update. The whole MVCC strategy of keeping old versions around doesn't change. The only thing that does change is one level of indirection. Rather than keep

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 03:10:39PM -0400, Mark Woodward wrote: This is NOT an in-place update. The whole MVCC strategy of keeping old versions around doesn't change. The only thing that does change is one level of indirection. Rather than keep references to all versions of all rows in indexes,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Heikki Linnakangas
On Sat, 24 Jun 2006, Mark Woodward wrote: I'm probably mistaken, but aren't there already forward references in tuples to later versions? If so, I'm only sugesting reversing the order and referencing the latest version. I thought I understood your idea, but now you lost me again. I thought

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
On Sat, 24 Jun 2006, Mark Woodward wrote: I'm probably mistaken, but aren't there already forward references in tuples to later versions? If so, I'm only sugesting reversing the order and referencing the latest version. I thought I understood your idea, but now you lost me again. I thought

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jonah H. Harris
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: Currently it looks like this: ver001-ver002-ver003-...-verN That's what t_ctid does now, right? Well, that's sort of stupid. Why not have it do this: ver001-verN-...-ver003-ver002-| Heh, because that's crazy. The first time you insert a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jonah H. Harris
On 6/24/06, Jonah H. Harris [EMAIL PROTECTED] wrote: Grr... need coffee... s/c_tid/ctid/g -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: Currently it looks like this: ver001-ver002-ver003-...-verN That's what t_ctid does now, right? Well, that's sort of stupid. Why not have it do this: ver001-verN-...-ver003-ver002-| Heh, because that's crazy. The first time you insert

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jochem van Dieten
On 6/24/06, Mark Woodward wrote: ver001-verN-...-ver003-ver002-| ^-/ This will speed up almost *all* queries when there are more than two version of rows. OK, here is the behavior of an update: (1) Find the latest version of the row (2) Duplicate row and

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Martijn van Oosterhout
On Sat, Jun 24, 2006 at 09:23:28AM -0400, Mark Woodward wrote: Can you try to explain more carefully how the whole thing would work? What would an index tuple point to? What pointers would a heap tuple have? What would an index scan do to find the row version it's interested in? What

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jonah H. Harris
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: In the scenario, as previously outlined: ver001-verN-...-ver003-ver2-| ^-/ So you want to always keep an old version around? -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: In the scenario, as previously outlined: ver001-verN-...-ver003-ver2-| ^-/ So you want to always keep an old version around? Prior to vacuum, it will be there anyway, and after vacuum, the new version will

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jonah H. Harris
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: In the scenario, as previously outlined: ver001-verN-...-ver003-ver2-| ^-/ So you want to always keep an old version around? Prior to vacuum, it will be

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread PFC
What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every technique has its downsides and complications. Sure ;) MVCC generates dead rows, by its very nature ; however I see two trends in this :

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Bruce Momjian
bruce wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: In the scenario, as previously outlined: ver001-verN-...-ver003-ver2-| ^-/ So you want to always keep an old version around? Prior to vacuum, it will

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Heikki Linnakangas
On Sat, 24 Jun 2006, Bruce Momjian wrote: OK, I have an idea. Right now, an UPDATE where the old and new rows are on the same page have two index entries. What if we made only one index entry for both? We already have UPDATE chaining, where the old row points to the new one. If no key

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jan Wieck
On 6/24/2006 9:23 AM, Mark Woodward wrote: On Sat, 24 Jun 2006, Mark Woodward wrote: I'm probably mistaken, but aren't there already forward references in tuples to later versions? If so, I'm only sugesting reversing the order and referencing the latest version. I thought I understood your

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 13:08, kirjutas Tom Lane: Csaba Nagy [EMAIL PROTECTED] writes: Surprisingly its mostly WAL traffic, the heap/index pages themselves are often not yet synced to disk by time of vacuum, so no additional traffic there. If you had made 5 updates per page

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 17:27, kirjutas Bruce Momjian: Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Jan Wieck
On 6/22/2006 2:37 PM, Alvaro Herrera wrote: Adding back pgsql-hackers. Mark Woodward wrote: Mark Woodward wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is that

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Hannu Krosing
Ühel kenal päeval, L, 2006-06-24 kell 15:44, kirjutas Jan Wieck: That fixes the symptom, not the problem. The problem is performance steadily degrades over time. No, you got it backwards. The performance degradation is the symptom. The problem is that there are too many dead tuples in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Martijn van Oosterhout
On Sat, Jun 24, 2006 at 10:04:43PM +0300, Hannu Krosing wrote: Maybe we could start from reusing the index tuples which point to invisible tuples ? The index is not MVCC anyway, so maybe it is easier to do in-place replacement there ? This probably has the same obstacles which have prevented

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Christopher Browne
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Jan Wieck) wrote: On 6/22/2006 2:37 PM, Alvaro Herrera wrote: Adding back pgsql-hackers. Mark Woodward wrote: Mark Woodward wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Bruce Momjian
Heikki Linnakangas wrote: On Sat, 24 Jun 2006, Bruce Momjian wrote: OK, I have an idea. Right now, an UPDATE where the old and new rows are on the same page have two index entries. What if we made only one index entry for both? We already have UPDATE chaining, where the old row

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Alvaro Herrera
Mark Woodward wrote: The update behavior of PostgreSQL is probably the *last* serious issue. Debate all you want, vacuum mitigates the problem to varying levels, fixing the problem will be a huge win. If the update behavior gets fixed, I can't think of a single issue with postgresql that

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Bruce Momjian
Hannu Krosing wrote: ?hel kenal p?eval, R, 2006-06-23 kell 13:08, kirjutas Tom Lane: Csaba Nagy [EMAIL PROTECTED] writes: Surprisingly its mostly WAL traffic, the heap/index pages themselves are often not yet synced to disk by time of vacuum, so no additional traffic there. If you had

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Good advice, except if the table is huge :-) ... Then the table shouldn't be designed to be huge. That represents a design error. [snip] This demonstrates that archival material and active data should be kept separately. They have different access patterns; kludging them into the same

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread PFC
Well, then please help me find a better design cause I can't see one... what we have here is a big membership table of email lists. When there's a sendout then the memberships of the affected group are heavily read/updated, otherwise they are idle. None of the memberships is archive data, they

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
I suppose you have a table memberships (user_id, group_id) or something like it ; it should have as few columns as possible ; then try regularly clustering on group_id (maybe once a week) so that all the records for a particular group are close together. Getting the members of a

  1   2   >