Re: [HACKERS] GiST insert algorithm rewrite

2011-01-09 Thread Heikki Linnakangas
On 09.01.2011 07:05, Tom Lane wrote: I just found out that the benchmark test script in contrib/intarray/bench/ crashes HEAD in gistdoinsert() --- it looks like it's trying to pop to a stack entry that isn't there. Run it per the instructions in the intarray documentation: $ createdb TEST $

Re: [HACKERS] GiST insert algorithm rewrite

2011-01-08 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21.12.2010 20:00, Heikki Linnakangas wrote: One final version, with a bug fix wrt. root page split and some cleanup. I'm planning to commit this before Christmas. It's a big patch, so review would be much appreciated.

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-23 Thread Heikki Linnakangas
On 21.12.2010 20:00, Heikki Linnakangas wrote: One final version, with a bug fix wrt. root page split and some cleanup. I'm planning to commit this before Christmas. It's a big patch, so review would be much appreciated. Committed. Phew! Review testing is of course still appreciated, given

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Robert Haas
On Mon, Dec 13, 2010 at 7:09 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Attached is an updated patch, but that issue with limited number of backup blocks needs to be resolved. The straightforward way would be to change the WAL format to increase the limit. Eh, is that

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Heikki Linnakangas
On 13.12.2010 15:04, Robert Haas wrote: On Mon, Dec 13, 2010 at 7:09 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Attached is an updated patch, but that issue with limited number of backup blocks needs to be resolved. The straightforward way would be to change the WAL

Increasing max # of backup blocks (was Re: [HACKERS] GiST insert algorithm rewrite)

2010-12-13 Thread Heikki Linnakangas
On 13.12.2010 15:20, Heikki Linnakangas wrote: On 13.12.2010 15:04, Robert Haas wrote: On Mon, Dec 13, 2010 at 7:09 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Attached is an updated patch, but that issue with limited number of backup blocks needs to be resolved. The

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: But that creates a new problem: There's a maximum of three backup blocks per WAL record, but a GiST page can be split into any number of child pages as one operation. You might run out of backup block slots. Attached is an

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Greg Stark
On Mon, Dec 13, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: I think you need to refactor the operation so that there's one WAL record per child page, or something along that line.  I concede this might be diffcult :-( If it's only the backup blocks that matter couldn't you generate

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Heikki Linnakangas
On 13.12.2010 19:19, Greg Stark wrote: On Mon, Dec 13, 2010 at 3:14 PM, Tom Lanet...@sss.pgh.pa.us wrote: I think you need to refactor the operation so that there's one WAL record per child page, or something along that line. I concede this might be diffcult :-( If it's only the backup

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 13.12.2010 19:19, Greg Stark wrote: If it's only the backup blocks that matter couldn't you generate noop WAL records with just the full page image in them. Once all those are generated then generate the actual split operation

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Heikki Linnakangas
On 13.12.2010 19:48, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: On 13.12.2010 19:19, Greg Stark wrote: If it's only the backup blocks that matter couldn't you generate noop WAL records with just the full page image in them. Once all those are generated then

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 13.12.2010 19:48, Tom Lane wrote: Yeah. Wouldn't the original page-split record have been carrying full page images already? Yes. BTW, the original split record doesn't run into the limit because it doesn't use the

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-13 Thread Heikki Linnakangas
On 13.12.2010 20:30, Tom Lane wrote: Can we fix it so that each child page is updated, and its downlink inserted, as a separate atomic action? That'd require each intermediate state to be consistent and crash-safe, but I think you really need the intermediate states to be consistent anyway

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-03 Thread Robert Haas
On Dec 3, 2010, at 4:54 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Here's an updated patch. How carefully have you perf-tested this? On closer look, supporting the invalid tuples in scans was trivial, so I kept that after all. So you can still query an index with

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 04:10, Robert Haas wrote: On Tue, Nov 30, 2010 at 10:26 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Does the current code cope with the corruption? It's not corruption, but intended degradation. Yes, the current code copes with it, that's how GiST survives

Re: [HACKERS] GiST insert algorithm rewrite

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 4:00 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 01.12.2010 04:10, Robert Haas wrote: On Tue, Nov 30, 2010 at 10:26 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com  wrote: Does the current code cope with the corruption? It's not

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Heikki Linnakangas
On 27.11.2010 21:31, Bruce Momjian wrote: Heikki Linnakangas wrote: There's no on-disk format changes, except for the additional flag in the page headers, so this does not affect pg_upgrade. However, if there's any invalid keys in the old index because of an incomplete insertion, the new code

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 11:55, Heikki Linnakangas wrote: On 27.11.2010 21:31, Bruce Momjian wrote: Heikki Linnakangas wrote: There's no on-disk format changes, except for the additional flag in the page headers, so this does not affect pg_upgrade. However, if there's any invalid keys in the old index

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 5:02 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 30.11.2010 11:55, Heikki Linnakangas wrote: On 27.11.2010 21:31, Bruce Momjian wrote: Heikki Linnakangas wrote: There's no on-disk format changes, except for the additional flag in the page

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 16:23, Robert Haas wrote: On Tue, Nov 30, 2010 at 5:02 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 30.11.2010 11:55, Heikki Linnakangas wrote: On 27.11.2010 21:31, Bruce Momjian wrote: Heikki Linnakangas wrote: There's no on-disk format changes,

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Bruce Momjian
Heikki Linnakangas wrote: Does the current code cope with the corruption? It's not corruption, but intended degradation. Yes, the current code copes with it, that's how GiST survives a crash. However, even with the current code, VACUUM will nag if it finds any invalid tuples with this

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 10:26 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Does the current code cope with the corruption? It's not corruption, but intended degradation. Yes, the current code copes with it, that's how GiST survives a crash. However, even with the current

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-27 Thread Bruce Momjian
Heikki Linnakangas wrote: There's no on-disk format changes, except for the additional flag in the page headers, so this does not affect pg_upgrade. However, if there's any invalid keys in the old index because of an incomplete insertion, the new code will not understand that. So you should

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-18 Thread Heikki Linnakangas
On 17.11.2010 19:36, Teodor Sigaev wrote: Hmm, will have to do some benchmarking on that. I'm using the Consistent function when walking down to check if the downlink needs to be updated, and assumed that it would be insignificant compared to the cost of calling Penalty on all the keys on the

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-17 Thread Teodor Sigaev
Hmm, will have to do some benchmarking on that. I'm using the Consistent function when walking down to check if the downlink needs to be updated, and assumed that it would be insignificant compared to the cost of calling Penalty on all the keys on the page. Why consistent?! It's impossible - you

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-17 Thread Teodor Sigaev
Sorry, I missed beginning of discussion on GiST, so I read it on the web mail archive. You wrote: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00939.php [skip] 0. (the child page is locked) 1. The parent page is locked. 2. The child page is split. The original page becomes the left

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-17 Thread Heikki Linnakangas
On 17.11.2010 19:46, Teodor Sigaev wrote: I disagree with that opinion: if we crash between 2 and 3 then why will somebody update parent before WAL replay? WAL replay process in this case should complete child split by inserting invalid pointer and tree become correct again, although it needs to

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: There are two key changes to the insert algorithm: 1. When we walk down the tree searching for a suitable leaf page to insert the new tuple to, we update the nodes on the way down so that they are consistent with the new key

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Heikki Linnakangas
On 16.11.2010 20:01, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: 2. When a page is split, we mark the new left page with a flag to indicate that the downlink for the page to the right hasn't been inserted yet. When the downlink is inserted, the flag is

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Robert Haas
On Tue, Nov 16, 2010 at 1:22 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: BTW, I don't try to fix incomplete splits during vacuum in the patch. That's perhaps a bit surprising, and probably would be easy to add, but I left it out for now as it's not strictly necessary.

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Heikki Linnakangas
On 16.11.2010 20:46, Robert Haas wrote: On Tue, Nov 16, 2010 at 1:22 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: BTW, I don't try to fix incomplete splits during vacuum in the patch. That's perhaps a bit surprising, and probably would be easy to add, but I left it out for

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Robert Haas
On Tue, Nov 16, 2010 at 1:50 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: If we start to enlarge the bounding boxes on the higher levels of the tree and then crash before inserting the key, is there any mechanism for getting them back down to the minimal size? No. There's

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Teodor Sigaev
they are consistent with the new key we're inserting. The old GiST algorithm adjusted all the parent pages only after inserting the tuple on the leaf. Updating them on the way down ensures that the tree is Hm, performance? while you traverse to leaf page, on each inner page you will need to call

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Oh. So do the indexes just degrade over time until they eventually need to be REINDEX'd? At some point you might reach a state where a reindex would be helpful. But the same is true of btrees. I don't think this is a serious objection, at least not

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Heikki Linnakangas
On 16.11.2010 21:20, Teodor Sigaev wrote: they are consistent with the new key we're inserting. The old GiST algorithm adjusted all the parent pages only after inserting the tuple on the leaf. Updating them on the way down ensures that the tree is Hm, performance? while you traverse to leaf

Re: [HACKERS] GiST insert algorithm rewrite

2010-11-16 Thread Robert Haas
On Tue, Nov 16, 2010 at 3:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Oh.  So do the indexes just degrade over time until they eventually need to be REINDEX'd? At some point you might reach a state where a reindex would be helpful. But the same is true