Re: [PATCHES] Maintaining cluster order on insert

Heikki Linnakangas Tue, 15 May 2007 15:29:05 -0700

Ah, thanks! I had forgotten about it as well.

Bruce Momjian wrote:

[ Sorry, I only found this one recently. ]
Your patch has been added to the PostgreSQL unapplied patches list at:

        http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Heikki Linnakangas wrote:
While thinking about index-organized-tables and similar ideas, it occurred to me that there's some low-hanging fruit: maintaining cluster order on inserts by trying to place new heap tuples close to other similar tuples. That involves asking the index am where on the heap the new tuple should go, and trying to insert it there before using the FSM. Using the new fillfactor parameter makes it more likely that there's room on the page. We don't worry about the order within the page.
The API I'm thinking of introduces a new optional index am function, amsuggestblock (suggestions for a better name are welcome). It gets the same parameters as aminsert, and returns the heap block number that would be the optimal place to put the new tuple. It would be called from ExecInsert before inserting the heap tuple, and the suggestion is passed on to heap_insert and RelationGetBufferForTuple.
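
As a rough illustration of that flow (not the attached patch), the following self-contained C sketch models "try the suggested block first, then fall back to a free-space search". Every name in it (ToyHeap, toy_fsm_search, toy_get_block_for_tuple, toy_heap_insert) is hypothetical, the heap is reduced to an array of free-byte counters, and the linear scan is only a stand-in for the FSM. Unlike the real heap code, which can extend the relation, this toy version simply reports failure when no block has room.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no block" (e.g. the index had no suggestion). */
#define TOY_INVALID_BLOCK ((unsigned) -1)

/* Toy heap: each block only tracks how many free bytes it has left. */
typedef struct ToyHeap
{
    size_t     *free_bytes;     /* free space per block */
    unsigned    nblocks;        /* number of blocks in the toy heap */
} ToyHeap;

/* Stand-in for the FSM lookup: first block with enough free space. */
unsigned
toy_fsm_search(const ToyHeap *heap, size_t needed)
{
    for (unsigned blk = 0; blk < heap->nblocks; blk++)
    {
        if (heap->free_bytes[blk] >= needed)
            return blk;
    }
    return TOY_INVALID_BLOCK;
}

/* Prefer the suggested block if it is valid and has room; otherwise fall back. */
unsigned
toy_get_block_for_tuple(const ToyHeap *heap, size_t needed, unsigned suggested)
{
    if (suggested != TOY_INVALID_BLOCK &&
        suggested < heap->nblocks &&
        heap->free_bytes[suggested] >= needed)
        return suggested;

    return toy_fsm_search(heap, needed);
}

/* Reserve space for one tuple; returns the chosen block or TOY_INVALID_BLOCK. */
unsigned
toy_heap_insert(ToyHeap *heap, size_t tuple_len, unsigned suggested)
{
    unsigned    blk;

    assert(heap != NULL);
    blk = toy_get_block_for_tuple(heap, tuple_len, suggested);
    if (blk != TOY_INVALID_BLOCK)
        heap->free_bytes[blk] -= tuple_len;
    return blk;
}
```

A caller would obtain the suggestion from the index first and pass it down, using TOY_INVALID_BLOCK when the index has nothing to suggest; the insert then behaves exactly as it would without the feature.
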
I wrote a little patch to implement this for btree, attached.
This could be optimized by changing the existing aminsert API, because as it is, an insert will have to descend the btree twice: once in amsuggestblock and then in aminsert. amsuggestblock could keep the right index page pinned so aminsert could locate it quicker. But I wanted to keep this simple for now. Another improvement might be to allow amsuggestblock to return a list of suggestions, but that makes it more expensive to insert if there isn't room in the suggested pages, since heap_insert will have to try them all before giving up.
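
The cost of a list of suggestions can be made concrete with a small C sketch. It is only a toy model under assumed names (sketch_try_suggestions and SKETCH_NO_BLOCK are hypothetical, and pages are again reduced to free-byte counters): it walks the suggested blocks in order and counts how many pages had to be examined before one fit or the list ran out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel: none of the suggested blocks had room. */
#define SKETCH_NO_BLOCK ((unsigned) -1)

/*
 * Try each suggested block in order and return the first one with at
 * least 'needed' free bytes.  *probes is set to the number of pages
 * that were actually examined, which is the extra work paid when the
 * suggestions turn out to be full.
 */
unsigned
sketch_try_suggestions(const size_t *free_bytes, unsigned nblocks,
                       const unsigned *suggested, unsigned nsuggested,
                       size_t needed, unsigned *probes)
{
    assert(probes != NULL);
    *probes = 0;

    for (unsigned i = 0; i < nsuggested; i++)
    {
        unsigned    blk = suggested[i];

        if (blk >= nblocks)
            continue;           /* ignore suggestions outside the heap */

        (*probes)++;
        if (free_bytes[blk] >= needed)
            return blk;
    }
    return SKETCH_NO_BLOCK;     /* caller falls back to its normal search */
}
```

In the worst case every suggested page is examined and rejected before the fallback even starts, which is the overhead described above; a single suggestion caps that cost at one extra page.
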
Comments regarding the general idea or the patch? There should probably be an index option to turn the feature on and off. You'll want to turn it off when you first load a table, and turn it on after CLUSTER to keep it clustered.
Since there's been discussion on keeping the TODO list more up-to-date, I hereby officially claim the "Automatically maintain clustering on a table" TODO item :). Feel free to bombard me with requests for status reports. And just to be clear, I'm not trying to sneak this into 8.2 anymore, this is 8.3 stuff.
I won't be implementing the background daemon described on the TODO item, since that would essentially be an online version of CLUSTER. Which sure would be nice, but that's a different story.
- Heikki



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
