During recent tuning of the TPC-C workload, I produced the following patch to force COPY to leave some space in each data block when it loads data into heap relations.
Rationale for this was to improve performance of random UPDATE statements against a table too large to fit in memory. That activity is joint-equal most frequent single statement type. [tpcc.Stock table, for those who know the workload]. Each UPDATE tries to insert a new row version. The blocks are all full, so each UPDATE causes this activity: read block, decide cannot insert, unlock block extend relation by 1 block relock first block, lock new block insert row into new block After the patch, most UPDATEs cause only a single block access/update, excluding the index effects. This patch is essentially the "bottom-half" of what is needed to make FILLFACTOR work for heap relations. There is no syntax/command statements coded at this time...a recoded executable was used to reload the STOCK table before testing with a fixed padding of 1000 bytes, or roughly FILLFACTOR 88. Index relations might also be added, though it was less clear to me at the time I wrote the patch that such an approach would bring performance improvement. The main discussion now is probably where the FILLFACTOR should be specified. Should it be: a) COPY table FROM .... WITH FILLFACTOR 90 which would fill datablocks for *this copy only* at 90% full b) ALTER TABLE table SET FILLFACTOR 90 which would set the table so that all COPY commands used to load data into it would load heap blocks at 90% full Implementing an extension to the COPY command seemed easier than trying to fiddle with the system catalogs to allow this extra factor be recorded for ever, when it is only used in conjunction with a COPY statement anyway. The patch was produced quickly to assist tuning efforts during Scalability & Performance benchmarking of PostgreSQL 8.0 carried out at Unisys Corporation's Mission Viejo engineering laboratory. The development was sponsored by Unisys Corporation and the patch has now been donated to the PostgreSQL community under the standard PostgreSQL/BSD licence. Approval for release of this code has been given in writing to me by the Director, Open Runtime Products, Unisys on April 8, 2005. If anybody would like to complete the work started here, I'd be most grateful - there are some other items I need to work on for 8.1 ahead of this now. Best Regards, Simon Riggs
diff -c -r postgresql-8.0.1/src/backend/access/heap/heapam.c postgresql-8.0.1-wip/src/backend/access/heap/heapam.c *** postgresql-8.0.1/src/backend/access/heap/heapam.c 2005-02-03 10:38:06.000000000 -0800 --- postgresql-8.0.1-wip/src/backend/access/heap/heapam.c 2005-02-10 04:50:33.832304712 -0800 *************** *** 1101,1107 **** * command ID. */ Oid ! heap_insert(Relation relation, HeapTuple tup, CommandId cid) { TransactionId xid = GetCurrentTransactionId(); Buffer buffer; --- 1101,1107 ---- * command ID. */ Oid ! heap_insert(Relation relation, HeapTuple tup, CommandId cid, Size requiredFreespace) { TransactionId xid = GetCurrentTransactionId(); Buffer buffer; *************** *** 1151,1157 **** heap_tuple_toast_attrs(relation, tup, NULL); /* Find buffer to insert this tuple into */ ! buffer = RelationGetBufferForTuple(relation, tup->t_len, InvalidBuffer); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); --- 1151,1157 ---- heap_tuple_toast_attrs(relation, tup, NULL); /* Find buffer to insert this tuple into */ ! buffer = RelationGetBufferForTuple(relation, tup->t_len, InvalidBuffer, requiredFreespace); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); *************** *** 1242,1255 **** * Currently, this routine differs from heap_insert only in supplying * a default command ID. But it should be used rather than using * heap_insert directly in most places where we are modifying system catalogs. */ Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! return heap_insert(relation, tup, GetCurrentCommandId()); } /* * heap_delete - delete a tuple * * NB: do not call this directly unless you are prepared to deal with --- 1242,1273 ---- * Currently, this routine differs from heap_insert only in supplying * a default command ID. But it should be used rather than using * heap_insert directly in most places where we are modifying system catalogs. + * + * Leave no freespace in block when inserting... */ Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! return heap_insert(relation, tup, GetCurrentCommandId(), zeroFreespace); } /* + * partialfill_heap_insert - insert a tuple + * + * Similar to simple_heap_insert, except for... + * + * Caller decides how much freespace to leave in page... + */ + Oid + partialfill_heap_insert(Relation relation, HeapTuple tup, Size requiredFreespace) + { + return heap_insert(relation, tup, GetCurrentCommandId(), requiredFreespace); + } + + + /* * heap_delete - delete a tuple * * NB: do not call this directly unless you are prepared to deal with *************** *** 1672,1678 **** { /* Assume there's no chance to put newtup on same page. */ newbuf = RelationGetBufferForTuple(relation, newtup->t_len, ! buffer); } else { --- 1690,1696 ---- { /* Assume there's no chance to put newtup on same page. */ newbuf = RelationGetBufferForTuple(relation, newtup->t_len, ! buffer, zeroFreespace); } else { *************** *** 1689,1695 **** */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, newtup->t_len, ! buffer); } else { --- 1707,1713 ---- */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, newtup->t_len, ! buffer, zeroFreespace); } else { diff -c -r postgresql-8.0.1/src/backend/access/heap/hio.c postgresql-8.0.1-wip/src/backend/access/heap/hio.c *** postgresql-8.0.1/src/backend/access/heap/hio.c 2005-02-03 10:38:06.000000000 -0800 --- postgresql-8.0.1-wip/src/backend/access/heap/hio.c 2005-02-10 03:54:57.000000000 -0800 *************** *** 89,95 **** */ Buffer RelationGetBufferForTuple(Relation relation, Size len, ! Buffer otherBuffer) { Buffer buffer = InvalidBuffer; Page pageHeader; --- 89,95 ---- */ Buffer RelationGetBufferForTuple(Relation relation, Size len, ! Buffer otherBuffer, Size requiredFreespace) { Buffer buffer = InvalidBuffer; Page pageHeader; *************** *** 192,198 **** */ pageHeader = (Page) BufferGetPage(buffer); pageFreeSpace = PageGetFreeSpace(pageHeader); ! if (len <= pageFreeSpace) { /* use this page as future insert target, too */ relation->rd_targblock = targetBlock; --- 192,198 ---- */ pageHeader = (Page) BufferGetPage(buffer); pageFreeSpace = PageGetFreeSpace(pageHeader); ! if (len <= (pageFreeSpace - requiredFreespace)) { /* use this page as future insert target, too */ relation->rd_targblock = targetBlock; diff -c -r postgresql-8.0.1/src/backend/commands/copy.c postgresql-8.0.1-wip/src/backend/commands/copy.c *** postgresql-8.0.1/src/backend/commands/copy.c 2005-02-03 10:38:07.000000000 -0800 --- postgresql-8.0.1-wip/src/backend/commands/copy.c 2005-02-10 05:23:04.416770800 -0800 *************** *** 1953,1959 **** /* * OK, store the tuple and create index entries for it */ ! simple_heap_insert(rel, tuple); if (resultRelInfo->ri_NumIndices > 0) ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false); --- 1955,1961 ---- /* * OK, store the tuple and create index entries for it */ ! partialfill_heap_insert(rel, tuple, copyFreespace); if (resultRelInfo->ri_NumIndices > 0) ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false); diff -c -r postgresql-8.0.1/src/backend/executor/execMain.c postgresql-8.0.1-wip/src/backend/executor/execMain.c *** postgresql-8.0.1/src/backend/executor/execMain.c 2005-02-03 10:38:07.000000000 -0800 --- postgresql-8.0.1-wip/src/backend/executor/execMain.c 2005-02-10 04:58:01.200294440 -0800 *************** *** 1293,1299 **** if (estate->es_into_relation_descriptor != NULL) { heap_insert(estate->es_into_relation_descriptor, tuple, ! estate->es_snapshot->curcid); IncrAppended(); } --- 1293,1299 ---- if (estate->es_into_relation_descriptor != NULL) { heap_insert(estate->es_into_relation_descriptor, tuple, ! estate->es_snapshot->curcid, zeroFreespace); IncrAppended(); } *************** *** 1369,1375 **** * insert the tuple */ newId = heap_insert(resultRelationDesc, tuple, ! estate->es_snapshot->curcid); IncrAppended(); (estate->es_processed)++; --- 1369,1375 ---- * insert the tuple */ newId = heap_insert(resultRelationDesc, tuple, ! estate->es_snapshot->curcid, zeroFreespace); IncrAppended(); (estate->es_processed)++; diff -c -r postgresql-8.0.1/src/include/access/heapam.h postgresql-8.0.1-wip/src/include/access/heapam.h *** postgresql-8.0.1/src/include/access/heapam.h 2005-02-03 10:38:12.000000000 -0800 --- postgresql-8.0.1-wip/src/include/access/heapam.h 2005-02-10 04:48:20.998498520 -0800 *************** *** 158,164 **** ItemPointer tid); extern void setLastTid(const ItemPointer tid); ! extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid); extern int heap_delete(Relation relation, ItemPointer tid, ItemPointer ctid, CommandId cid, Snapshot crosscheck, bool wait); extern int heap_update(Relation relation, ItemPointer otid, HeapTuple tup, --- 158,164 ---- ItemPointer tid); extern void setLastTid(const ItemPointer tid); ! extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid, Size requiredFreespace); extern int heap_delete(Relation relation, ItemPointer tid, ItemPointer ctid, CommandId cid, Snapshot crosscheck, bool wait); extern int heap_update(Relation relation, ItemPointer otid, HeapTuple tup, *************** *** 166,171 **** --- 166,172 ---- extern int heap_mark4update(Relation relation, HeapTuple tup, Buffer *userbuf, CommandId cid); + extern Oid partialfill_heap_insert(Relation relation, HeapTuple tup, Size requiredFreespace); extern Oid simple_heap_insert(Relation relation, HeapTuple tup); extern void simple_heap_delete(Relation relation, ItemPointer tid); extern void simple_heap_update(Relation relation, ItemPointer otid, *************** *** 205,208 **** --- 206,212 ---- extern void heap_freetuple(HeapTuple tuple); extern HeapTuple heap_addheader(int natts, bool withoid, Size structlen, void *structure); + #define copyFreespace ((Size) 0x000003FF) /* 1023 bytes */ + #define zeroFreespace ((Size) 0x00000000) /* 0 bytes */ + #endif /* HEAPAM_H */ diff -c -r postgresql-8.0.1/src/include/access/hio.h postgresql-8.0.1-wip/src/include/access/hio.h *** postgresql-8.0.1/src/include/access/hio.h 2005-02-03 10:38:12.000000000 -0800 --- postgresql-8.0.1-wip/src/include/access/hio.h 2005-02-10 03:52:43.000000000 -0800 *************** *** 19,24 **** extern void RelationPutHeapTuple(Relation relation, Buffer buffer, HeapTuple tuple); extern Buffer RelationGetBufferForTuple(Relation relation, Size len, ! Buffer otherBuffer); #endif /* HIO_H */ --- 19,24 ---- extern void RelationPutHeapTuple(Relation relation, Buffer buffer, HeapTuple tuple); extern Buffer RelationGetBufferForTuple(Relation relation, Size len, ! Buffer otherBuffer, Size requiredFreespace); #endif /* HIO_H */
---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings