I wrote: > After looking a bit at gist and sp-gist, neither of them would find that > terribly convenient; they really want to create one blob of memory per > index entry so as to not complicate storage management too much. But > they'd be fine with making that blob be a HeapTuple not IndexTuple. > So maybe the right approach is to expand the existing API to allow the > AM to return *either* a heap or index tuple; that could be made to not > be an API break.
Here's a draft patch along those lines. With this approach, btree doesn't need to be touched at all, since what it's returning certainly is an IndexTuple anyway. I fixed both SPGIST and GIST to use HeapTuple return format. It's not very clear to me whether GIST has a similar hazard with very large return values, but it might, and it's simple enough to change. regards, tom lane
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index 40f201b..d4af010 100644 *** a/doc/src/sgml/indexam.sgml --- b/doc/src/sgml/indexam.sgml *************** amgettuple (IndexScanDesc scan, *** 535,549 **** <para> If the index supports <link linkend="indexes-index-only-scans">index-only scans</link> (i.e., <function>amcanreturn</function> returns TRUE for it), ! then on success the AM must also check ! <literal>scan->xs_want_itup</>, and if that is true it must return ! the original indexed data for the index entry, in the form of an <structname>IndexTuple</> pointer stored at <literal>scan->xs_itup</>, ! with tuple descriptor <literal>scan->xs_itupdesc</>. ! (Management of the data referenced by the pointer is the access method's responsibility. The data must remain good at least until the next <function>amgettuple</>, <function>amrescan</>, or <function>amendscan</> ! call for the scan.) </para> <para> --- 535,553 ---- <para> If the index supports <link linkend="indexes-index-only-scans">index-only scans</link> (i.e., <function>amcanreturn</function> returns TRUE for it), ! then on success the AM must also check <literal>scan->xs_want_itup</>, ! and if that is true it must return the originally indexed data for the ! index entry. The data can be returned in the form of an <structname>IndexTuple</> pointer stored at <literal>scan->xs_itup</>, ! with tuple descriptor <literal>scan->xs_itupdesc</>; or in the form of ! a <structname>HeapTuple</> pointer stored at <literal>scan->xs_hitup</>, ! with tuple descriptor <literal>scan->xs_hitupdesc</>. (The latter ! format should be used when reconstructing data that might possibly not fit ! into an IndexTuple.) In either case, ! management of the data referenced by the pointer is the access method's responsibility. The data must remain good at least until the next <function>amgettuple</>, <function>amrescan</>, or <function>amendscan</> ! call for the scan. </para> <para> diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c index eea366b..122dc38 100644 *** a/src/backend/access/gist/gistget.c --- b/src/backend/access/gist/gistget.c *************** gistScanPage(IndexScanDesc scan, GISTSea *** 441,452 **** so->pageData[so->nPageData].offnum = i; /* ! * In an index-only scan, also fetch the data from the tuple. */ if (scan->xs_want_itup) { oldcxt = MemoryContextSwitchTo(so->pageDataCxt); ! so->pageData[so->nPageData].ftup = gistFetchTuple(giststate, r, it); MemoryContextSwitchTo(oldcxt); } --- 441,453 ---- so->pageData[so->nPageData].offnum = i; /* ! * In an index-only scan, also fetch the data from the tuple. The ! * reconstructed tuples are stored in pageDataCxt. */ if (scan->xs_want_itup) { oldcxt = MemoryContextSwitchTo(so->pageDataCxt); ! so->pageData[so->nPageData].recontup = gistFetchTuple(giststate, r, it); MemoryContextSwitchTo(oldcxt); } *************** gistScanPage(IndexScanDesc scan, GISTSea *** 478,484 **** * In an index-only scan, also fetch the data from the tuple. */ if (scan->xs_want_itup) ! item->data.heap.ftup = gistFetchTuple(giststate, r, it); } else { --- 479,485 ---- * In an index-only scan, also fetch the data from the tuple. */ if (scan->xs_want_itup) ! item->data.heap.recontup = gistFetchTuple(giststate, r, it); } else { *************** getNextNearest(IndexScanDesc scan) *** 540,550 **** bool res = false; int i; ! if (scan->xs_itup) { /* free previously returned tuple */ ! pfree(scan->xs_itup); ! scan->xs_itup = NULL; } do --- 541,551 ---- bool res = false; int i; ! if (scan->xs_hitup) { /* free previously returned tuple */ ! pfree(scan->xs_hitup); ! scan->xs_hitup = NULL; } do *************** getNextNearest(IndexScanDesc scan) *** 601,607 **** /* in an index-only scan, also return the reconstructed tuple. */ if (scan->xs_want_itup) ! scan->xs_itup = item->data.heap.ftup; res = true; } else --- 602,608 ---- /* in an index-only scan, also return the reconstructed tuple. */ if (scan->xs_want_itup) ! scan->xs_hitup = item->data.heap.recontup; res = true; } else *************** gistgettuple(IndexScanDesc scan, ScanDir *** 685,691 **** /* in an index-only scan, also return the reconstructed tuple */ if (scan->xs_want_itup) ! scan->xs_itup = so->pageData[so->curPageData].ftup; so->curPageData++; --- 686,692 ---- /* in an index-only scan, also return the reconstructed tuple */ if (scan->xs_want_itup) ! scan->xs_hitup = so->pageData[so->curPageData].recontup; so->curPageData++; diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c index 33b3889..81ff8fc 100644 *** a/src/backend/access/gist/gistscan.c --- b/src/backend/access/gist/gistscan.c *************** gistrescan(IndexScanDesc scan, ScanKey k *** 155,161 **** * tuple descriptor to represent the returned index tuples and create a * memory context to hold them during the scan. */ ! if (scan->xs_want_itup && !scan->xs_itupdesc) { int natts; int attno; --- 155,161 ---- * tuple descriptor to represent the returned index tuples and create a * memory context to hold them during the scan. */ ! if (scan->xs_want_itup && !scan->xs_hitupdesc) { int natts; int attno; *************** gistrescan(IndexScanDesc scan, ScanKey k *** 174,181 **** scan->indexRelation->rd_opcintype[attno - 1], -1, 0); } ! scan->xs_itupdesc = so->giststate->fetchTupdesc; so->pageDataCxt = AllocSetContextCreate(so->giststate->scanCxt, "GiST page data context", ALLOCSET_DEFAULT_SIZES); --- 174,182 ---- scan->indexRelation->rd_opcintype[attno - 1], -1, 0); } ! scan->xs_hitupdesc = so->giststate->fetchTupdesc; + /* Also create a memory context that will hold the returned tuples */ so->pageDataCxt = AllocSetContextCreate(so->giststate->scanCxt, "GiST page data context", ALLOCSET_DEFAULT_SIZES); diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c index f92baed..75845ba 100644 *** a/src/backend/access/gist/gistutil.c --- b/src/backend/access/gist/gistutil.c *************** gistFetchAtt(GISTSTATE *giststate, int n *** 624,632 **** /* * Fetch all keys in tuple. ! * returns new IndexTuple that contains GISTENTRY with fetched data */ ! IndexTuple gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple) { MemoryContext oldcxt = MemoryContextSwitchTo(giststate->tempCxt); --- 624,632 ---- /* * Fetch all keys in tuple. ! * Returns a new HeapTuple containing the originally-indexed data. */ ! HeapTuple gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple) { MemoryContext oldcxt = MemoryContextSwitchTo(giststate->tempCxt); *************** gistFetchTuple(GISTSTATE *giststate, Rel *** 660,666 **** } MemoryContextSwitchTo(oldcxt); ! return index_form_tuple(giststate->fetchTupdesc, fetchatt, isnull); } float --- 660,666 ---- } MemoryContextSwitchTo(oldcxt); ! return heap_form_tuple(giststate->fetchTupdesc, fetchatt, isnull); } float diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c index c4a393f..3599476 100644 *** a/src/backend/access/index/genam.c --- b/src/backend/access/index/genam.c *************** RelationGetIndexScan(Relation indexRelat *** 119,124 **** --- 119,126 ---- scan->xs_itup = NULL; scan->xs_itupdesc = NULL; + scan->xs_hitup = NULL; + scan->xs_hitupdesc = NULL; ItemPointerSetInvalid(&scan->xs_ctup.t_self); scan->xs_ctup.t_data = NULL; diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c index 4822af9..89d1971 100644 *** a/src/backend/access/index/indexam.c --- b/src/backend/access/index/indexam.c *************** index_getnext_tid(IndexScanDesc scan, Sc *** 409,416 **** /* * The AM's amgettuple proc finds the next index entry matching the scan * keys, and puts the TID into scan->xs_ctup.t_self. It should also set ! * scan->xs_recheck and possibly scan->xs_itup, though we pay no attention ! * to those fields here. */ found = scan->indexRelation->rd_amroutine->amgettuple(scan, direction); --- 409,416 ---- /* * The AM's amgettuple proc finds the next index entry matching the scan * keys, and puts the TID into scan->xs_ctup.t_self. It should also set ! * scan->xs_recheck and possibly scan->xs_itup/scan->xs_hitup, though we ! * pay no attention to those fields here. */ found = scan->indexRelation->rd_amroutine->amgettuple(scan, direction); diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c index 139d998..2d96c00 100644 *** a/src/backend/access/spgist/spgscan.c --- b/src/backend/access/spgist/spgscan.c *************** resetSpGistScanOpaque(SpGistScanOpaque s *** 92,102 **** if (so->want_itup) { ! /* Must pfree IndexTuples to avoid memory leak */ int i; for (i = 0; i < so->nPtrs; i++) ! pfree(so->indexTups[i]); } so->iPtr = so->nPtrs = 0; } --- 92,102 ---- if (so->want_itup) { ! /* Must pfree reconstructed tuples to avoid memory leak */ int i; for (i = 0; i < so->nPtrs; i++) ! pfree(so->reconTups[i]); } so->iPtr = so->nPtrs = 0; } *************** spgbeginscan(Relation rel, int keysz, in *** 195,202 **** "SP-GiST search temporary context", ALLOCSET_DEFAULT_SIZES); ! /* Set up indexTupDesc and xs_itupdesc in case it's an index-only scan */ ! so->indexTupDesc = scan->xs_itupdesc = RelationGetDescr(rel); scan->opaque = so; --- 195,202 ---- "SP-GiST search temporary context", ALLOCSET_DEFAULT_SIZES); ! /* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */ ! so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel); scan->opaque = so; *************** storeGettuple(SpGistScanOpaque so, ItemP *** 591,602 **** if (so->want_itup) { /* ! * Reconstruct desired IndexTuple. We have to copy the datum out of ! * the temp context anyway, so we may as well create the tuple here. */ ! so->indexTups[so->nPtrs] = index_form_tuple(so->indexTupDesc, ! &leafValue, ! &isnull); } so->nPtrs++; } --- 591,602 ---- if (so->want_itup) { /* ! * Reconstruct index data. We have to copy the datum out of the temp ! * context anyway, so we may as well create the tuple here. */ ! so->reconTups[so->nPtrs] = heap_form_tuple(so->indexTupDesc, ! &leafValue, ! &isnull); } so->nPtrs++; } *************** spggettuple(IndexScanDesc scan, ScanDire *** 619,636 **** /* continuing to return tuples from a leaf page */ scan->xs_ctup.t_self = so->heapPtrs[so->iPtr]; scan->xs_recheck = so->recheck[so->iPtr]; ! scan->xs_itup = so->indexTups[so->iPtr]; so->iPtr++; return true; } if (so->want_itup) { ! /* Must pfree IndexTuples to avoid memory leak */ int i; for (i = 0; i < so->nPtrs; i++) ! pfree(so->indexTups[i]); } so->iPtr = so->nPtrs = 0; --- 619,636 ---- /* continuing to return tuples from a leaf page */ scan->xs_ctup.t_self = so->heapPtrs[so->iPtr]; scan->xs_recheck = so->recheck[so->iPtr]; ! scan->xs_hitup = so->reconTups[so->iPtr]; so->iPtr++; return true; } if (so->want_itup) { ! /* Must pfree reconstructed tuples to avoid memory leak */ int i; for (i = 0; i < so->nPtrs; i++) ! pfree(so->reconTups[i]); } so->iPtr = so->nPtrs = 0; diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c index ddef3a4..62bdeb3 100644 *** a/src/backend/executor/nodeIndexonlyscan.c --- b/src/backend/executor/nodeIndexonlyscan.c *************** IndexOnlyNext(IndexOnlyScanState *node) *** 144,152 **** } /* ! * Fill the scan tuple slot with data from the index. */ ! StoreIndexTuple(slot, scandesc->xs_itup, scandesc->xs_itupdesc); /* * If the index was lossy, we have to recheck the index quals. --- 144,169 ---- } /* ! * Fill the scan tuple slot with data from the index. This might be ! * provided in either HeapTuple or IndexTuple format. Conceivably an ! * index AM might fill both fields, in which case we prefer the heap ! * format, since it's probably a bit cheaper to fill a slot from. */ ! if (scandesc->xs_hitup) ! { ! /* ! * We don't take the trouble to verify that the provided tuple has ! * exactly the slot's format, but it seems worth doing a quick ! * check on the number of fields. ! */ ! Assert(slot->tts_tupleDescriptor->natts == ! scandesc->xs_hitupdesc->natts); ! ExecStoreTuple(scandesc->xs_hitup, slot, InvalidBuffer, false); ! } ! else if (scandesc->xs_itup) ! StoreIndexTuple(slot, scandesc->xs_itup, scandesc->xs_itupdesc); ! else ! elog(ERROR, "no data returned for index-only scan"); /* * If the index was lossy, we have to recheck the index quals. diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h index 72f52d4..6522336 100644 *** a/src/include/access/gist_private.h --- b/src/include/access/gist_private.h *************** typedef struct GISTSearchHeapItem *** 120,126 **** ItemPointerData heapPtr; bool recheck; /* T if quals must be rechecked */ bool recheckDistances; /* T if distances must be rechecked */ ! IndexTuple ftup; /* data fetched back from the index, used in * index-only scans */ OffsetNumber offnum; /* track offset in page to mark tuple as * LP_DEAD */ --- 120,126 ---- ItemPointerData heapPtr; bool recheck; /* T if quals must be rechecked */ bool recheckDistances; /* T if distances must be rechecked */ ! HeapTuple recontup; /* data reconstructed from the index, used in * index-only scans */ OffsetNumber offnum; /* track offset in page to mark tuple as * LP_DEAD */ *************** extern void gistMakeUnionItVec(GISTSTATE *** 529,535 **** extern bool gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b); extern void gistDeCompressAtt(GISTSTATE *giststate, Relation r, IndexTuple tuple, Page p, OffsetNumber o, GISTENTRY *attdata, bool *isnull); ! extern IndexTuple gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple); extern void gistMakeUnionKey(GISTSTATE *giststate, int attno, GISTENTRY *entry1, bool isnull1, --- 529,535 ---- extern bool gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b); extern void gistDeCompressAtt(GISTSTATE *giststate, Relation r, IndexTuple tuple, Page p, OffsetNumber o, GISTENTRY *attdata, bool *isnull); ! extern HeapTuple gistFetchTuple(GISTSTATE *giststate, Relation r, IndexTuple tuple); extern void gistMakeUnionKey(GISTSTATE *giststate, int attno, GISTENTRY *entry1, bool isnull1, diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h index 8746045..8635f83 100644 *** a/src/include/access/relscan.h --- b/src/include/access/relscan.h *************** typedef struct IndexScanDescData *** 103,111 **** /* index access method's private state */ void *opaque; /* access-method-specific info */ ! /* in an index-only scan, this is valid after a successful amgettuple */ IndexTuple xs_itup; /* index tuple returned by AM */ TupleDesc xs_itupdesc; /* rowtype descriptor of xs_itup */ /* xs_ctup/xs_cbuf/xs_recheck are valid after a successful index_getnext */ HeapTupleData xs_ctup; /* current heap tuple, if any */ --- 103,118 ---- /* index access method's private state */ void *opaque; /* access-method-specific info */ ! /* ! * In an index-only scan, a successful amgettuple call must fill either ! * xs_itup (and xs_itupdesc) or xs_hitup (and xs_hitupdesc) to provide the ! * data returned by the scan. It can fill both, in which case the heap ! * format will be used. ! */ IndexTuple xs_itup; /* index tuple returned by AM */ TupleDesc xs_itupdesc; /* rowtype descriptor of xs_itup */ + HeapTuple xs_hitup; /* index data returned by AM, as HeapTuple */ + TupleDesc xs_hitupdesc; /* rowtype descriptor of xs_hitup */ /* xs_ctup/xs_cbuf/xs_recheck are valid after a successful index_getnext */ HeapTupleData xs_ctup; /* current heap tuple, if any */ diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h index b2979a9..67d45e8 100644 *** a/src/include/access/spgist_private.h --- b/src/include/access/spgist_private.h *************** typedef struct SpGistScanOpaqueData *** 159,165 **** int iPtr; /* index for scanning through same */ ItemPointerData heapPtrs[MaxIndexTuplesPerPage]; /* TIDs from cur page */ bool recheck[MaxIndexTuplesPerPage]; /* their recheck flags */ ! IndexTuple indexTups[MaxIndexTuplesPerPage]; /* reconstructed tuples */ /* * Note: using MaxIndexTuplesPerPage above is a bit hokey since --- 159,165 ---- int iPtr; /* index for scanning through same */ ItemPointerData heapPtrs[MaxIndexTuplesPerPage]; /* TIDs from cur page */ bool recheck[MaxIndexTuplesPerPage]; /* their recheck flags */ ! HeapTuple reconTups[MaxIndexTuplesPerPage]; /* reconstructed tuples */ /* * Note: using MaxIndexTuplesPerPage above is a bit hokey since
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers