On Tue, May 6, 2014 at 12:04 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, May 5, 2014 at 2:13 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> > On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
> >> Today, I discovered that when building a btree index, the btree code
> >> uses index_form_tuple() to create an index tuple from the heap tuple,
> >> calls tuplesort_putindextuple() to copy that tuple into the sort's
> >> memory context, and then frees the original one it built. This seemed
> >> inefficient, so I wrote a patch to eliminate the tuple copying. It
> >> works by adding a function tuplesort_putindextuplevalues(), which
> >> builds the tuple in the sort's memory context and thus avoids the need
> >> for a separate copy. I'm not sure if that's the best approach, but
> >> the optimization seems worthwhile.
> >
> > Hm. It looks like we could quite easily just get rid of
> > tuplesort_putindextuple(). The hash usage doesn't look hard to convert.
>
> I glanced at that, but it wasn't obvious to me how to convert the hash
> usage. If you have an idea, I'm all ears.

I also think a similar optimization is possible for hash indexes in the
case where the tuple has to be spooled for sorting. In
hashbuildCallback(), when buildstate->spool is set, we can avoid forming
the index tuple. To check for nulls before calling _h_spool(), we can
traverse the isnull array. Converting the hash index usage is not as
straightforward as for the btree index, but it doesn't look too complex
either.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
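
For illustration only, the null check on the isnull array could look
roughly like the sketch below. It is not taken from any patch: DemoSpool,
demo_any_null() and demo_spool_if_not_null() are hypothetical stand-ins,
and the real callback would hand the values to the sort (for example via
something like the proposed tuplesort_putindextuplevalues()) instead of
bumping a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the sort spool; this is NOT PostgreSQL's HSpool. */
typedef struct DemoSpool
{
    size_t      ntuples;        /* number of rows handed to the spool */
} DemoSpool;

/* Return true if any of the natts index columns is null. */
static bool
demo_any_null(const bool *isnull, int natts)
{
    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
            return true;
    }
    return false;
}

/*
 * Hypothetical callback body: skip a row whose key contains a null,
 * otherwise pass it on to the spool without first forming an index tuple.
 * Returns true if the row was spooled.
 */
static bool
demo_spool_if_not_null(DemoSpool *spool, const bool *isnull, int natts)
{
    if (demo_any_null(isnull, natts))
        return false;

    /* Assumption: the real code would pass values/isnull to the sort here. */
    spool->ntuples++;
    return true;
}
```

The point of the sketch is only that the null test can be done on the
isnull array directly, so no index tuple has to be built just to find out
whether the row should be skipped.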