Re: [HACKERS] avoiding tuple copying in btree index builds

Robert Haas Mon, 05 May 2014 11:36:06 -0700

On Mon, May 5, 2014 at 2:13 PM, Andres Freund <[email protected]> wrote:
> On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
>> Today, I discovered that when building a btree index, the btree code
>> uses index_form_tuple() to create an index tuple from the heap tuple,
>> calls tuplesort_putindextuple() to copy that tuple into the sort's
>> memory context, and then frees the original one it built.  This seemed
>> inefficient, so I wrote a patch to eliminate the tuple copying.  It
>> works by adding a function tuplesort_putindextuplevalues(), which
>> builds the tuple in the sort's memory context and thus avoids the need
>> for a separate copy.  I'm not sure if that's the best approach, but
>> the optimization seems worthwhile.
>
> Hm. It looks like we could quite easily just get rid of
> tuplesort_putindextuple(). The hash usage doesn't look hard to convert.


I glanced at that, but it wasn't obvious to me how to convert the hash
usage.  If you have an idea, I'm all ears.

>> I tested it by repeatedly executing "REINDEX INDEX
>> pgbench_accounts_pkey" on a PPC64 machine.  pgbench_accounts contains
>> 10 million records.  With unpatched master as of
>> b2f7bd72c4d3e80065725c72e85778d5f4bdfd4a, I got times of 6.159s,
>> 6.177s, and 6.201s.  With the attached patch, I got times of 5.787s,
>> 5.972s, and 5.913s, a savings of almost 5%.  Not bad considering the
>> amount of work involved.
>
> Yes, that's certainly worthwhile. Nice.

Thanks.
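
For anyone skimming the thread, here is a rough toy model of the copy the
patch removes.  It is only a sketch and not the patch itself: Arena,
put_tuple_by_copy() and put_tuple_values() are hypothetical names made up
for illustration, and the arena is a simple bump allocator standing in for
the sort's memory context.  A "tuple" here is just a flat array of ints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the sort's memory context: a bump allocator. */
typedef struct Arena
{
    char   buf[1024];
    size_t used;
} Arena;

/* Counters that make the difference between the two paths visible. */
typedef struct Stats
{
    int    temp_allocs;   /* short-lived allocations made outside the arena */
    size_t bytes_copied;  /* bytes moved by the extra tuple copy */
} Stats;

static void *
arena_alloc(Arena *a, size_t n)
{
    void *p;

    assert(a->used + n <= sizeof(a->buf));
    p = a->buf + a->used;
    a->used += n;
    return p;
}

/*
 * Old-style path: form the tuple in a temporary allocation, copy it into
 * the arena, then free the temporary.
 */
static void *
put_tuple_by_copy(Arena *a, const int *values, int nvalues, Stats *st)
{
    size_t len = (size_t) nvalues * sizeof(int);
    void  *tmp = malloc(len);
    void  *dst;

    if (tmp == NULL)
        abort();
    memcpy(tmp, values, len);       /* "form" the tuple outside the arena */
    st->temp_allocs++;

    dst = arena_alloc(a, len);
    memcpy(dst, tmp, len);          /* the extra copy */
    st->bytes_copied += len;

    free(tmp);
    return dst;
}

/*
 * New-style path: form the tuple directly in the arena, so there is no
 * temporary allocation and no second copy.
 */
static void *
put_tuple_values(Arena *a, const int *values, int nvalues, Stats *st)
{
    size_t len = (size_t) nvalues * sizeof(int);
    void  *dst = arena_alloc(a, len);

    (void) st;                      /* nothing to count on this path */
    memcpy(dst, values, len);       /* form the tuple in place */
    return dst;
}
```

Both functions leave the same bytes in the arena; the second one just skips
the malloc/copy/free round trip per tuple, which is the kind of overhead the
timings above are measuring.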

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
