Re: Speeding up GIST index creation for tsvectors

John Naylor Wed, 10 Mar 2021 10:09:10 -0800

On Mon, Mar 8, 2021 at 8:43 AM Amit Khandekar <[email protected]>
wrote:
>
> On Wed, 3 Mar 2021 at 23:32, John Naylor <[email protected]>
wrote:


> > 0001:
> >
> > + /*
> > + * We can process 64-bit chunks only if both are mis-aligned by the
same
> > + * number of bytes.
> > + */
> > + if (b_aligned - b == a_aligned - a)
> >
> > The obvious question here is: how often are they identically
misaligned? You
> > don't indicate that your measurements differ in a bimodal fashion, so
does
> > that mean they happen to be always (mis)aligned?
>
> I ran CREATE INDEX on tsvector columns using the tsearch.data that I
> had attached upthread, with some instrumentation; here are the
> proportions :
> 1. In 15% of the cases, only one among a and b was aligned. The other
> was offset from the 8-byte boundary by 4 bytes.
> 2. 6% of the cases, both were offset by 4 bytes, i.e. identically
misaligned.
> 3. Rest of the cases, both were aligned.

That's somewhat strange. I would have assumed always aligned, or usually
not. It sounds like we could require them both to be aligned and not bother
with the byte-by-byte prologue. I also wonder if it's worth it to memcpy to
a new buffer if the passed pointer is not aligned.

> > + /* For smaller lengths, do simple byte-by-byte traversal */
> > + if (bytes <= 32)
> >
> > You noted upthread:
> >
> > > Since for the gist index creation of some of these types the default
> > > value for siglen is small (8-20), I tested with small siglens. For
> > > siglens <= 20, particularly for values that are not multiples of 8
> > > (e.g. 10, 13, etc), I see a  1-7 % reduction in speed of index
> > > creation. It's probably because of
> > > an extra function call for pg_xorcount(); and also might be due to the
> > > extra logic in pg_xorcount() which becomes prominent for shorter
> > > traversals. So for siglen less than 32, I kept the existing method
> > > using byte-by-byte traversal.
> >
> > I wonder if that can be addressed directly, while cleaning up the loops
and
> > alignment checks in pg_xorcount_long() a little. For example, have a
look at
> > pg_crc32c_armv8.c -- it might be worth testing a similar approach.
>
> Yeah, we can put the bytes <= 32 condition inside pg_xorcount_long().
> I avoided that to not hamper the <= 32 scenarios. Details explained
> below for "why inline pg_xorcount is calling global function"
>
> > Also, pardon my ignorance, but where can I find the default siglen for
various types?
> Check SIGLEN_DEFAULT.

Okay, so it's hard-coded in various functions in contrib modules. In that
case, let's just keep the existing coding for those. In fact, the comments
that got removed by your patch specifically say:

/* Using the popcount functions here isn't likely to win */

...which your testing confirmed. The new function should be used only for
Gist and possibly Intarray, whose default is 124. That means we can't get
rid of hemdistsign(), but that's fine. Alternatively, we could say that a
small regression is justifiable reason to refactor all call sites, but I'm
not proposing that.

> > (I did make sure to remove  indirect calls  from the retail functions
> > in [1], in case we want to go that route).
>
> Yeah, I quickly had a look at that. I am still going over that thread.
> Thanks for the exhaustive analysis there.

I'll post a patch soon that builds on that, so you can see what I mean.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Speeding up GIST index creation for tsvectors

Reply via email to