On Fri, Aug 31, 2018 at 5:53 AM Andrey Borodin <x4...@yandex-team.ru> wrote:
> Hi hackers!
>
> I've been at the database conference and here everyone is talking about cache 
> prefetches.
>
> I've tried simple hack
>
> diff --git a/src/backend/access/nbtree/nbtsearch.c 
> b/src/backend/access/nbtree/nbtsearch.c
> index d3700bd082..ffddf553aa 100644
> --- a/src/backend/access/nbtree/nbtsearch.c
> +++ b/src/backend/access/nbtree/nbtsearch.c
> @@ -401,6 +401,13 @@ _bt_binsrch(Relation rel,
>
>                 /* We have low <= mid < high, so mid points at a real slot */
>
> +               OffsetNumber x = mid + 1 + ((high - mid + 1) / 2);
> +               if (x < high)
> +                       __builtin_prefetch (PageGetItem(page, 
> PageGetItemId(page, x)), 0, 2);
> +               x = low + ((mid - low) / 2);
> +               if (x > low)
> +                       __builtin_prefetch (PageGetItem(page, 
> PageGetItemId(page, x)), 0, 2);
> +
>                 result = _bt_compare(rel, keysz, scankey, page, mid);
>
>                 if (result >= cmpval)
>
> The idea is pretty simple - our search are cache erasing anyway, let's try to 
> get at least some of it by prefetching possible ways of binary search.
> And it seems to me that on a simple query
> > insert into x select (random()*1000000)::int from generate_series(1,1e7);
> it brings something like 2-4% of performance improvement on my laptop.
>
> Is there a reason why we do not use __builtin_prefetch? Have anyone tried to 
> use cache prefetching?

A related topic is the cache-unfriendliness of traditional binary
searches of sorted data.  Check out "Array Layouts for
Comparison-Based Searching"[1] if you haven't already.  It says that
if your array fits in L2 cache, as our btree pages do, then binary
search still wins against Eytzinger et al, which I was a bit
disappointed about when I was (casually) investigating new layouts
based on some comments from a fellow hacker (I can't remember if it
was Andres or Peter G who mentioned this topic to me).  However, the
paper is talking about "branch-free binary search with explicit
prefetch", which apparently eats plain old branchy binary search for
breakfast, and I gather they tested on a lot of hardware.  That sounds
interesting to look into.

[1] https://arxiv.org/pdf/1509.05053.pdf

-- 
Thomas Munro
http://www.enterprisedb.com

Reply via email to