On Sat, Mar 27, 2021 at 6:31 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
> > On 27 March 2021, at 01:26, Thomas Munro <thomas.mu...@gmail.com> wrote:
> > , and murmurhash which is inlineable and
> > branch-free.

> I think pageno is a hash already. Why hash any further? And pages accessed 
> together will have smaller access time due to colocation.

Yeah, if clog_buffers is large enough then it's already a "perfect
hash", but if it's not then you might get some weird "harmonic"
effects (not sure if that's the right word), basically higher or lower
collision rate depending on coincidences in the data.  If you apply a
hash, the collisions should be evenly spread out so at least it'll be
somewhat consistent.  Does that make sense?
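
To illustrate the "harmonic" effect, here is a minimal standalone C sketch
(not PostgreSQL code).  It assumes the hash in question behaves like the
MurmurHash3 32-bit finalizer; mix32() and fullest_bucket() are hypothetical
names made up for this example.  Page numbers spaced exactly one table
width apart all collide when used directly, while the mixed values should
spread across buckets.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit finalizer from MurmurHash3: branch-free and easy to inline. */
static inline uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/*
 * Map npages page numbers, spaced "stride" apart, onto nbuckets buckets
 * and return how many of them land in the fullest bucket.
 */
static int
fullest_bucket(int npages, uint32_t stride, uint32_t nbuckets, int use_hash)
{
    int         counts[64] = {0};
    int         worst = 0;

    assert(nbuckets > 0 && nbuckets <= 64);
    for (int i = 0; i < npages; i++)
    {
        uint32_t    pageno = (uint32_t) i * stride;
        uint32_t    key = use_hash ? mix32(pageno) : pageno;
        int         n = ++counts[key % nbuckets];

        if (n > worst)
            worst = n;
    }
    return worst;
}
```

With 64 pages at stride 16 and 16 buckets, the unhashed variant puts every
page in bucket 0, which is the coincidence-driven worst case described
above.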

(At some point I figured out that the syscaches have lower collision
rates and perform better if you use oids directly instead of hashing
them... but then it's easy to create a pathological pattern of DDL
that turns your hash table into a linked list.  Not sure what to think
about that.)

> >  I had to tweak it to support "in-place" creation and
> > fixed size (in other words, no allocators, for use in shared memory).

> We really need to have a test to know what happens when this structure goes
> out of memory, as you mentioned below. What would be an appropriate place for
> simplehash tests?

Good questions.  This has to be based on being guaranteed to have
enough space for all of the entries, so the question is really just
"how bad can performance get with different load factors".  FWIW there
were some interesting cases with clustering when simplehash was first
used in the executor (see commits ab9f2c42 and parent) which required
some work on hashing quality to fix.
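
One way to explore the load-factor question is a toy experiment like the
following C sketch.  It is an assumption-laden stand-in, not simplehash
itself: it uses plain linear probing (simplehash uses a robin hood
variant), a fixed 1024-entry table, and hypothetical names (TABLE_SIZE,
EMPTY_KEY, mix32, average_probes).

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 1024         /* fixed capacity, power of two */
#define EMPTY_KEY  UINT32_MAX   /* hypothetical "no entry" marker */

static uint32_t table[TABLE_SIZE];

/* MurmurHash3 32-bit finalizer, used here as the hash function. */
static inline uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/* Insert keys 0..nkeys-1 and return the average probes per insertion. */
static double
average_probes(int nkeys)
{
    long        probes = 0;

    /* The caller guarantees there is room, so insertion cannot fail. */
    assert(nkeys >= 0 && nkeys < TABLE_SIZE);
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY_KEY;

    for (uint32_t key = 0; key < (uint32_t) nkeys; key++)
    {
        uint32_t    pos = mix32(key) & (TABLE_SIZE - 1);

        probes++;
        while (table[pos] != EMPTY_KEY)
        {
            /* Occupied: step to the next position, wrapping around. */
            pos = (pos + 1) & (TABLE_SIZE - 1);
            probes++;
        }
        table[pos] = key;
    }
    return nkeys > 0 ? (double) probes / nkeys : 0.0;
}
```

Comparing average_probes(256) with average_probes(972), that is roughly
25% versus 95% full, gives a feel for how quickly probe chains grow as the
table approaches capacity.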

> > Then I was annoyed that I had to add a "status" member to our struct,
> > so I tried to fix that.

> Indeed, sizeof(SlruMappingTableEntry) == 9 seems strange. Will simplehash 
> align it well?

With that "intrusive status" patch, the size is back to 8.  But I
think I made a mistake: I made it steal some key space to indicate
presence, but I think the presence test should really get access to
the whole entry so that you can encode it in more ways.  For example,
with slotno == -1.
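
As a sketch of that idea (hypothetical names and layout, not the patch's
actual struct), the presence test could look at the value field rather
than at the key or a separate status byte:

```c
#include <assert.h>
#include <stdint.h>

#define UNUSED_SLOT (-1)        /* hypothetical "entry not present" marker */

/* Hypothetical entry layout: two 4-byte fields, no separate status member. */
typedef struct MappingEntry
{
    int32_t     pageno;         /* key: SLRU page number */
    int32_t     slotno;         /* value: buffer slot, or UNUSED_SLOT */
} MappingEntry;

/* Presence is decided from the whole entry, here via slotno. */
static inline int
entry_is_present(const MappingEntry *e)
{
    return e->slotno != UNUSED_SLOT;
}

static inline void
entry_clear(MappingEntry *e)
{
    e->slotno = UNUSED_SLOT;
}

static inline void
entry_set(MappingEntry *e, int32_t pageno, int32_t slotno)
{
    assert(slotno >= 0);        /* real slots are never negative */
    e->pageno = pageno;
    e->slotno = slotno;
}
```

This keeps the full pageno range available as key space, since no key
value has to be reserved to mean "empty".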

Alright, considering the date, if we want to get this into PostgreSQL
14 it's time to make some decisions.

1.  Do we want customisable SLRU sizes in PG14?

+1 from me, we have multiple reports of performance gains from
increasing various different SLRUs, and it's easy to find workloads
that go faster.

One thought: it'd be nice if the user could *see* the current size,
when using the default.  SHOW clog_buffers -> 0 isn't very helpful if
you want to increase it, but don't know what it's currently set to.
Not sure off the top of my head how best to do that.

2.  What names do we want the GUCs to have?  Here's what we have:

Proposed GUC              Directory            System views
clog_buffers              pg_xact              Xact
multixact_offsets_buffers pg_multixact/offsets MultiXactOffset
multixact_members_buffers pg_multixact/members MultiXactMember
notify_buffers            pg_notify            Notify
serial_buffers            pg_serial            Serial
subtrans_buffers          pg_subtrans          Subtrans
commit_ts_buffers         pg_commit_ts         CommitTs

By system views, I mean pg_stat_slru, pg_shmem_allocations and
pg_stat_activity (lock names add "SLRU" on the end).


It seems obvious that "clog_buffers" should be renamed to "xact_buffers".
It's not clear whether the multixact GUCs should have the extra "s"
like the directories, or not, like the system views.
I see that we have "Shared Buffer Lookup Table" in
pg_shmem_allocations, so where I generated names like "Subtrans
Mapping Table" I should change that to "Lookup" to match.

3.  What recommendations should we make about how to set it?

I think the answer depends partially on the next questions!  I think
we should probably at least say something short about the pg_stat_slru
view (cache miss rate) and pg_stat_activity view (waits on locks), and
how to tell if you might need to increase it.  I think this probably
needs a new paragraph, separate from the docs for the individual GUC.

4.  Do we want to ship the dynahash patch?

+0.9.  The slight hesitation is that it's new code written very late
in the cycle, so it may still have bugs or unintended consequences,
and as you said, at small sizes the linear search must be faster than
the hash computation.  Could you help test it, and try to break it?
Can we quantify the scaling effect for some interesting workloads, to
see at what size the dynahash beats the linear search, so that we can
make an informed decision?  Of course, without a hash table, large
sizes will surely work badly, so it'd be tempting to restrict the size
you can set the GUC to.

If we do include the dynahash patch, then I think it would also be
reasonable to change the formula for the default, to make it higher on
large systems.  The restriction to 128 buffers (= 1MB) doesn't make
much sense on a high frequency OLTP system with 128GB of shared
buffers or even 4GB.  I think "unleashing better defaults" would
actually be bigger news than the GUC for typical users, because
they'll just see PG14 use a few extra MB and go faster without having
to learn about these obscure new settings.
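
For a sense of the arithmetic, here is a small C sketch.  It assumes the
current default is computed roughly as Min(128, Max(4, NBuffers / 512))
with 8kB blocks; that formula is my recollection rather than something
stated above, and default_slru_buffers() and its "cap" parameter are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192             /* default PostgreSQL block size */

/*
 * Assumed default: one SLRU buffer per 512 shared buffers, clamped to
 * the range [4, cap].  A cap of 128 corresponds to 1MB of 8kB pages.
 */
static int
default_slru_buffers(int64_t shared_buffers_bytes, int cap)
{
    int64_t     nbuffers = shared_buffers_bytes / BLCKSZ;
    int64_t     n = nbuffers / 512;

    assert(cap >= 4);
    if (n < 4)
        n = 4;
    if (n > cap)
        n = cap;
    return (int) n;
}
```

Under these assumptions both 4GB and 128GB of shared buffers hit the cap
of 128, whereas an illustrative cap of 4096 would let 4GB scale to 1024
buffers (8MB).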

5.  Do we want to ship the simplehash patch?

-0.5.  It's a bit too exciting for the last minute, so I'd be inclined
to wait until the next cycle to do some more research and testing.  I
know it's a better idea in the long run.
