22.01.2016 13:48, Aleksander Alekseev:
Then, this thread became too tangled. I think it's worth writing a
new message with the patch, the test script, some results, and a brief
overview of how it really works. That will make further review
much easier.

HASHHDR represents a hash table, which can be either regular or
partitioned. A partitioned table is stored in shared memory and accessed
by multiple processes simultaneously. To prevent data corruption, the
hash table is partitioned and each process has to acquire a lock for the
corresponding partition before accessing its data. The partition number
is determined by the lower bits of the key's hash value. The trickiest
part is that dynahash knows nothing about these locks; all locking is
done on the calling side.

Since shared memory is pre-allocated and can't grow, we use a freeList
to allocate memory within the hash table. We also use the nentries field
to count the current number of elements in the table. Since the hash
table is used by multiple processes, these fields are protected by a
spinlock, which, as it turned out, can cause lock contention and create
a bottleneck.

After trying a few approaches, I discovered that using one spinlock per
partition works quite well. Here are the latest benchmark results:


Note that the "no locks" solution can't be used because it doesn't
guarantee that all available memory will be used in some corner cases.

You can find a few more details and a test script in the first message
of this thread. If you have any other questions regarding this patch
please don't hesitate to ask.
     * Compute init/max size to request for lock hashtables.  Note these
     * calculations must agree with LockShmemSize!

This comment certainly requires some changes.
BTW, could you explain why init_table_size was half of max_table_size?
Although, I don't see any problems with your changes.

-    hctl->nentries = 0;
-    hctl->freeList = NULL;

Why did you delete these two lines? I wonder if you should rewrite them instead?

+ * This particular number of partitions significantly reduces lock contention
+ * in partitioned hash tables, almost if partitioned tables didn't use any
+ * locking at all

As far as I understood, this number was obtained experimentally? Maybe you should note that in the comment.

And the last, but not least.

+        nelem_alloc = ((int) nelem) / partitions_number;
+        if (nelem_alloc == 0)
+            nelem_alloc = 1;
+        for (i = 0; i < partitions_number; i++)
+            if (!element_alloc(hashp, nelem_alloc, i))
+                ereport(ERROR,
+                        (errcode(ERRCODE_OUT_OF_MEMORY),
+                         errmsg("out of memory")));

It looks like you forgot to round up the result of the division.
For example, if you have nelem=25 and partitions_number=6:
25 / 6 = 4, so you allocate only 24 elements, while 1 is lost.

As for performance, I ran a test on my notebook. The patched version shows a small performance improvement.

pgbench -j 1 -c 1 -f pgbench.sql -T 300 postgres
    base        patched
    127 tps     145 tps

pgbench -j 8 -c 8 -f pgbench.sql -T 300 postgres
    base        patched
    247 tps     270 tps

But I don't have a big machine to run tests on, which is where the patch
is supposed to make a significant difference, so I would rely on your
results and those of the other reviewers.

Apart from the notes mentioned above, I suppose the patch is good enough to pass it to a reviewer.

Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
