22.01.2016 13:48, Aleksander Alekseev:
This thread has become too tangled, so I think it's worth writing a
new message with the patch, the test script, some results and a brief
overview of how it really works. It should make the following review easier.
HASHHDR represents a hash table. It can be either regular or partitioned.
A partitioned table is stored in shared memory and accessed by multiple
processes simultaneously. To prevent data corruption the hash table is
split into partitions, and each process has to acquire a lock on the
corresponding partition before accessing the data in it. The partition
number is determined by the lower bits of the key's hash value. The
trickiest part is that dynahash knows nothing about these locks; all
locking is done on the calling side.
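To make the "locking on the calling side" part concrete, here is a minimal
sketch of how a caller could map a key's hash value to a partition and take
that partition's lock before touching the table. It is only an illustration:
NUM_PARTITIONS, partition_locks and hash_to_partition are made-up names, and
PostgreSQL actually uses LWLocks rather than pthread mutexes for partition
locks.

#include <pthread.h>
#include <stdint.h>

#define NUM_PARTITIONS 16              /* must be a power of two */

/* assumed to be initialized with pthread_mutex_init() at startup */
static pthread_mutex_t partition_locks[NUM_PARTITIONS];

/* the lower bits of the hash value select the partition */
static inline int
hash_to_partition(uint32_t hashvalue)
{
    return hashvalue & (NUM_PARTITIONS - 1);
}

static void
access_entry(uint32_t hashvalue)
{
    int partition = hash_to_partition(hashvalue);

    pthread_mutex_lock(&partition_locks[partition]);
    /* ... hash_search() and friends are called here; dynahash itself
     * does not know this lock exists ... */
    pthread_mutex_unlock(&partition_locks[partition]);
}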
Since shared memory is pre-allocated and can't grow, memory for hash table
entries is allocated from a freeList. The nentries field counts the current
number of elements in the hash table. Since the hash table is used by
multiple processes, these fields are protected by a spinlock, which, as it
turned out, can cause lock contention and become a bottleneck.
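A rough sketch of what this looks like before the patch (simplified field
names, not the exact dynahash definitions; slock_t, SpinLockAcquire and
SpinLockRelease are PostgreSQL's spinlock primitives): every element
allocation from any partition goes through the same header-wide spinlock.

typedef struct HashElementSketch
{
    struct HashElementSketch *link;    /* next free element */
    /* key and payload follow */
} HashElementSketch;

typedef struct HashHeaderSketch
{
    slock_t     mutex;                 /* protects freeList and nentries */
    long        nentries;              /* current number of entries */
    HashElementSketch *freeList;       /* shared list of free elements */
    /* ... other fields ... */
} HashHeaderSketch;

static HashElementSketch *
get_free_element(HashHeaderSketch *hctl)
{
    HashElementSketch *elem;

    SpinLockAcquire(&hctl->mutex);     /* the single point of contention */
    elem = hctl->freeList;
    if (elem != NULL)
    {
        hctl->freeList = elem->link;
        hctl->nentries++;
    }
    SpinLockRelease(&hctl->mutex);
    return elem;
}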
After trying a few approaches, I discovered that using one spinlock per
partition works quite well. Here are the latest benchmark results:
Note that the "no locks" solution can't be used because it doesn't guarantee
that all available memory will be used in some corner cases.
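As a sketch of the per-partition idea (again with illustrative names rather
than the exact patch code, reusing HashElementSketch from the previous
sketch), the header keeps a spinlock, a free list and an entry counter per
partition, so backends working on different partitions no longer fight over
a single mutex; the total number of entries then has to be summed across
partitions:

#define NUM_FREELISTS 32               /* one slot per partition */

typedef struct FreeListSketch
{
    slock_t     mutex;                 /* protects this partition only */
    long        nentries;              /* entries owned by this partition */
    HashElementSketch *freeList;       /* free elements for this partition */
} FreeListSketch;

typedef struct PartitionedHeaderSketch
{
    FreeListSketch freeLists[NUM_FREELISTS];
    /* ... other fields ... */
} PartitionedHeaderSketch;

/* nentries is no longer a single counter: sum it over the partitions */
static long
count_entries(PartitionedHeaderSketch *hctl)
{
    long        total = 0;
    int         i;

    for (i = 0; i < NUM_FREELISTS; i++)
    {
        SpinLockAcquire(&hctl->freeLists[i].mutex);
        total += hctl->freeLists[i].nentries;
        SpinLockRelease(&hctl->freeLists[i].mutex);
    }
    return total;
}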
You can find a few more details and a test script in the first message
of this thread. If you have any other questions regarding this patch
please don't hesitate to ask.
* Compute init/max size to request for lock hashtables. Note these
* calculations must agree with LockShmemSize!
This comment certainly requires some changes.
BTW, could you explain why init_table_size was two times less than
max_table_size? That said, I don't see any problems with your changes.
- hctl->nentries = 0;
- hctl->freeList = NULL;
Why did you delete these two lines? I wonder if you should rewrite them
instead.
+ * This particular number of partitions significantly reduces lock contention
+ * in partitioned hash tables, almost as if partitioned tables didn't use any
+ * locking at all
As far as I understand, this number was obtained experimentally? If so,
maybe you should note that in the comment.
And last, but not least.
+ nelem_alloc = ((int) nelem) / partitions_number;
+ if (nelem_alloc == 0)
+     nelem_alloc = 1;
+ for (i = 0; i < partitions_number; i++)
+     if (!element_alloc(hashp, nelem_alloc, i))
+         ereport(ERROR,
+                 (errcode(ERRCODE_OUT_OF_MEMORY),
+                  errmsg("out of memory")));
It looks like you forgot to round up the result of the division.
For example, suppose you have nelem=25 and partitions_number=6.
Then 25 / 6 = 4, so you allocate only 6 * 4 = 24 elements, while 1 is lost.
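A possible fix, just as a sketch of this point (not the author's final
code), is to round the division up so that partitions_number * nelem_alloc
is never less than nelem:

/* ceil(nelem / partitions_number) instead of truncating division */
nelem_alloc = ((int) nelem + partitions_number - 1) / partitions_number;
if (nelem_alloc == 0)
    nelem_alloc = 1;

/* with nelem = 25 and partitions_number = 6 this gives nelem_alloc = 5,
 * i.e. 30 preallocated elements in total, so none of the 25 requested
 * elements are lost */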
As for performance, I ran a test on my notebook. The patched version
shows a small performance improvement.
pgbench -j 1 -c 1 -f pgbench.sql -T 300 postgres
pgbench -j 8 -c 8 -f pgbench.sql -T 300 postgres
But I don't have any big machine to run tests on, where the patch is
supposed to make a significant difference.
So I would rely on your results and those of the other reviewers.
Except for the notes mentioned above, I suppose the patch is good enough to
pass it to a committer.
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company