Re: [Patch] Optimize dropping of relation buffers using dlist

Konstantin Knizhnik Fri, 07 Aug 2020 00:09:14 -0700



On 07.08.2020 00:33, Tomas Vondra wrote:


Unfortunately Konstantin did not share any details about what workloads
he tested, what config etc. But I find the "no regression" hypothesis
rather hard to believe, because we're adding non-trivial amount of code
to a place that can be quite hot.


Sorry that I did not explain my test scenarios.
Since Postgres is a pgbench-oriented database :) I also used pgbench:
the read-only case and skip-some-updates.
For this patch the most critical factor is the number of buffer allocations,
so I used a small enough database (scale=100), but shared buffers were set
to 1GB. As a result, all data is cached in memory (in the file system
cache), but there is intensive swapping at the Postgres buffer manager level.
I tested it with both a relatively small (100) and a large (1000)
number of clients.

I repeated these tests on my notebook (quad-core, 16GB RAM, SSD) and on an
IBM Power2 server with about 380 virtual cores and about 1TB of memory.
In the latter case the results vary a lot (I think because of the NUMA
architecture), but I failed to find any noticeable regression in the
patched version.

But I have to agree that adding a parallel hash (in addition to the
existing buffer manager hash) is not such a good idea.

This cache quite frequently becomes a bottleneck.

My explanation of why I did not observe any noticeable regression is
that this patch uses almost the same lock partitioning scheme as the one
already in use, so it does not add many new conflicts. Maybe in the case of
the Power2 server the overhead of NUMA is much higher than other factors
(although a shared hash is one of the main things that suffers from the NUMA
architecture).
But in principle I agree that having two independent caches may reduce
speed by up to a factor of two (or even more).

I hope that everybody will agree that this problem is really critical.
It is certainly not the most common case to have hundreds of
relations which are frequently truncated. But having quadratic complexity
in the drop function is not acceptable from my point of view.
And it is not only a recovery-specific problem, which is why a solution with
a local cache is not enough.


I do not know a good solution to the problem. Just some thoughts:

- We can somehow combine the locking used for the main buffer manager cache
  (by relid/blockno) and the cache for relid. It would eliminate the double
  locking overhead.
- We can use something like a sorted tree (like std::map) instead of a hash.
  It would allow locating blocks both by relid/blockno and by relid only.
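
To make the sorted-tree idea a bit more concrete, here is a minimal
single-process sketch. It is only an illustration, not PostgreSQL code: the
names RelId, BlockNo, BufferId and OrderedBufferIndex are hypothetical, and
it ignores shared memory, locking and partitioning entirely. It only shows
that one ordered structure keyed by (relid, blockno) can serve both the point
lookup and a "all blocks of this relation" range scan.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not PostgreSQL's actual types.
using RelId = std::uint32_t;
using BlockNo = std::uint32_t;
using BufferId = int;
using BufferKey = std::pair<RelId, BlockNo>;  // ordered by relid, then blockno

class OrderedBufferIndex {
public:
    // Register (or overwrite) the buffer holding block `blk` of relation `rel`.
    void insert(RelId rel, BlockNo blk, BufferId buf) {
        index_[{rel, blk}] = buf;
    }

    // Point lookup by (relid, blockno); returns -1 if the block is not cached.
    BufferId lookup(RelId rel, BlockNo blk) const {
        auto it = index_.find({rel, blk});
        return it == index_.end() ? -1 : it->second;
    }

    // Remove all entries of one relation and return their buffer ids.
    // Because keys are sorted by relid first, the relation's entries are
    // contiguous: we seek to the first one and stop at the next relid,
    // so the cost depends on that relation's blocks, not on the whole index.
    std::vector<BufferId> drop_relation(RelId rel) {
        std::vector<BufferId> dropped;
        auto it = index_.lower_bound({rel, 0});
        while (it != index_.end() && it->first.first == rel) {
            dropped.push_back(it->second);
            it = index_.erase(it);  // erase returns the next iterator
        }
        return dropped;
    }

    std::size_t size() const { return index_.size(); }

private:
    std::map<BufferKey, BufferId> index_;
};
```

With entries for relations 1 and 2 inserted, drop_relation(1) would return
only the buffers of relation 1 (in block order) and leave relation 2
untouched. A real implementation would have to live in shared memory and be
concurrency-safe, which a plain std::map is not, so this says nothing about
whether a tree would actually be faster than the existing hash.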
