I've been thinking about this patch again. When testing with simplehash, I found that the width of the hash bucket type was fairly critical for getting good performance from simplehash.h. With simplehash.h I didn't manage to narrow the bucket to anything less than 16 bytes. I needed to store the 32-bit hash value and a pointer to the data. On a 64-bit machine, with padding, that's 16 bytes. I've been thinking about a way to narrow this down further to just 8 bytes and also solve the stable pointer problem at the same time...
I've come up with a new hash table implementation that I've called generichash. It works similarly to simplehash with regard to the linear probing, only instead of storing the data in the hash bucket, we just store a uint32 index that indexes off into an array. To keep the pointers in that array stable, we cannot resize the array as the table grows. Instead, I just allocate another array of the same size. Since these arrays are always sized as powers of 2, it's very fast to index into them using the uint32 index that's stored in the bucket. Unused buckets just store the special index of 0xFFFFFFFF.

I've also proposed to use this hash table implementation over in [1] to speed up LockReleaseAll(). The 0001 patch here is just the same as the patch from [1].

The 0002 patch includes using a generichash hash table for SMgr.

The performance using generichash.h is about the same as the simplehash.h version of the patch, although the test was not done on the same version of master.

Master (97b713418)
drowley@amd3990x:~$ tail -f pg.log | grep "redo done"
CPU: user: 124.85 s, system: 6.83 s, elapsed: 131.74 s
CPU: user: 115.01 s, system: 4.76 s, elapsed: 119.83 s
CPU: user: 122.13 s, system: 6.41 s, elapsed: 128.60 s
CPU: user: 113.85 s, system: 6.11 s, elapsed: 120.02 s
CPU: user: 121.40 s, system: 6.28 s, elapsed: 127.74 s
CPU: user: 113.71 s, system: 5.80 s, elapsed: 119.57 s
CPU: user: 113.96 s, system: 5.90 s, elapsed: 119.92 s
CPU: user: 122.74 s, system: 6.21 s, elapsed: 129.01 s
CPU: user: 122.00 s, system: 6.38 s, elapsed: 128.44 s
CPU: user: 113.06 s, system: 6.14 s, elapsed: 119.25 s
CPU: user: 114.42 s, system: 4.35 s, elapsed: 118.82 s
Median: 120.02 s

master + v1 + v2
drowley@amd3990x:~$ tail -n 0 -f pg.log | grep "redo done"
CPU: user: 107.75 s, system: 4.61 s, elapsed: 112.41 s
CPU: user: 108.07 s, system: 4.49 s, elapsed: 112.61 s
CPU: user: 106.89 s, system: 5.55 s, elapsed: 112.49 s
CPU: user: 107.42 s, system: 5.64 s, elapsed: 113.12 s
CPU: user: 106.85 s, system: 4.42 s, elapsed: 111.31 s
CPU: user: 107.36 s, system: 4.76 s, elapsed: 112.16 s
CPU: user: 107.20 s, system: 4.47 s, elapsed: 111.72 s
CPU: user: 106.94 s, system: 5.89 s, elapsed: 112.88 s
CPU: user: 115.32 s, system: 6.12 s, elapsed: 121.49 s
CPU: user: 108.02 s, system: 4.48 s, elapsed: 112.54 s
CPU: user: 106.93 s, system: 4.54 s, elapsed: 111.51 s
Median: 112.49 s

So about a 6.69% speedup (120.02 s / 112.49 s on the medians).

David

[1] https://www.postgresql.org/message-id/caaphdvokqwrxw5nnupz8+majkhpopxygoy1gqdh0wes4+bi...@mail.gmail.com
v1-0001-Add-a-new-hash-table-type-which-has-stable-pointe.patch
v1-0002-Use-generichash.h-hashtables-in-SMgr.patch