On Tue, Oct 5, 2021 at 6:50 AM Simon Riggs <simon.ri...@enterprisedb.com> wrote: > With unique data, starting at 1 and monotonically ascending, hash > indexes will grow very nicely from 0 to 10E7 rows without causing >1 > overflow block to be allocated for any bucket. This keeps the search > time for such data to just 2 blocks (bucket plus, if present, 1 > overflow block). The small number of overflow blocks is because of the > regular and smooth way that splits occur, which works very nicely > without significant extra latency.
It is my impression that with non-unique data things degrade rather badly. There's no way to split the buckets that are overflowing without also splitting the buckets that are completely empty or, in any event, not full enough to need any overflow pages. I think that's rather awful. > The probability of bucket collision while we hold the lock is fairly > low. This is because even with adjacent data values the hash values > would be spread across multiple buckets, so we would expect the > contention to be less than we would get on a monotonically increasing > btree. > > So I don't now see any problem from holding the buffer lwlock on the > bucket while we do multi-buffer operations. I don't think that contention is the primary concern here. I think one major concern is interruptibility: a process must be careful not to hold lwlocks across long stretches of code, because it cannot be cancelled while it does. Even if the code is bug-free and the database has no corruption, long pauses before cancels take effect can be inconvenient. But as soon as you add in those considerations things get much worse. Imagine a hash index that is corrupted so that there is a loop in the list of overflow pages. No matter what, we're going to go into an infinite loop scanning that bucket, but if we're holding a buffer lock while we do it, there's no way to escape short of bouncing the entire server. That's pretty bad. Undetected deadlock is something to think about, too. -- Robert Haas EDB: http://www.enterprisedb.com