On Fri, Mar 2, 2018 at 3:57 PM, Andres Freund <and...@anarazel.de> wrote: > Based on this sub-thread this patch's status of 'needs review' doesn't > quite seem accurate and 'waiting on author' and then 'returned with > feedback' would be more fitting?
I personally think this patch is really close to RFC. Shubham has fulfilled the project requirement, it's a tidy and short patch, it has tests. I think we really just need to verify that the split case works correctly. Hmm. I notice that this calls PredicateLockPageSplit() after both calls to _hash_splitbucket() (the one in _hash_finish_split() and the one in _hash_expandtable()) instead of doing it inside that function, and that _hash_splitbucket() unlocks bucket_nbuf before returning. What if someone else accesses bucket_nbuf between LockBuffer(bucket_nbuf, BUFFER_LOCK_UNLOCK) and PredicateLockPageSplit()? Doesn't that mean that another session can read a newly created page and miss a predicate lock that is about to be transferred to it? If that is indeed a race, could it be fixed by calling PredicateLockPageSplit() at the start of _hash_splitbucket() instead? Could we get a few days to mull over this and Shubham's other patches? It'd be really great to get some of these into 11. My thought experiments about pseudo-pages and avoiding the split stuff were not intended to get the patch kicked out. I thought for a while that hash indexes were a special case and could benefit from dispensing with those trickier problems. Upon further reflection, for interesting size hash indexes pure hash value predicate tags wouldn't be much better. Furthermore, if we do decide we want to use using x % max_predicate_locks_per_relation to avoid having to escalate to relation predicate locks at the cost of slightly higher collision rate then we should consider that for the whole system (including heap page predicate locking), not just hash indexes. Please consider those ideas parked for now. -- Thomas Munro http://www.enterprisedb.com