On Wed, Oct 27, 2021 at 4:55 PM Matthias van de Meent <boekewurm+postg...@gmail.com> wrote:
>
> On Wed, 27 Oct 2021 at 12:58, Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Wed, Oct 27, 2021 at 2:32 AM Robert Haas <robertmh...@gmail.com> wrote:
> > >
> > > On Tue, Oct 5, 2021 at 6:50 AM Simon Riggs <simon.ri...@enterprisedb.com>
> > > wrote:
> > > > With unique data, starting at 1 and monotonically ascending, hash
> > > > indexes will grow very nicely from 0 to 10E7 rows without causing >1
> > > > overflow block to be allocated for any bucket. This keeps the search
> > > > time for such data to just 2 blocks (bucket plus, if present, 1
> > > > overflow block). The small number of overflow blocks is because of the
> > > > regular and smooth way that splits occur, which works very nicely
> > > > without significant extra latency.
> > >
> > > It is my impression that with non-unique data things degrade rather
> > > badly.
> > >
> >
> > But we will hold the bucket lock only for unique-index in which case
> > there shouldn't be non-unique data in the index.
>
> Even in unique indexes there might be many duplicate index entries: A
> frequently updated row, to which HOT cannot apply, whose row versions
> are waiting for vacuum (which is waiting for that one long-running
> transaction to commit) will have many entries in each index.
>
> Sure, it generally won't hit 10E7 duplicates, but we can hit large
> numbers of duplicates fast on a frequently updated row. Updating one
> row 1000 times between two runs of VACUUM is not at all impossible,
> and although I don't think it happens all the time, I do think it can
> happen often enough on e.g. an HTAP system to make it a noteworthy
> test case.
>
I think it makes sense to test such cases and see the behavior w.r.t.
overflow buckets.

--
With Regards,
Amit Kapila.