Optimize hash index bulk-deletion with streaming read

This commit refactors hashbulkdelete() to use streaming reads, improving the efficiency of the operation by prefetching upcoming buckets while the current bucket is being processed.

Some specific changes are required to make sure that the cleanup work happens in accordance with the data pushed to the stream read callback. When the cached metadata page is refreshed so that the next set of buckets can be processed, the stream is reset and the data fed to the stream read callback has to be updated. The reset needs to happen in the two code paths where _hash_getcachedmetap() is called.
The author has seen better performance numbers than I have on this one (with tweaks similar to 6c228755add8). The numbers are good enough for both of us that this change is worth doing, in terms of IO and runtime.

Author: Xuneng Zhou <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Nazir Bilal Yavuz <[email protected]>
Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=k0rzexnuw6u_ggqzsju32w...@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bfa3c4f106b1fb858ead1c8f05332f09d34f664a

Modified Files
--------------
src/backend/access/hash/hash.c   | 80 ++++++++++++++++++++++++++++++++++++++--
src/tools/pgindent/typedefs.list |  1 +
2 files changed, 78 insertions(+), 3 deletions(-)
