On Thu, Jul 25, 2024 at 2:58 AM Hayato Kuroda (Fujitsu)
<kuroda.hay...@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > Thank you for the test!
> >
> > I could reproduce this issue and it's a bug; it skipped even
> > non-all-visible pages. I've attached the new version patch.
> >
> > BTW since we compute the number of parallel workers for the heap scan
> > based on the table size, it's possible that we launch multiple workers
> > even if most blocks are all-visible. It seems to be better if we
> > calculate it based on (relpages - relallvisible).
>
> Thanks for updating the patch. I applied and confirmed all pages are scanned.
> I used almost the same script (just changed max_parallel_maintenance_workers)
> and got below result. I think the tendency was the same as yours.
>
> ```
> parallel 0: 61114.369 ms
> parallel 1: 34870.316 ms
> parallel 2: 23598.115 ms
> parallel 3: 17732.363 ms
> parallel 4: 15203.271 ms
> parallel 5: 13836.025 ms
> ```
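
For reference, the timings above can be turned into speedups relative to the serial run. The snippet below is only a small Python sketch of that arithmetic; the names `timings_ms` and `speedups` are made up for illustration and are not part of the test script.

```python
# Timings copied from the results quoted above (milliseconds),
# keyed by the "parallel N" setting used for each run.
timings_ms = {
    0: 61114.369,
    1: 34870.316,
    2: 23598.115,
    3: 17732.363,
    4: 15203.271,
    5: 13836.025,
}


def speedups(timings):
    """Return the speedup of each run relative to the parallel-0 run."""
    base = timings[0]
    return {workers: base / ms for workers, ms in sorted(timings.items())}


result = speedups(timings_ms)
for workers, factor in result.items():
    print(f"parallel {workers}: {factor:.2f}x")
```

The gain from each additional worker shrinks as the count grows.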
Thank you for testing!

>
> I started to read your codes but takes much time because I've never seen
> before...
> Below part contains initial comments.
>
> 1.
> This patch cannot be built when debug mode is enabled. See [1].
> IIUC, this was because NewRelminMxid was moved from struct LVRelState to
> PHVShared.
> So you should update like " vacrel->counters->NewRelminMxid".

Right, will fix.

> 2.
> I compared parallel heap scan and found that it does not have compute_worker
> API.
> Can you clarify the reason why there is an inconsistency?
> (I feel it is intentional because the calculation logic seems to depend on
> the heap structure,
> so should we add the API for table scan as well?)

There is room to consider a better API design, but yes, the reason is
that the calculation logic depends on the table AM implementation. For
example, I thought it might make sense to take the number of
all-visible pages into account when calculating the number of parallel
workers, as we don't want to launch many workers on a table where most
pages are all-visible. That logic might not work for other table AMs.

I'm updating the patch to implement parallel heap vacuum and will
share the updated patch. It might take time as it requires
implementing shared iteration support in the radix tree.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
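
P.S. To make the (relpages - relallvisible) idea concrete, here is a rough Python sketch of the arithmetic. It is not the code in the patch: the function name, the 1024-page starting threshold, and the "one more worker each time the remaining work triples" rule are all assumptions chosen for illustration. Only relpages, relallvisible, and the cap from max_parallel_maintenance_workers come from the discussion above.

```python
def heap_vacuum_workers(relpages, relallvisible, max_workers,
                        threshold_pages=1024):
    """Hypothetical helper: choose a worker count from the pages to scan.

    relpages and relallvisible are the pg_class estimates; max_workers
    stands in for max_parallel_maintenance_workers. threshold_pages is
    an assumed starting threshold, not a value taken from the patch.
    """
    # Only pages that are not all-visible need to be read by the heap scan.
    # Both inputs are estimates, so guard against a negative difference.
    pages_to_scan = max(relpages - relallvisible, 0)

    workers = 0
    # Assumed rule: add one worker each time the remaining work triples
    # past the threshold, up to the configured maximum.
    while pages_to_scan >= threshold_pages * 3 and workers < max_workers:
        workers += 1
        threshold_pages *= 3
    return workers


# Same table size, very different amounts of work to do.
mostly_dirty = heap_vacuum_workers(100000, 0, max_workers=8)
mostly_visible = heap_vacuum_workers(100000, 99000, max_workers=8)
print(mostly_dirty, mostly_visible)
```

With a size-only calculation both tables above would get the same number of workers, which is the case we want to avoid.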