On 15/10/2023 13:25, Alexander Korotkov wrote:
> Great!  I'm looking forward to the revised patch.
While revisiting the code and the opinions in this thread before restarting this work, I found two different possible strategies mentioned:

1. 'Common Resources': share the materialised result of the inner table scan (a hash table in the case of HashJoin) and join each partition against it one by one. This pays off in the case of parallel append and possibly in other cases, like the one shown in the initial message.
2. 'Individual strategies': by limiting the AJ feature to cases where the JOIN clause contains a partitioning expression, we can push an additional scan clause into each copy of the inner table scan, reduce the number of tuples scanned, and even prune some joins because their input is proven to be empty.
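
To make the difference concrete, here is a toy sketch in Python. It is not PostgreSQL code: the data, the function names (common_resources, individual_strategies) and the "tuples hashed" counter are all hypothetical, and it assumes an inner equi-join on the partition key with half-open range bounds.

```python
from collections import defaultdict

# Hypothetical toy data. The outer table is range-partitioned on "key";
# each partition is (lower_bound, upper_bound, rows) with bounds [lo, hi).
PARTITIONS = [
    (0, 10, [(1, "a"), (5, "b")]),
    (10, 20, [(12, "c")]),
    (20, 30, [(25, "d")]),
]
INNER = [(1, "x"), (5, "y"), (12, "z"), (40, "w")]


def build_hash(rows):
    """Materialise rows into a hash table keyed on the join key."""
    table = defaultdict(list)
    for key, payload in rows:
        table[key].append(payload)
    return table


def probe(outer_rows, table):
    """Inner equi-join of one partition's rows against a hash table."""
    return [(k, o, i) for k, o in outer_rows for i in table.get(k, [])]


def common_resources(partitions, inner):
    """Strategy 1: hash the inner side once and reuse it for every partition."""
    shared = build_hash(inner)
    result = []
    for _lo, _hi, rows in partitions:
        result.extend(probe(rows, shared))
    return result, len(inner)  # second value: inner tuples hashed


def individual_strategies(partitions, inner):
    """Strategy 2: push the partition bounds into each copy of the inner scan."""
    result, hashed = [], 0
    for lo, hi, rows in partitions:
        # Extra scan clause derived from the partitioning expression.
        filtered = [(k, p) for k, p in inner if lo <= k < hi]
        if not filtered:
            continue  # inner input proven empty: this join is pruned
        hashed += len(filtered)
        result.extend(probe(rows, build_hash(filtered)))
    return result, hashed
```

Both functions return the same join result on this data. The first hashes all four inner tuples once; the second hashes three tuples across two smaller tables and skips the third partition entirely, at the cost of planning each partition's join separately.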

I see pros and cons in both approaches. The first option is more straightforward, and its benefit is obvious in the case of parallel append. But how can we guarantee the same join type for each join? And why should we ignore the positive effect of choosing different strategies for different partitions? The second strategy is more expensive for the optimiser, especially with many partitions. But, as far as I can predict, it is easier to implement and fits the architecture more naturally. What do you think about that?

--
regards,
Andrei Lepikhov
Postgres Professional


