On Thu, Jul 26, 2018 at 8:37 PM, Dmitry Dolgov <9erthali...@gmail.com> wrote: >> On Mon, 23 Jul 2018 at 10:38, Ashutosh Bapat >> <ashutosh.ba...@enterprisedb.com> wrote: >> >> On Fri, Jul 20, 2018 at 3:13 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote: >> > >> > It's of course wrong, it's going to be O(max(m, n)) as you said, but >> > the point is still valid - if we have partitions A1, A2 from one side >> > and B1, ..., BN on another side, we can skip necessary the >> > partitions from B that are between e.g. A1 and A2 faster with >> > binary search. >> >> That's possible only when there is no default partition and the join >> is an inner join. For an outer join, we need to include all the >> partitions on the outer side, so we can't just skip over some >> partitions. In case of a default partition, it can take place of a >> missing partition, so we can't skip partitions using binary search. > > But in this case I described above (when we have a partition A3 on one side, > and another matching partition B3 from other side, but separated by some other > partitions, e.g. B1, B2) it means that we will merge A3 with a default > partition from other side without actually needing that, am I right? In this > sense it's an overhead out of nothing.
No. We will join A3 with B3 since they have matching bounds. We will compare B1 and B2's bounds with A3 (assuming that there are no bounds before A3). Since they won't be compatible we will match default of A to B1 and B2. That will of-course fail since we will try to match multiple partitions to a single partition, but that's not the point of your question I think. You are right that we could skip comparing A3 with B1 and B2 and directly land on B3. Any partitions skipped in between will be matched with A's default partition. But as I have said this would be rare and complicating the logic for a rare case doesn't seem practical at this stage. Apart from the complexity there's also a possibility that this skipping will reduce the efficiency actually in normal cases. Consider a case where A and B have exactly matching partitions. Current partition matching algorithm compare any given range/bound only once and we will have O(n) algorithm. If we use a binary search, however, for every range comparison, it will be O(n log n) algorithm. There will be unnecessary comparisons during binary search. The current algorithm is always O(n), whereas binary search would be O(n log(n)) with a possibility of having sub-O(n) complexity in some cases. I would go for an algorithm which has a consistent time complexity always and which is efficient in normal cases, rather than worrying about some cases which are not practical. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company