> On 22 March 2018 at 14:18, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> > wrote: > On Thu, Mar 22, 2018 at 4:32 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote: >>> On 12 March 2018 at 06:00, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> >>> wrote: >>> Thanks for the note. Here are rebased patches. >> >> Since I started to look at this patch, I can share few random notes (although >> it's not a complete review, I'm in the progress now), maybe it can be >> helpful. >> >> In `partition_range_bounds_merge` >> >> + if (!merged) >> + break; >> >> is a bit redundant I think, because every time `merged` set to false it >> followed by break. > > Yes, right now. May be I should turn it into Assert(merged); What do you > think?
Thank you for reply. Yes, that sounds good. But actually you also mentioned another topic that bothers me about this patch. Different parts of the algorithm implementation (at least for functions that build maps of matching partitions) are quite dependent on each other in terms of shared state. At first glance in `partition_range_bounds_merge` we have about a dozen of variables of different mutability level, that affect the control flow: outer_lb_index inner_lb_index merged merged_index overlap merged_lb merged_ub finished_outer finished_inner ub_cmpval lb_cmpval inner_has_default outer_has_default jointype It looks a bit too much for me, and would require commentaries like "if you changed the logic here, also take a look there". But I'm not saying that I have any specific suggestions how to change it (although I'll definitely try to do so, at least to get some better understanding of the underlying algorithm). >> >> I've noticed that in some places `IS_PARTITIONED_REL` was replaced >> >> - if (!IS_PARTITIONED_REL(joinrel)) >> + if (joinrel->part_scheme == NULL) >> >> but I'm not quite follow why? Is it because `boundinfo` is not available >> anymore at this point? If so, maybe it makes sense to update the commentary >> for >> this macro and mention to not use for joinrel. > > > This is done in try_partitionwise_join(). As explained in the comment > > * Get the list of matching partitions to be joined along with the > * partition bounds of the join relation. Because of the restrictions > * imposed by partition matching algorithm, not every pair of joining > * relations for this join will be able to use partition-wise join. But > all > * those pairs which can use partition-wise join will produce the same > * partition bounds for the join relation. > > boundinfo for the join relation is built in this function. So, we > don't have join relation's partitioning information fully set up yet. > So we can't use IS_PARTITIONED_REL() there. joinrel->part_scheme if > set tells that the joining relations have matching partition schemes > and thus the join relation can possibly use partition-wise join > technique. If it's not set, then we can't use partition-wise join. > > But IS_PARTITIONED_REL() is still useful at a number of other places, > where it's known to encounter a RelOptInfo whose partitioning > properties are fully setup. So, I don't think we should change the > macro or the comments above it. Just to make myself clear, I wanted to suggest not to change the commentary for `IS_PARTITIONED_REL` significantly, but just add a sentence that you need to check if given relation is fully set up. Also, few more random notes (mostly related to readability, since I found some parts of the patch hard to read, but of course it's arguable). ``` PartitionRangeBound outer_lb; PartitionRangeBound outer_ub; PartitionRangeBound inner_lb; PartitionRangeBound inner_ub; PartitionRangeBound *merged_lb = NULL; PartitionRangeBound *merged_ub = NULL; ``` Maybe it would be better to not repeat the type here? Let's say: ``` PartitionRangeBound outer_lb, outer_ub, ... ``` It's just too long and distracting. ``` partition_range_bounds_merge(int partnatts, FmgrInfo *partsupfuncs, Oid *partcollations, PartitionBoundInfo outer_bi, int outer_nparts, PartitionBoundInfo inner_bi, int inner_nparts, JoinType jointype, List **outer_parts, List **inner_parts) ``` >From what I see in `partition.c` there are a lot functions that accept `partnatts`, `partcollations` only to pass it down to, e.g. `partition_rbound_cmp`. What do you think about introducing a data structure to keep these arguments, and pass only an instance of this structure instead?