On 2 March 2018 at 08:13, Robert Haas <robertmh...@gmail.com> wrote: > I don't like the comments at the top of partprune.c very much. It > seems strange to document individual functions here; those functions > can (and should) be documented in their individual header comments. > What we should have at the top of the file is a discussion of the > overall theory of operation of this module, something that currently > seems not to exist anywhere in the patch. I tried to figure that out > by looking at the new data structures the patch introduces: > PartitionPruneContext, PartScanKeyInfo, PartitionClauseInfo, and > PartClause. It looks like the general idea idea is that code that > wants to use these facilities does the following: > > Step 1. Generate a PartitionPruneContext. In this patch, this seems > to consist entirely of copying information from the RelOptInfo or its > PartitionScheme. > Step 2. Call generate_partition_clauses() to extract relevant clauses > from rel->baserestrictinfo and generate a PartClauseInfo. > Step 3. Call get_partitions_from_clauses() to generate a set of > unpruned partition indexes. Internally, that function will first > populate a PartScanKeyInfo from the PartClauseInfo by calling > extract_bounding_datums(). Then it calls get_partitions_for_keys() > which generates the set of unpruned partition indexes from the > PartitionPruneContext and PartScanKeyInfo.
Hi Robert, I feel I should step in here and answer this part as it was me who first came up with the idea of the context struct. I've typed up something below which is my first cut at what I'd have imagined the header comment of partprune.c should look like. Some parts are only revant after run-time pruning is also using this stuff. I've tried to highlight those areas, I'm not sure how much or if there should be any mention of that at all as part of this patch. Here goes: partprune.c Allows efficient identification of the minimal set of partitions which match a given set of clauses. Thus allowing useful things such as ignoring unneeded partitions which cannot possibly contain tuples matching the given set of clauses. This module breaks the process of determining the matching partitions into two distinct steps, each of which has its own function which is externally visible outside of this module. The reason for not performing everything in one step as down to the fact that there are times where we may wish to perform the 2nd step multiple times over. The steps could be thought of as a compilation step followed by an execution step. Step 1 (compilation): Pre-process the given list of clauses and attempt to match individual clauses up to a partition key. The end result of this process is a PartitionClauseInfo containing details of each clause found to match the partition key. This output is required as input for the 2nd step. Step 2 (execution): Step 2 outputs the minimal set of matching partitions based on the input from step 1. Internally, this step is broken down into smaller sub-steps, each of which is explained in detail in the comments in the corresponding function. Step 2 can be executed multiple times for its input values. The inputs to this step are not modified by the processing done within. It is expected that this step is executed multiple times in cases where the matching partitions must be determined during query execution. A subsequent evaluation of this step will be required whenever a parameter which was found in a clause matching the partition key changes its value. PartitionPruneContext: Each of the steps described above also requires an input of a PartitionPruneContext. This stores all of the remaining required inputs to each step. The context will vary slightly depending on the context in which the step is being called from; i.e the planner or executor. For example, during query planning, we're unable to determine the value of a Param found matching the partition key. When this step is called from the executor the PlanState can be set in the context which allows evaluation of these Params into Datum values. *** Only after run-time pruning *** The PartitionPruneContext is also required since many of the query planner node types are unavailable to the executor, which means that the source information used to populate the context will vary depending on if it's being called from the query planner or executor. *** Only after run-time pruning *** The context is also modified during step 1 to record all of the Param IDs which were found to match the partition key. ------------- Hopefully that also helps explain the intensions with the current code strucure. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services