Currently, the rules based on Calcite logical rels and those based on Drill logical rels are grouped together and fired together. As part of DRILL-3996, Jinfeng will break this up into separate phases. I should be able to take advantage of that and move the directory-based partition pruning rule so that it fires on Calcite rels.

Thanks
Mehant

On 11/23/15 10:58 AM, Hanifi GUNES wrote:
The general idea of multi-phase pruning makes sense to me. I am wondering,
though: are we talking about introducing a new planning phase before the
logical phase, or about separating out the logic so that directory pruning
kicks off ahead of column-based pruning?

2015-11-23 10:33 GMT-08:00 Mehant Baid <[email protected]>:

As part of DRILL-3996 <https://issues.apache.org/jira/browse/DRILL-3996>
Jinfeng mentioned that he plans to move the directory-based pruning rule
earlier than column-based pruning. I want to expand on that a little,
provide the motivation, and gather thoughts/feedback.

Currently both directory-based pruning and column-based pruning are
fired in the same planning phase, and both are based on Drill logical rels.
This is not optimal when the data is organized so that both kinds of
pruning can be applied (a nested directory structure whose individual
files also contain partition columns). As part of creating the Drill logical
scan we read the footers of all the files involved. If the directory-based
pruning rule is fired earlier (that is, the rule fires on Calcite logical
rels), we can prune out unnecessary directories first and save the work
of reading the footers of the files in them.
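
To make the saving concrete, here is a rough sketch that only counts
simulated footer reads under the two orderings. It is not Drill or Calcite
code: the class, the method names, and the sample layout (dir0 values
"1994"/"1995", partition values "Q1"/"Q2") are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: none of these types exist in Drill or Calcite.
public class PruningOrderSketch {

    // Hypothetical data file: dir0 is known from the path alone,
    // partitionValue is only known once the footer has been read.
    static final class DataFile {
        final String dir0;
        final String partitionValue;

        DataFile(String dir0, String partitionValue) {
            this.dir0 = dir0;
            this.partitionValue = partitionValue;
        }
    }

    // Files that survive pruning, plus the number of footers read to get there.
    static final class Result {
        final List<DataFile> files;
        final int footerReads;

        Result(List<DataFile> files, int footerReads) {
            this.files = files;
            this.footerReads = footerReads;
        }
    }

    // Directory pruning looks only at the path, so it reads no footers.
    static List<DataFile> pruneByDirectory(List<DataFile> files, String dir0) {
        List<DataFile> kept = new ArrayList<>();
        for (DataFile f : files) {
            if (f.dir0.equals(dir0)) {
                kept.add(f);
            }
        }
        return kept;
    }

    // Column pruning compares the partition value taken from the footer.
    static List<DataFile> pruneByColumn(List<DataFile> files, String value) {
        List<DataFile> kept = new ArrayList<>();
        for (DataFile f : files) {
            if (f.partitionValue.equals(value)) {
                kept.add(f);
            }
        }
        return kept;
    }

    // Single phase: scan creation reads every footer, then both rules fire.
    static Result singlePhase(List<DataFile> files, String dir0, String value) {
        int footerReads = files.size(); // one simulated read per file
        List<DataFile> kept = pruneByColumn(pruneByDirectory(files, dir0), value);
        return new Result(kept, footerReads);
    }

    // Two phases: directory pruning fires first, so scan creation only
    // reads the footers of files in the surviving directories.
    static Result twoPhase(List<DataFile> files, String dir0, String value) {
        List<DataFile> survivors = pruneByDirectory(files, dir0);
        int footerReads = survivors.size(); // one simulated read per survivor
        return new Result(pruneByColumn(survivors, value), footerReads);
    }

    // Made-up layout: two directories with two files each.
    static List<DataFile> sampleFiles() {
        List<DataFile> files = new ArrayList<>();
        files.add(new DataFile("1994", "Q1"));
        files.add(new DataFile("1994", "Q2"));
        files.add(new DataFile("1995", "Q1"));
        files.add(new DataFile("1995", "Q2"));
        return files;
    }
}
```

With the sample layout and a filter on dir0 = "1995" and partition value
"Q1", both orderings keep the same single file, but the single-phase
version reads four footers and the two-phase version reads two.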

Thanks
Mehant
