Agreed on the N-phased approach. I have filed a JIRA for the enhancement: DRILL-3759. Regarding simplifying the expression tree logic: did you mean the logic in FindPartitionConditions or in the Interpreter? Perhaps you can add a comment to the JIRA with some explanation. I am in favor of simplification where possible.
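
To make the N-phased idea concrete, here is a rough sketch in Java. It is not the actual Drill code, only an illustration under stated assumptions: `PhasedPruning`, `prune`, and `listChildren` are made-up names, the directory listing is simulated with an in-memory map, and each directory level (dir0, dir1, ...) is assumed to have one simple predicate on the directory name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

public class PhasedPruning {
    /**
     * Hypothetical sketch of N-phased directory pruning.
     * Each phase lists only the children of directories that survived the
     * previous phase, then applies that level's filter (dir0, dir1, ...).
     * levelFilters.get(i) is the predicate on dir<i>; pass name -> true for
     * a level that has no filter.
     */
    static List<String> prune(String basePath,
                              Function<String, List<String>> listChildren,
                              List<Predicate<String>> levelFilters) {
        List<String> survivors = List.of(basePath);
        for (Predicate<String> filter : levelFilters) {
            List<String> next = new ArrayList<>();
            for (String parent : survivors) {
                // Only surviving parents are ever listed.
                for (String child : listChildren.apply(parent)) {
                    if (filter.test(child)) {
                        next.add(parent + "/" + child);
                    }
                }
            }
            survivors = next;
        }
        return survivors;
    }

    public static void main(String[] args) {
        // Simulated directory tree (an assumption for illustration only).
        Map<String, List<String>> tree = Map.of(
            "/t", List.of("2014", "2015"),
            "/t/2014", List.of("Q1", "Q2"),
            "/t/2015", List.of("Q1", "Q2"));
        Function<String, List<String>> lister = p -> tree.getOrDefault(p, List.of());
        List<Predicate<String>> filters =
            List.of(d -> d.equals("2015"), d -> d.equals("Q1"));
        System.out.println(prune("/t", lister, filters)); // prints [/t/2015/Q1]
    }
}
```

In this example /t/2014 is never listed at all, which is where the elapsed time and memory savings would come from. File names would then be fetched only within the surviving directories, and the remaining filters applied to those.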

On Wed, Sep 9, 2015 at 10:39 PM, Jacques Nadeau wrote:

> Makes sense.
>
> Is there a way we can do this with lazy materializations rather than writing
> complex expression tree logic? I hate having all this custom expression
> tree manipulation logic.
>
> Also, it seems like this should be N phased rather than two phase, where N
> is the number of directories below the base path.
>
> Thoughts?
>
> On Sep 9, 2015 10:54 AM, "Aman Sinha" wrote:
>
> > Currently, partition pruning gets all file names in the table and applies
> > the pruning. Suppose the files are spread out over several directories and
> > there is a filter on dirN; this is not efficient, both in terms of
> > elapsed time and memory usage. This has been seen in a few use cases
> > recently.
> >
> > We should ideally perform the pruning in 2 steps: first get the top-level
> > directory names only and apply the directory filter, then get the filenames
> > within that directory and apply remaining filters.
> >
> > I will create a JIRA for this enhancement but let me know your thoughts...
> >
> > Aman
