That makes sense. We should check only the columns the query requests, not every column in the Parquet file, when deciding which Parquet reader to use.
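
To make the idea concrete, here is a rough sketch of what a projection-aware check could look like. It is not the actual reader code: `ReaderSelection`, `ColumnInfo`, and the `requested` parameter are hypothetical names made up for illustration, and treating a null projection as "all columns" (for example a star query) is an assumption on my part.

```java
import java.util.List;
import java.util.Set;

// Illustration only: none of these names come from the real code base.
class ReaderSelection {

    /** Hypothetical stand-in for the per-column metadata found in a Parquet footer. */
    static final class ColumnInfo {
        final String name;
        final boolean complexType;     // e.g. a map, list, or nested group
        final int maxRepetitionLevel;  // greater than 0 means the column is repeated

        ColumnInfo(String name, boolean complexType, int maxRepetitionLevel) {
            this.name = name;
            this.complexType = complexType;
            this.maxRepetitionLevel = maxRepetitionLevel;
        }

        /** True if this column alone would force the old reader. */
        boolean needsOldReader() {
            return complexType || maxRepetitionLevel > 0;
        }
    }

    /**
     * Returns true only if at least one requested column is complex or repeated.
     * Assumption: a null 'requested' set means every column is read.
     */
    static boolean isComplex(List<ColumnInfo> fileColumns, Set<String> requested) {
        for (ColumnInfo column : fileColumns) {
            boolean isRead = requested == null || requested.contains(column.name);
            if (isRead && column.needsOldReader()) {
                return true;
            }
        }
        return false;
    }
}
```

With a check along these lines, a file that contains a repeated or nested column would still qualify for the new reader as long as the query projects only simple, non-repeated columns.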

On Wed, Sep 13, 2017 at 4:17 PM, Damien Profeta <[email protected]> wrote:

> Hi,
>
> I was looking at the code that read the parquet file and noticed there is
> a switch 'isComplex' to choose if it is possible to use the new reader or
> if we have to use the old one.
> The switch is based on the columns of the files (complex type or
> repetition level) but it doesn't care about the columns that have to
> effectively read.
>
> If for the ongoing query, we only read simple column and not with
> repetition level, couldn't we use the new reader? That would be a minor
> optimization but it could be worth.
>
> Thanks
> Damien
