Drill does not currently support this. The "parquet filter pushdown"
feature should be able to achieve it, but it's still under
development.
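
For illustration only, here is a rough sketch of the idea behind this kind
of pushdown: a reader compares the filter value against the min/max
statistics stored for each row group and skips groups whose range cannot
match. This is not Drill's implementation. The RowGroupStats class, the
row_groups_to_scan function, and the sample ranges are all made up for
the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupStats:
    """Hypothetical stand-in for the per-row-group min/max of column B."""
    index: int
    min_b: int
    max_b: int


def row_groups_to_scan(stats: List[RowGroupStats], value: int) -> List[int]:
    """Return indexes of row groups whose [min_b, max_b] range could contain value."""
    return [s.index for s in stats if s.min_b <= value <= s.max_b]


# Assumed statistics for a file sorted on B: the ranges do not overlap.
sorted_file = [
    RowGroupStats(0, 1, 200),
    RowGroupStats(1, 201, 400),
    RowGroupStats(2, 401, 600),
]

# Assumed statistics for an unsorted file: every range spans most values.
unsorted_file = [
    RowGroupStats(0, 3, 598),
    RowGroupStats(1, 1, 600),
    RowGroupStats(2, 7, 590),
]

sorted_hits = row_groups_to_scan(sorted_file, 456)
unsorted_hits = row_groups_to_scan(unsorted_file, 456)
print(sorted_hits, unsorted_hits)
```

With the sorted layout only one row group can contain B = 456, while with
the unsorted layout no group can be ruled out, which is why sorting only
helps once the reader actually consults these statistics.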

- Rahul

On Fri, Jul 1, 2016 at 12:10 PM, Dan Wild <dwild...@gmail.com> wrote:

> Hi,
>
> I'm attempting to query a directory of parquet files that are partitioned
> on column A (int) and sorted on column B (also int).  When I run a query
> such as SELECT * FROM mydirectory WHERE A = 123 AND B = 456, I can see that
> the physical query plan is using the criteria on A to choose the correct
> parquet file, but it is performing a ParquetGroupScan on ALL rows in that
> file despite the criteria on the sorted column B.
>
> Based on my understanding of parquet, Drill should be using the page and/or
> column metadata to avoid scanning the entire file when filtering on a
> sorted column.  However, there is no performance benefit when filtering on
> column B compared to any other non-sorted column.
>
> Is there something I can do to make Drill take advantage of the fact that
> my file is sorted?
>
> Thanks,
> Dan
>
