Hi,

I'm trying to query a directory of Parquet files that are partitioned on column A (int) and sorted on column B (also int). When I run a query such as

SELECT * FROM mydirectory WHERE A = 123 AND B = 456

the physical query plan shows that Drill uses the predicate on A to select the correct Parquet file, but it then performs a ParquetGroupScan over ALL rows in that file, even though there is also a predicate on the sorted column B.
Based on my understanding of Parquet, Drill should be able to use the page and/or column metadata to avoid scanning the entire file when filtering on a sorted column. However, I see no performance benefit when filtering on column B compared with filtering on any unsorted column. Is there something I can do to make Drill take advantage of the fact that my file is sorted?

Thanks,
Dan
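P.S. In case it helps to show what I was expecting, here is a rough Python sketch of the pruning I had in mind. It is not Drill code. The RowGroupStats class, the row_groups_to_scan function, and all of the numbers are made up for illustration, and it assumes that min/max statistics for column B are available for each row group in the file metadata.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical stand-in for the min/max statistics of column B in one row group."""
    index: int
    b_min: int
    b_max: int
    num_rows: int


def row_groups_to_scan(stats, b_value):
    """Keep only the row groups whose [b_min, b_max] range can contain b_value."""
    return [rg for rg in stats if rg.b_min <= b_value <= rg.b_max]


# Made-up statistics for a file sorted on B: the ranges do not overlap,
# so an equality predicate on B can match at most one row group here.
stats = [
    RowGroupStats(index=0, b_min=0, b_max=199, num_rows=1000),
    RowGroupStats(index=1, b_min=200, b_max=399, num_rows=1000),
    RowGroupStats(index=2, b_min=400, b_max=599, num_rows=1000),
    RowGroupStats(index=3, b_min=600, b_max=799, num_rows=1000),
]

selected = row_groups_to_scan(stats, 456)
rows_scanned = sum(rg.num_rows for rg in selected)
total_rows = sum(rg.num_rows for rg in stats)
print(f"scan row groups {[rg.index for rg in selected]}: {rows_scanned} of {total_rows} rows")
```

With these invented numbers, only one of the four row groups would need to be read for B = 456. That is the kind of saving I was hoping to see in the plan instead of a scan over every row.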
