arina-ielchiieva commented on a change in pull request #1552: DRILL-6865: Query returns wrong result when filter pruning happens
URL: https://github.com/apache/drill/pull/1552#discussion_r235964950
##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/AbstractParquetGroupScan.java
##########
@@ -310,13 +311,60 @@ public GroupScan applyFilter(LogicalExpression filterExpr, UdfUtilities udfUtili
AbstractParquetGroupScan cloneGroupScan = cloneWithFileSelection(qualifiedFilePath);
cloneGroupScan.rowGroupInfos = qualifiedRGs;
cloneGroupScan.parquetGroupScanStatistics.collect(cloneGroupScan.rowGroupInfos, cloneGroupScan.parquetTableMetadata);
+ cloneGroupScan.matchAllRowGroups = matchAllRowGroupsLocal;
return cloneGroupScan;
} catch (IOException e) {
logger.warn("Could not apply filter prune due to Exception : {}", e);
return null;
}
}
+
+ /**
+ * Returns parquet filter predicate built from specified {@code filterExpr}.
+ *
+ * @param filterExpr filter expression to build
+ * @param udfUtilities udf utilities
+ * @param functionImplementationRegistry context to find drill function holder
+ * @param optionManager option manager
+ * @param omitUnsupportedExprs whether expressions which cannot be converted
+ * may be omitted from the resulting expression
+ * @return parquet filter predicate
+ */
+ public ParquetFilterPredicate getParquetFilterPredicate(LogicalExpression filterExpr,
Review comment:
Maybe filter creation was previously done in a loop to handle the case where
the filter could not be built from the first row group but could be built from
the second (for example, if they came from different files)?
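
To make the concern concrete, here is a minimal sketch of the difference between
building the filter once from the first row group and retrying it per row group.
Everything in it is hypothetical and is not Drill code: `RowGroup`, `tryBuild`,
`buildFromFirst` and `buildInLoop` are illustrative stand-ins, and the failure
condition (the filter column is missing from a file's schema) is only an assumed
reason why a filter might not be buildable for one row group.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-ins for illustration only; none of these are Drill classes.
class LazyFilterSketch {

    /** A row group reduced to its file name and the columns that file's schema contains. */
    static final class RowGroup {
        final String file;
        final Set<String> columns;

        RowGroup(String file, Set<String> columns) {
            this.file = file;
            this.columns = columns;
        }
    }

    /**
     * Assumed failure mode: a predicate can only be built when the filter
     * column exists in the row group's schema; otherwise the result is empty.
     */
    static Optional<String> tryBuild(String filterColumn, RowGroup rowGroup) {
        return rowGroup.columns.contains(filterColumn)
            ? Optional.of("predicate(" + filterColumn + ")")
            : Optional.empty();
    }

    /** Builds the predicate once, looking only at the first row group. */
    static Optional<String> buildFromFirst(String filterColumn, List<RowGroup> rowGroups) {
        return rowGroups.isEmpty()
            ? Optional.empty()
            : tryBuild(filterColumn, rowGroups.get(0));
    }

    /** Retries the build for each row group and stops at the first success. */
    static Optional<String> buildInLoop(String filterColumn, List<RowGroup> rowGroups) {
        for (RowGroup rowGroup : rowGroups) {
            Optional<String> predicate = tryBuild(filterColumn, rowGroup);
            if (predicate.isPresent()) {
                return predicate;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Two row groups from different files; only the second has the "price" column.
        List<RowGroup> rowGroups = List.of(
            new RowGroup("a.parquet", Set.of("id")),
            new RowGroup("b.parquet", Set.of("id", "price")));

        System.out.println("first only: " + buildFromFirst("price", rowGroups).isPresent());
        System.out.println("in a loop:  " + buildInLoop("price", rowGroups).isPresent());
    }
}
```

Under these assumptions the single build finds no predicate while the loop finds
one from the second file, which is the case the question above is asking about.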
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services